PostgreSQL near-lockups
| Dave Siktberg 2006-01-16, 8:45 pm |
| Janine Sisk recently wrote that she restarts her AOLservers every night to
help prevent lockups. I'd like to do that, but often when I do a restart I
get several PostgreSQL threads that chew up nearly all the cpu cycles for 30
minutes or more and effectively block access to my site. It appears to be a
shorter wait if I do a restart each day, but I'm loath to risk making the
site unavailable for a long time even in the wee hours.
Any idea what could cause this? How to fix it? A year ago I did not
observe this behavior.
Dave Siktberg
| |
| Janine Sisk 2006-01-16, 8:45 pm |
| We have found that some sites, when restarted with "svc -t", go into
a funky half-shut-down state and stay there. I don't know why, and
it seems to be very consistently some sites (all using PG) and not
others. For those sites we use "svc -k", in other words send the
kill signal instead of the terminate signal. If you have things set
up properly it doesn't really matter, the site should come back up
either way. I don't know if this is your problem or not, but it's
worth a try.
janine
On Jan 16, 2006, at 5:18 PM, Dave Siktberg wrote:
> Janine Sisk recently wrote that she restarts her AOLservers every
> night to
> help prevent lockups. I'd like to do that, but often when I do a
> restart I
> get several PostgreSQL threads that chew up nearly all the cpu
> cycles for 30
> minutes or more and effectively block access to my site. It
> appears to be a
> shorter wait if I do a restart each day, but I'm loath to risk
> making the
> site unavailable for a long time even in the wee hours.
>
> Any idea what could cause this? How to fix it? A year ago I did not
> observe this behavior.
>
> Dave Siktberg
>
>
> --
> AOLserver - http://www.aolserver.com/
>
> To Remove yourself from this list, simply send an email to
> <listserv@listserv.aol.com> with the
> body of "SIGNOFF AOLSERVER" in the email message. You can leave the
> Subject: field of your email blank.
| |
| Dave Siktberg 2006-01-17, 2:45 am |
| Thanks! I do use "svc -t" when restarting, so I will try -k and observe
what happens. I'll also now look more carefully at the logs -- I think
there are some clues I haven't yet picked up.
Dave
From: "Janine Sisk" <janine@FURFLY.NET>
Sent: Monday, January 16, 2006 9:04 PM
> We have found that some sites, when restarted with "svc -t", go into
> a funky half-shut-down state and stay there. I don't know why, and
> it seems to be very consistently some sites (all using PG) and not
> others. For those sites we use "svc -k", in other words send the
> kill signal instead of the terminate signal. If you have things set
> up properly it doesn't really matter, the site should come back up
> either way. I don't know if this is your problem or not, but it's
> worth a try.
>
> janine
| |
| Don Baccus 2006-01-17, 5:45 pm |
| On Monday 16 January 2006 08:17 pm, Dave Siktberg wrote:
> Thanks! I do use "svc -t" when restarting, so I will try -k and observe
> what happens. I'll also now look more carefully at the logs -- I think
> there are some clues I haven't yet picked up.
Enable PG logging and examine those logs, as well.
--
Don Baccus
Portland, OR
http://donb.furfly.net, http://birdnotes.net, http://openacs.org
| |
| Fred Cox 2006-01-17, 8:45 pm |
| Let's not forget that properly operating software
doesn't require a -k, and that with -k it won't get a chance to
clean up pid files and the like.
This should only be a temporary hack while someone
determines what's really happening.
Fred
--- Don Baccus <dhogaza@PACIFIER.COM> wrote:
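> On Monday 16 January 2006 08:17 pm, Dave Siktberg wrote:
> > Thanks! I do use "svc -t" when restarting, so I will try -k and observe
> > what happens. I'll also now look more carefully at the logs -- I think
> > there are some clues I haven't yet picked up.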
>
> Enable PG logging and examine those logs, as well.
>
> --
> Don Baccus
> Portland, OR
> http://donb.furfly.net, http://birdnotes.net,
> http://openacs.org
| |
| Tom Jackson 2006-01-17, 8:45 pm |
| On Tuesday 17 January 2006 15:41, Fred Cox wrote:
> Let's not forget that properly operating software
> doesn't require a -k, since it won't get a chance to
> clean up pid files and the like.
>
You don't need pid files. All other files will be closed. I doubt there is any
reason to use -t, since it seems to imply that the low-level C code will
somehow know where to stop.
However, if there really are reasons and situations that anyone can think of,
please post them here so everyone can keep them in mind. I have never seen a
list.
> This should only be a temporary hack while someone
> determines what's really happening.
If you want a slightly different alternative, try -t, wait a few seconds for
most everything to stop, then do a -k. That said, this behavior has been
around for a long time, mostly because people use -t.
tom jackson
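
[Editor's sketch] The "-t, wait, then -k" sequence described above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes daemontools' svc and svstat are on the PATH, that svstat reports a running service as "<dir>: up (pid N) S seconds", and that an unchanged pid after the grace period means the old process never exited. The function names and the /service/mysite path are hypothetical, not taken from the thread.

```shell
#!/bin/sh
# Print the pid that svstat reports for a service directory, or nothing
# if the service is down. Assumes output like:
#   /service/mysite: up (pid 1234) 56 seconds
service_pid() {
    svstat "$1" | sed -n 's/.*(pid \([0-9][0-9]*\)).*/\1/p'
}

# Send TERM, wait for a grace period, then send KILL only if the same
# process is still running (i.e. it appears stuck in shutdown).
restart_with_fallback() {
    service_dir=$1
    grace=${2:-10}                 # seconds to wait; default is a guess

    old_pid=$(service_pid "$service_dir")
    svc -t "$service_dir"          # polite request: terminate
    sleep "$grace"
    new_pid=$(service_pid "$service_dir")

    # Same pid as before means the old process never went away.
    if [ -n "$old_pid" ] && [ "$old_pid" = "$new_pid" ]; then
        svc -k "$service_dir"      # forceful fallback: kill
    fi
}

# Example (hypothetical service directory):
#   restart_with_fallback /service/mysite 10
```

If the server exits cleanly on -t, supervise starts a new process with a different pid and the -k is never sent, so sites that restart fine are left alone.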
| |
| Janine Sisk 2006-01-18, 2:45 am |
| On Jan 17, 2006, at 5:51 PM, Tom Jackson wrote:
> If you want a slightly different alternative, try -t, wait a few
> seconds for
> most everything to stop, then do a -k, but this behavior has been
> around for
> a long time, mostly because people use -t.
The thing I don't understand is why this happens to some sites, while
others can be restarted with -t all day long and they will never
hang. It seems to hint at there being something wrong with the few
sites afflicted by this, doesn't it?
janine
| |
| Dave Siktberg 2006-01-18, 2:45 am |
| ----- Original Message -----
From: "Don Baccus" <dhogaza@pacifier.com>
> Enable PG logging and examine those logs, as well.
I've taken a break to play with some PG logging on my development machine
before doing the same on production -- getting some better instrumentation
will be like turning on the light in a dark room, I hope. The only logging
I had been doing was in the AOLserver error log, where I can see all the SQL
commands as they execute.
Am I on the right track with the following? (I'm running RedHat 7.3 and
PostgreSQL 7.1)
To turn on logging, I've converted the launching command in
/etc/init.d/postgresql from
/usr/bin/pg_ctl -D $PGDATA -p /usr/bin/postmaster start > /dev/null 2>&1
to
/usr/bin/pg_ctl -D $PGDATA -p /usr/bin/postmaster start -l pg-logfile 2>&1
and then added these to postgresql.conf:
log_connections = on
log_pid = on
log_timestamp = on
debug_level = 2 (I've played with 0, 1 and 2 so far)
Any other approaches that I should look into? Any advice on settings that
will be most informative?
Thanks!
Dave
| |
| Bas Scheffers 2006-01-18, 2:45 am |
| On 18 Jan 2006, at 04:15, Janine Sisk wrote:
> The thing I don't understand is why this happens to some sites,
> while others can be restarted with -t all day long and they will
> never hang. It seems to hint at there being something wrong with
> the few sites afflicted by this, doesn't it?
These sites aren't running one of those older 4.x versions that don't
actually stop when told to, are they? (i.e., requiring a second ctrl-c
when running in the foreground)
Just a thought.
Bas.
| |
| Don Baccus 2006-01-18, 5:45 pm |
| On Tuesday 17 January 2006 09:05 pm, you wrote:
> Am I on the right track with the following?
I think so, yes. The #1 question to answer is "are the long-running threads
caused by an application query, or something internal to Postgres?"
--
Don Baccus
Portland, OR
http://donb.furfly.net, http://birdnotes.net, http://openacs.org
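
[Editor's sketch] One way to start on the question above is to see which backend processes are actually burning CPU and compare their pids with the pids in the PG log (log_pid = on). The filter below is a rough sketch, not a definitive tool: it assumes a Linux procps-style ps that accepts "-eo pid,pcpu,etime,args", and that PostgreSQL backends show up with "postgres" at the start of the command column. The function name busy_backends and the default threshold of 50 are made up for illustration.

```shell
#!/bin/sh
# Read "pid pcpu etime args" lines on stdin and print only the lines for
# postgres processes at or above the given %CPU threshold.
busy_backends() {
    threshold=${1:-50}
    awk -v t="$threshold" '$4 ~ /postgres/ && $2 + 0 >= t { print }'
}

# Example, run right after a restart:
#   ps -eo pid,pcpu,etime,args | busy_backends 50
```

If the process title of a busy backend names a database and a statement type, an application query is the likely cause; the matching pid in the PG log should then show which statement it was.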
| |
| Janine Sisk 2006-01-18, 5:45 pm |
| On Jan 17, 2006, at 11:40 PM, Bas Scheffers wrote:
> These sites aren't running one of those older 4.x versions that
> don't actually stop when told to. (ie: requiring a second ctrl-c
> when running in the foreground)
No, at least one of my sites that does it is running OpenACS 3.x.
janine