long-lived httpd process, cpu hog
| plamendp 2004-05-14, 3:37 pm |
| I often find my FreeBSD (4-STABLE) box (apache 1.3.29 + modssl +
php4) in the following situation:
1. top(1) reports almost 0% idle.
2. top(1) reports one single httpd process eating 70-80% CPU with
NICE somewhere about 60
3. ps(1) reports one single httpd process (the same one as in 2.) that
has been running for 3-4 hours (aPeriod)
4. apache /server-status reports this process as being in "W" status
(sending reply) with "SS" approx. equal to aPeriod from 3. The request
seems normal, nothing suspicious
5. netstat(1) continuously reports NO connection with the requesting
peer at the moment (the Host from /server-status)
Killing the mentioned httpd process solves the situation and
everything is OK.
I've read all logs. Could not find any clue. No php errors (logged),
no apache_error_log messages like sig* exiting, core dumps etc.
Any ideas?
Anything related to KeepAlive option?
regards
| |
| Juha Laiho 2004-05-15, 5:34 am |
| plamendp@bgstore.com (plamendp) said:
>I often find my FreeBSD (4-STABLE) box (apache 1.3.29 + modssl +
>php4) in the following situation:
>
>1. top(1) reports almost 0% idle.
>2. top(1) reports one single httpd process eating 70-80% CPU with
>NICE somewhere about 60
>3. ps(1) reports one single httpd process (the same one as in 2.) that
>has been running for 3-4 hours (aPeriod)
>4. apache /server-status reports this process as being in "W" status
>(sending reply) with "SS" approx. equal to aPeriod from 3. The request
>seems normal, nothing suspicious
>5. netstat(1) continuously reports NO connection with the requesting
>peer at the moment (the Host from /server-status)
You didn't say whether serving the active request involves any PHP
processing, or whether the requested page was a static one. My guess
is that the request was for a page with some php functionality, and that
the php code executed contains a loop that under some conditions becomes
infinite. At some point after the request was made, the client got
tired of waiting and disconnected.
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
| |
| plamendp 2004-05-15, 12:37 pm |
| Juha Laiho <Juha.Laiho@iki.fi> wrote in message news:<c84m5f$fh7$1@ichaos.ichaos-int>...
>
> You didn't say whether serving the active request involves any PHP
> processing, or whether the requested page was a static one. My guess
> is that the request was for a page with some php functionality, and that
> the php code executed contains a loop that under some conditions becomes
> infinite. At some point after the request was made, the client got
> tired of waiting and disconnected.
Every request involves PHP; all pages are dynamic. Every request also
generates PostgreSQL processing (by design), but there was NO
suspicious PostgreSQL process running at that moment: all postmasters
were idle or had been running for only 2-3 seconds, which is normal.
So I can't say whether php is looping infinitely somewhere... Could
I?
Frankly, I am not a *nix guru.
Right NOW I again see 3 running processes... well, here is the ps
output (see the first 3 procs):
XXXXX # ps -axvr -U nobody
  PID STAT      TIME SL    RE PAGEIN   VSZ   RSS LIM TSIZ %CPU %MEM COMMAND
58851 R     51:24.02  0 10390     62 17396  9164   -  276 28.7  1.8 /usr/local/sbin/httpd
57496 R     79:19.36  0 14329     24 18172 10040   -  276 28.7  1.9 /usr/local/sbin/httpd
51247 R    333:46.39  0 32460     23 19508 11248   -  276 29.2  2.2 /usr/local/sbin/httpd
51250 S      1:31.50  9 32460    196 23720 15068   -  276  0.0  2.9 /usr/local/sbin/httpd
51251 S      1:36.42  1 32460    150 23860 15260   -  276  0.0  3.0 /usr/local/sbin/httpd
51252 S      1:22.02  9 32459    213 24484 15800   -  276  0.0  3.1 /usr/local/sbin/httpd
51259 S      1:29.09  1 32456    151 23592 14956   -  276  0.3  2.9 /usr/local/sbin/httpd
51260 S      1:26.09  1 32455    197 23856 15172   -  276  0.1  2.9 /usr/local/sbin/httpd
51261 S      1:29.71  7 32455    158 23924 15152   -  276  0.0  2.9 /usr/local/sbin/httpd
51262 S      1:18.46  7 32453    213 23732 15120   -  276  0.0  2.9 /usr/local/sbin/httpd
51265 S      1:25.67  1 32452    184 23716 15128   -  276  0.0  2.9 /usr/local/sbin/httpd
62195 S      0:01.19 13   129      0 23384 14688   -  276  0.0  2.8 /usr/local/sbin/httpd
62196 S      0:00.37  3   124      0 17180  8484   -  276  0.0  1.6 /usr/local/sbin/httpd
62271 S      0:00.01  1     5      0 15220  6164   -  276  0.0  1.2 /usr/local/sbin/httpd
62274 S      0:00.01  0     4      0 15268  6232   -  276  0.0  1.2 /usr/local/sbin/httpd
62275 S      0:00.00  1     4      0 15180  6100   -  276  0.0  1.2 /usr/local/sbin/httpd
62282 S      0:00.00  1     2      0 15180  6100   -  276  0.0  1.2 /usr/local/sbin/httpd
51248 S      1:23.90  7 32460    166 23864 15168   -  276  0.0  2.9 /usr/local/sbin/httpd
51249 S      1:36.39  1 32460    160 24412 15720   -  276  0.0  3.0 /usr/local/sbin/httpd
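For anyone who wants to pick the spinning children out of such a listing automatically, here is a minimal Python sketch. It assumes the column order shown above (PID, STAT, TIME first) and that TIME is minutes:seconds, as in this listing; the helper names and the 10-minute threshold are made up for illustration and are not part of ps or Apache. Rows are recognised by a numeric PID, so it does not matter whether the COMMAND column has wrapped onto its own line.

```python
def parse_cpu_time(text):
    """Convert a ps TIME field such as '333:46.39' (minutes:seconds) to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def find_spinners(ps_text, min_cpu_seconds=600.0):
    """Return (pid, cpu_seconds) for runnable processes with a lot of CPU time.

    Hypothetical helper: the 600-second default is an arbitrary threshold.
    """
    spinners = []
    for line in ps_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric PID; header and wrapped COMMAND lines do not.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, stat, cpu_seconds = int(fields[0]), fields[1], parse_cpu_time(fields[2])
        if stat.startswith("R") and cpu_seconds >= min_cpu_seconds:
            spinners.append((pid, cpu_seconds))
    return spinners


# Two rows copied from the listing in this post.
SAMPLE = """\
  PID STAT      TIME SL    RE PAGEIN   VSZ   RSS LIM TSIZ %CPU %MEM COMMAND
51247 R    333:46.39  0 32460     23 19508 11248   -  276 29.2  2.2 /usr/local/sbin/httpd
51250 S      1:31.50  9 32460    196 23720 15068   -  276  0.0  2.9 /usr/local/sbin/httpd
"""

for pid, cpu_seconds in find_spinners(SAMPLE):
    print(pid, round(cpu_seconds / 60), "CPU minutes")
```

A child that is merely busy with a legitimate request for a moment will also show state R, which is why the sketch additionally requires a large accumulated CPU time.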
Also:
* All PostgreSQL processes look normal.
* Apache/server-status reports those 3 procs at "W" status
* netstat(1) shows NO, definitely NO connection to hosts reported by
/server-status
BTW, the web site response is pretty good: no delays, no hiccups.. I
mean, the box behaves pretty much as in a normal situation. I can
remember when it was at 0% idle state because of resource hogging:
hard to log in, web server lagging, etc. Nothing like that at the
moment!
Huh.. will continue later... keep investigating these strange things
regards
| |
| Juha Laiho 2004-05-16, 5:35 am |
| plamendp@bgstore.com (plamendp) said:
>Every request involves PHP; all pages are dynamic. Every request also
>generates PostgreSQL processing (by design), but there was NO
>suspicious PostgreSQL process running at that moment: all postmasters
>were idle or had been running for only 2-3 seconds, which is normal.
>So I can't say whether php is looping infinitely somewhere... Could
>I?
>
>Frankly, I am not a *nix guru.
Not much OS knowledge is needed here. Now, by telling us that you're not seeing
any significant activity on your database backend, you've localised the
problem to be either a bug in your php code producing endless loops, or
a bug in core Apache producing endless loops. While it is not impossible
for this to be an Apache bug, I consider a bug in your php code to be more
probable.
>Right NOW I again see 3 running processes... well, here is the ps
>output (see the first 3 procs):
>
>XXXXX # ps -axvr -U nobody
>  PID STAT      TIME SL    RE PAGEIN   VSZ   RSS LIM TSIZ %CPU %MEM COMMAND
>58851 R     51:24.02  0 10390     62 17396  9164   -  276 28.7  1.8 /usr/local/sbin/httpd
>57496 R     79:19.36  0 14329     24 18172 10040   -  276 28.7  1.9 /usr/local/sbin/httpd
>51247 R    333:46.39  0 32460     23 19508 11248   -  276 29.2  2.2 /usr/local/sbin/httpd
You might take a look to see whether all these are serving a request to
the same resource, or whether the php code in these includes some common
module of yours. Or perhaps there's just a programming error that's
common to many of your pages. Find all places where you loop, and think
whether the loops could in some circumstances (perhaps not the normal
ones, but the exceptional ones) become infinite.
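To illustrate the kind of loop meant here, a minimal sketch follows. It is in Python rather than PHP purely for illustration, and the function, its parameters and the iteration cap are all hypothetical; nothing here is taken from the poster's code. The loop only terminates if the offset actually advances, so an "exceptional" page size of 0 would make it spin forever unless a guard stops it.

```python
def collect_page_offsets(total_rows, page_size, max_iterations=10_000):
    """Return the starting offset of each page of a result set.

    Hypothetical example: with page_size == 0 the offset never advances,
    so without the guard the while loop would never finish.
    """
    offsets = []
    offset = 0
    iterations = 0
    while offset < total_rows:
        iterations += 1
        if iterations > max_iterations:
            # Fail loudly instead of burning CPU until someone kills the process.
            raise RuntimeError(
                f"loop did not finish after {max_iterations} iterations "
                f"(total_rows={total_rows}, page_size={page_size})"
            )
        offsets.append(offset)
        offset += page_size
    return offsets


print(collect_page_offsets(25, 10))

try:
    collect_page_offsets(25, 0, max_iterations=100)
except RuntimeError as exc:
    print("caught:", exc)
```

A cap like this does not fix the underlying bug, but the error message it produces points at the loop and the inputs that triggered it, which is exactly the information missing from the logs in this case.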
>* Apache/server-status reports those 3 procs at "W" status
>* netstat(1) shows NO, definitely NO connection to hosts reported by
>/server-status
Well, there have been connections, but as I guessed earlier, for
one reason or another the client connections have been broken - and
apparently in such a way that your code has been left running.
>BTW, the web site response is pretty good: no delays, no hiccups.. I
>mean, the box behaves pretty much as in a normal situation.
Apache does spawn additional child processes to serve requests; the
three that are "spinning" are, from Apache's point of view, serving
requests, and thus no new requests are being directed to them.
>I can remember when it was at 0% idle state because of resource hogging:
>hard to log in, web server lagging, etc. Nothing like that at the
>moment!
100% CPU utilisation often doesn't slow a machine down by much (esp.
for WWW content serving, which most often is not _that_ CPU intensive;
the issue is different in e.g. technical computing, where there
can still be simulation jobs really using 100% CPU for several days
in a single run -- with these, if you have 2-3 other processes competing
for CPU, the run time will grow to 3-4 weeks for the long-running job).
For generating a WWW page component you typically shouldn't be using
even 0.1 seconds of CPU, so the overall page load times do not grow
by that much.
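The arithmetic behind this can be sketched in a few lines of Python. This is a rough model only: it assumes a single CPU shared equally among all CPU-bound processes and ignores nice values, I/O waits and scheduler details; the function name is made up for the example.

```python
def estimated_wall_time(cpu_needed, competing_processes):
    """Wall-clock time for a job that needs cpu_needed units of CPU time
    while sharing one CPU equally with competing_processes busy processes.
    """
    return cpu_needed * (competing_processes + 1)


# A page component needing 0.1 s of CPU, next to 3 spinning httpd children.
print(estimated_wall_time(0.1, 3), "seconds")

# A simulation needing 7 days of CPU (an assumed figure), next to 2 competitors.
print(estimated_wall_time(7, 2), "days")
```

Under this model the page component goes from 0.1 to roughly 0.4 seconds, which a visitor will hardly notice, while the week-long job stretches to three weeks.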
There are other types of resource contention situations where you could
well see 100% CPU utilisation, but the actual problem is something else
(severe memory shortage: CPU time used for paging things in and out, much
of wallclock time actually used for disk transfers; "fork bombs": tens
or hundreds of processes competing for CPU time; ...).
>Huh.. will continue later... keep investigating these strange things
You might try killing the spinning processes by hand (plain "kill",
not "kill -9"); that way they might be able to log something about what
they were doing. At that point you should also get log lines for these
requests in the Apache access_log and error_log.