WebSphere Application Server - Thread DeadLock Monitoring

Author Thread DeadLock Monitoring
Vinoth S

2006-08-25, 7:38 am

Hi All,
We have 4 applications deployed on WebSphere Application Server 5.1.1 on AIX in a clustered environment (3 WAS servers). For the past 15 days we have had frequent downtime on these servers, in random turns.
In the meantime, what I have inferred is:
1) One of the servers encounters a thread deadlock.
2) This clogs the httpd processes until they reach the maximum, affecting the other applications too.
3) On stopping and starting the affected server, the issue is resolved and the httpd count returns to normal.
4) There are no hung-thread messages in our logs.

Please let me know whether there is any way I could monitor threads going into deadlock (not hung), so that I can proactively stop and start the server and avoid unnecessary downtime.
Maybe I can indirectly monitor the httpd count, but I still may not be able to identify the exact server.

Thanks in Advance.

Note:
A PMR was raised. IBM came back saying it was an issue with the JIT and asked us to apply the latest fixes for the Application Server as well as for the JVM.

Rgds
Vinoth
Paul Ilechko

2006-08-25, 1:31 pm

Vinoth S wrote:
> Hi All,
> We have 4 applications deployed on WebSphere Application Server 5.1.1 on AIX in a clustered environment (3 WAS servers). For the past 15 days we have had frequent downtime on these servers, in random turns.
> In the meantime, what I have inferred is:
> 1) One of the servers encounters a thread deadlock.
> 2) This clogs the httpd processes until they reach the maximum, affecting the other applications too.
> 3) On stopping and starting the affected server, the issue is resolved and the httpd count returns to normal.
> 4) There are no hung-thread messages in our logs.
>
> Please let me know whether there is any way I could monitor threads going into deadlock (not hung), so that I can proactively stop and start the server and avoid unnecessary downtime.
> Maybe I can indirectly monitor the httpd count, but I still may not be able to identify the exact server.


You need to take multiple thread dumps and compare them, see which
threads are not moving.
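
A minimal sketch of that comparison, in Python. It assumes IBM javacore-style dumps in which each thread starts with a `3XMTHREADINFO "name"` line and each stack frame is a `4XESTACKTRACE at ...` line; both patterns and the function names are assumptions for illustration, so check them against your own dumps before relying on the result.

```python
import re

# Assumed javacore-style tags: a thread header line and stack frame lines.
THREAD_RE = re.compile(r'^3XMTHREADINFO\s+"([^"]+)"')
FRAME_RE = re.compile(r'^4XESTACKTRACE\s+(at .+)$')


def parse_dump(text):
    """Return {thread name: tuple of stack frames} for one dump."""
    threads = {}
    current = None
    for line in text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            current = m.group(1)
            threads[current] = []
            continue
        m = FRAME_RE.match(line)
        if m and current is not None:
            threads[current].append(m.group(1).strip())
    return {name: tuple(frames) for name, frames in threads.items()}


def stuck_threads(dumps):
    """Names of threads whose non-empty stack is identical in every dump."""
    parsed = [parse_dump(d) for d in dumps]
    if len(parsed) < 2:
        return []  # nothing to compare against
    first = parsed[0]
    return sorted(
        name for name, frames in first.items()
        if frames and all(p.get(name) == frames for p in parsed[1:])
    )
```

Feed it the text of two or three dumps taken a minute or so apart. Threads that are simply idle in a pool will also show an unchanged stack, so the output is a list of candidates to inspect, not proof of a deadlock.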
Vinoth S

2006-08-25, 1:31 pm

> You need to take multiple thread dumps and compare them, see which
> threads are not moving.


Thanks Paul for the reply.
The problem is that the httpd count takes around 15 minutes to shoot up, so I am looking for some kind of online monitoring that could instantly identify the deadlocked threads.

Rgds
Vinoth
Paul Ilechko

2006-08-25, 7:31 pm

Vinoth S wrote:

>
> Thanks Paul for the reply. The problem is that the httpd count takes
> around 15 minutes to shoot up, so I am looking for some kind of online
> monitoring that could instantly identify the deadlocked threads.


What you need to do is figure out the underlying problem, not work
around it by restarting servers. You clearly have a problem in your
application. You can purchase a monitoring tool like IBM's ITCAM that
can do lock analysis and provide you with detailed information as to
where things are stuck. You can also, as I suggested, take thread dumps
and see where problems are occurring. I don't understand your response
to that suggestion.

However, you need to solve this problem in a TEST environment, not in
Production, as the kinds of debugging tools you need to use are very
intrusive and will kill performance.
Ben_

2006-08-25, 7:31 pm


"Vinoth S" <vinothS@hcl.in> wrote in message
news:165713184.1156496381844.JavaMail.wassrvr@ltsgwas010.sby.ibm.com...
> Hi All,
> We have 4 applications deployed on WebSphere Application Server 5.1.1 on
> AIX in a clustered environment (3 WAS servers). For the past 15 days we
> have had frequent downtime on these servers, in random turns.
> In the meantime, what I have inferred is:
> 1) One of the servers encounters a thread deadlock.
> 2) This clogs the httpd processes until they reach the maximum,
> affecting the other applications too.

You can tune the web server plugin configuration to avoid a contagion effect.
I don't have the exact details of what I did at hand, but you could tune
ConnectTimeout or MaxConnections.
Basically, the idea is to rapidly return a failure to the browser (HTTP 500)
instead of letting all web server threads block waiting because of a single
Application Server failing / blocking. If you have only one Application
Server with several Enterprise Applications behind the web server, this
won't help and you'll need to evaluate how to isolate the application if
possible (if the application supports it and if time to resolution permits).

> 3) On stopping and starting the affected server, the issue is resolved and
> the httpd count returns to normal.

When you enable server-status on Apache/IHS, you can see which HTTP request
each web server thread is serving. It may help.
http://httpd.apache.org/docs/2.0/mod/mod_status.html
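
As a rough sketch of how that page could be polled, the Python below parses the machine-readable variant of the status page (`server-status?auto`). The `BusyWorkers` and `IdleWorkers` field names are what I would expect from Apache 2.0, but treat them as an assumption and check the output of your own server; the URL in the docstring and the function names are hypothetical.

```python
import urllib.request


def parse_status(text):
    """Parse the 'Key: value' lines of the machine-readable status page."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def busy_ratio(fields):
    """Fraction of workers that are busy, or None if the counters are missing."""
    try:
        busy = int(fields["BusyWorkers"])   # assumed field name
        idle = int(fields["IdleWorkers"])   # assumed field name
    except (KeyError, ValueError):
        return None
    total = busy + idle
    return busy / total if total else None


def fetch_status(url, timeout=5):
    """Fetch the status page, e.g. from the hypothetical address
    http://webserver1/server-status?auto"""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii", errors="replace")
```

Calling `busy_ratio(parse_status(fetch_status(url)))` from a scheduled job gives one number per web server to alert on. It does not say which Application Server is at fault; for that you still need the per-request view of the full status page.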

> 4) there is no hit for Hung threads in our logs.

The threshold for the warning message is 10 minutes.
If the problem is rapidly detected by the Operators, you may be missing the
warning.
You could try to adjust the threshold
(http://publib.boulder.ibm.com/infoc...com.ibm.websphere.base.doc/info/aes/ae/ttrb_confighangdet.html).

Or there may very well be no hang or deadlock, but "simply" an overall drop
in performance and an increase in response times (e.g. back-end reply in 1
minute instead of 1 second).

>
> Please let me know whether there is any way I could monitor threads going
> into deadlock (not hung), so that I can proactively stop and start the
> server and avoid unnecessary downtime.

I don't know of such a feature. Thread deadlock detection or avoidance is more
a Java issue than a WebSphere issue. And AFAIK, there is no way to detect
it. Hence the WebSphere hung thread detection feature.

> Maybe I can indirectly monitor the httpd count, but I still may not be
> able to identify the exact server.

You can log the web server thread pool usage with mod_mpmstats
(http://publib.boulder.ibm.com/https...d_mpmstats.html) and
attach a log tailer to alert when a threshold is exceeded.
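
A log tailer along those lines could look like the Python sketch below. It assumes the mpmstats entries in the error log contain text of the form `mpmstats: rdy <ready> bsy <busy> ...`; that line shape, the 0.9 threshold and the function names are assumptions for illustration, so compare them with what your own log shows.

```python
import re

# Assumed line shape: "... mpmstats: rdy 712 bsy 312 rd 121 wr 173 ..."
MPMSTATS_RE = re.compile(r"mpmstats: rdy (\d+) bsy (\d+)")


def parse_mpmstats(line):
    """Return (ready, busy) thread counts, or None for unrelated lines."""
    m = MPMSTATS_RE.search(line)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2))


def alerts(lines, max_busy_ratio=0.9):
    """Yield (busy, total) for each sample at or above the busy threshold."""
    for line in lines:
        parsed = parse_mpmstats(line)
        if parsed is None:
            continue
        ready, busy = parsed
        total = ready + busy
        if total and busy / total >= max_busy_ratio:
            yield busy, total
```

`alerts()` accepts any iterable of lines, so it can read an open log file or `sys.stdin` when the script sits at the end of a `tail -f` pipe. What to do on each alert (mail, pager, restart script) is left out on purpose.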

>
> Thanks in Advance.
>
> Note:
> A PMR was raised. IBM came back saying it was an issue with the JIT and
> asked us to apply the latest fixes for the Application Server as well as
> for the JVM.

I don't see the relation between the JIT and the problem you describe. But
continue to work with them.
>
> Rgds
> Vinoth


As Paul indicated, thread dump comparison is a good place to start.


Vinoth S

2006-08-30, 7:37 am

Hi Ben,
Thanks for the great help.
I have finally managed to find a roundabout way to spot hung or deadlocked threads in the JVM, using the following commands on AIX:

ps -fp 30450
ps -mp [process id] -o THREAD

Thanks & rgds
Vinoth

Copyright 2003 - 2008 webservertalk.com