Voice over IP Cisco - troubleshooting IP phone 7960 keepalives


Author troubleshooting IP phone 7960 keepalives
Justin Steinberg

2005-11-16, 5:45 pm

I have some phones that periodically reset themselves at random times
during the day. I even have some instances where a phone reset itself
while on an active call.

When I look at a detailed CallManager trace I see the following entries:

11/15/2005 15:21:29.108 CCM|StationInit - Keep alive timeout.: 000000741|<CLID::xxxCMPUB01-Cluster><NID::10.xxx.xxx.xxx><CT::2,100,90,1.1122347><IP::10.xxx.xxx.xxx><DEV::SEP000C853Bxxxx>
11/15/2005 15:21:29.108 CCM|StationInit - Closing Station connection DeviceName=SEP000C853Bxxxx, TCPHandle=000000741, IPAddr=10.xxx.xxx.xxx, Port=52056, Device Controller=[2,89,737]|<CLID::xxxCMPUB01-Cluster><NID::10.xxx.xxx.xxx><CT::2,100,90,1.1122347><IP::10.xxx.xxx.xxx><DEV::SEP000C853Bxxx>
11/15/2005 15:21:29.108 CCM|DeviceUnregistered - Device unregistered. Device name.:SEP000C853Bxxx Device IP address.:10.xxx.xxx.xxx Device type. [Optional]:7 Device description [Optional].:John Johnson 1128 Reason Code [Optional].:8 App ID:Cisco CallManager Cluster ID:xxxCMPUB01-Cluster Node


It seems like the phone is reporting to the CallManager that it is not
receiving keepalives, and so the phone reboots. I noticed that
detailed CallManager traces log the incoming keepalives that the
phones send to CallManager, but I do not see the CallManager's
KeepAliveAcks in the detailed CCM trace log. Do I have to run a
packet capture to see this?

Also, the phones and CallManager are on the same LAN, so latency/loss
SHOULD not be an issue.

TIA

justin
Wes Sisk

2005-11-16, 5:45 pm

KA Acks were removed from traces long ago.

The more important message to look for is the ALARM message the phone
sends to the failover server (or to the next server it tries to register
with, which may be the same server if it re-registers there).

Use this:
cd prog*\cisco\trace\ccm
findstr Last= ccm*.txt >l.txt
notepad l.txt

This Last= value is the phone's point of view on why it lost registration
with the last CM server. It should be one of these values:
0 Phone Load Is Rejected
1 Phone Load TFTP Size Error
2 Phone Load Compressor Error
3 Phone Load Version Error
4 Disk Full Error
5 Checksum Error
6 Phone Load Not Found in TFTP Server
7 TFTP Timeout
8 TFTP Access Error
9 TFTP Error
10 CCM TCP Connection timeout
11 CCM TCP Connection Close because of bad Ack
12 CCM Resets TCP Connection
13 CCM Aborts TCP Connection
14 CCM TCP Connection Closed
15 CCM TCP Connection Closed because ICMP Unreachable
16 CCM Rejects TCP Connection
17 Keepalive Time Out
18 Fail Back to Primary CCM
20 User Resets Phone By Keypad
21 Phone Resets because IP configuration
22 CCM Resets Phone
23 CCM Restarts Phone
24 CCM Rejects Phone Registration
25 Phone Initializes
26 CCM TCP Connection Closed With Unknown Reason
27 Waiting For State From CCM
28 Waiting For Response From CCM
29 DSP Alarm
30 Phone Abort CCM TCP Connection
31 File Authorization Failed

/Wes
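
A rough Python equivalent of the findstr search above, offered as a
sketch rather than a definitive tool. It assumes the reason appears in
the trace line as a numeric token of the form Last=<n>; the thread does
not show an actual alarm line, so that layout, the sample line in the
usage note, and the names LAST_REASONS and describe_last are all
hypothetical. The table is copied from the list above (which has no
entry for 19).

```python
import re

# Reason codes exactly as listed in the post above (19 is not listed there).
LAST_REASONS = {
    0: "Phone Load Is Rejected",
    1: "Phone Load TFTP Size Error",
    2: "Phone Load Compressor Error",
    3: "Phone Load Version Error",
    4: "Disk Full Error",
    5: "Checksum Error",
    6: "Phone Load Not Found in TFTP Server",
    7: "TFTP Timeout",
    8: "TFTP Access Error",
    9: "TFTP Error",
    10: "CCM TCP Connection timeout",
    11: "CCM TCP Connection Close because of bad Ack",
    12: "CCM Resets TCP Connection",
    13: "CCM Aborts TCP Connection",
    14: "CCM TCP Connection Closed",
    15: "CCM TCP Connection Closed because ICMP Unreachable",
    16: "CCM Rejects TCP Connection",
    17: "Keepalive Time Out",
    18: "Fail Back to Primary CCM",
    20: "User Resets Phone By Keypad",
    21: "Phone Resets because IP configuration",
    22: "CCM Resets Phone",
    23: "CCM Restarts Phone",
    24: "CCM Rejects Phone Registration",
    25: "Phone Initializes",
    26: "CCM TCP Connection Closed With Unknown Reason",
    27: "Waiting For State From CCM",
    28: "Waiting For Response From CCM",
    29: "DSP Alarm",
    30: "Phone Abort CCM TCP Connection",
    31: "File Authorization Failed",
}

# Assumption: the value follows "Last=" directly and ends at the first
# character that is not a letter, digit, underscore or hyphen.
_LAST_RE = re.compile(r"Last=(\w[\w-]*)")


def describe_last(line):
    """Return a description of the Last= value in a trace line, or None."""
    match = _LAST_RE.search(line)
    if match is None:
        return None
    token = match.group(1)
    if token.isdigit():
        return LAST_REASONS.get(int(token), "Unknown code " + token)
    # Non-numeric values are passed through unchanged.
    return token
```

For example, describe_last("... Last=17 ...") returns "Keepalive Time Out",
and a line without a Last= token returns None. To scan real files, read
each ccm*.txt line by line and print the lines for which the function
returns something.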



_______________________________________________
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip

Kevin Thorngren

2005-11-16, 5:45 pm

Hi Justin,

Typically, when you see the message "StationInit - Keep alive timeout"
in the trace, it is coming from CallManager. The "StationInit" label
looks strange for a timeout, but the timeout is raised by the
StationInit process within CallManager, not sent by the phone. CCM/SDL
traces from the same time frame may help to confirm this. A packet
capture of the problem would be best, though.

At some point (I don't remember when) the KeepAliveAck message was taken
out of the CCM trace. One thing you can do is go backwards in the
trace from the "Keep alive timeout" message and see when the last
KeepAlive from the phone was. I suspect you will find one missing.

HTH,
Kevin
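
The backwards walk described above can be sketched in Python. This is
only an illustration under stated assumptions: each trace entry sits on
one unwrapped line starting with a timestamp in the format shown in the
original post (11/15/2005 15:21:29.108), and the phone's KeepAlive
entries contain both the substring "KeepAlive" and some key identifying
the phone (device name or TCP handle). The thread does not show a
KeepAlive entry, so the marker, the function names and the sample lines
in the usage note are hypothetical.

```python
from datetime import datetime

TS_FORMAT = "%m/%d/%Y %H:%M:%S.%f"  # e.g. 11/15/2005 15:21:29.108
TS_LENGTH = 23                      # length of that timestamp prefix


def parse_ts(line):
    """Parse the timestamp at the start of a trace line."""
    return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)


def seconds_since_last_keepalive(lines, key, keepalive_marker="KeepAlive"):
    """Seconds between the first 'Keep alive timeout' entry for `key`
    and the nearest earlier entry containing `keepalive_marker` for the
    same key. Returns None if either entry is not found."""
    timeout_idx = None
    for i, line in enumerate(lines):
        if "Keep alive timeout" in line and key in line:
            timeout_idx = i
            break
    if timeout_idx is None:
        return None
    timeout_ts = parse_ts(lines[timeout_idx])
    # Walk backwards from the timeout to the last KeepAlive seen.
    for line in reversed(lines[:timeout_idx]):
        if keepalive_marker in line and key in line:
            return (timeout_ts - parse_ts(line)).total_seconds()
    return None
```

Called with the lines of a trace file and a key such as the device name,
it returns the gap in seconds, which can then be compared with the
KeepAlive interval configured for the phones.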


Jafar T

2005-11-16, 5:45 pm

I had this happen to one of my customers. We found that turning off
Skinny packet inspection on the firewall fixed it. The firewall contexts
may have been confused by the Skinny packets passing through them, which
prevented the KeepAlive acknowledgements from the CallManagers from
reaching the IP phones and so caused the phones to reset.


Justin Steinberg

2005-11-16, 5:45 pm

This makes sense. I was confused as to why a small percentage of
phones were reporting to CallManager that they were not receiving
KeepAlive Acks. If they could report that info to CallManager they
should receive the acks....

So to the untrained eye, the 'StationInit' part of those trace messages
is a little misleading, because I read it as a message coming from the
phone. I did see one missed KA from the phone, although I thought
the phone only resets after three missed KAs. Or maybe the KA
didn't make it to CallManager and the TCP session broke, which caused
the phone to reset sooner.

I will keep an eye on the next occurrence to see what alarm message is
generated, as Wes suggested.

Justin


Kevin Thorngren

2005-11-16, 5:45 pm

Yes, it is confusing. IIRC, it is three KAs but they aren't all 30
seconds apart. The timeframe is shortened. Typically you will see
around 60 seconds between the last KA and the KA Timeout.

As Wes mentioned you can look at the secondary subscriber's CCM traces
and search for "Last=" around the timeframe of when the phone
unregistered. This will give you the reason for the reset. You don't
need to wait for the Alarm message.

Kevin

Wes Sisk

2005-11-16, 5:45 pm

_______________________________________________
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip
