04-20-05 10:52 PM
Hello Lawrie,
> I should have mentioned that my application sends periodic heartbeat
> messages.
Are you heartbeating your remote app? Great!
With TCP?
Do you have a redundant network path?
> I can monitor the state of the connection to the remote host
> by examining the value of errno (if the write fails) and take the
> relevant action.
Assuming that you are using TCP, the heartbeat could fail for one of these reasons:
(1) The remote host has crashed (for instance, kernel panic, sudden
power off etc.)
(2) The remote app has crashed (e.g. SIGSEGV), but the remote host is OK.
(3) The remote app has closed or shut down the socket.
(4) There is a networking failure (network cable unplugged, NIC
problem that did not lead to a kernel panic, etc.)
Unless you are using a redundant network path, you have no means to
distinguish all four cases from one another. However, you can tell whether
it is case 1) or 4), as opposed to case 2) or 3).
Cases 1) and 4) can only be detected by a timeout, whereas cases 2) and
3) can be detected through the EPIPE error condition...
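To make the distinction concrete, here is a minimal C sketch, not a definitive implementation. The names hb_status, hb_classify, hb_send and hb_wait_reply are made up for illustration, and it assumes the remote app answers each heartbeat. It also assumes MSG_NOSIGNAL is available (Linux, POSIX.1-2008) so that a write to a dead peer returns EPIPE instead of raising SIGPIPE. Treating ECONNRESET like EPIPE, and ETIMEDOUT/EHOSTUNREACH/ENETUNREACH like a timeout, is my own grouping.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result type: which group of cases a failure belongs to. */
enum hb_status {
    HB_OK,           /* heartbeat sent, or reply received */
    HB_PEER_CLOSED,  /* cases 2) and 3): the peer's socket is gone */
    HB_TIMED_OUT,    /* cases 1) and 4): host or network not answering */
    HB_OTHER_ERROR   /* anything else, e.g. EAGAIN or EINTR */
};

/* Map an errno value to a case group (grouping beyond EPIPE is an assumption). */
enum hb_status hb_classify(int err)
{
    switch (err) {
    case 0:
        return HB_OK;
    case EPIPE:
    case ECONNRESET:
        return HB_PEER_CLOSED;
    case ETIMEDOUT:
    case EHOSTUNREACH:
    case ENETUNREACH:
        return HB_TIMED_OUT;
    default:
        return HB_OTHER_ERROR;
    }
}

/* Send one heartbeat. MSG_NOSIGNAL turns SIGPIPE into an EPIPE return.
 * A short write is treated as success here to keep the sketch small. */
enum hb_status hb_send(int fd, const void *msg, size_t len)
{
    ssize_t n = send(fd, msg, len, MSG_NOSIGNAL);
    return hb_classify(n < 0 ? errno : 0);
}

/* Wait up to timeout_ms for the peer's reply to a heartbeat. */
enum hb_status hb_wait_reply(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready == 0)
        return HB_TIMED_OUT;      /* nothing arrived: case 1) or 4) */
    if (ready < 0)
        return HB_OTHER_ERROR;    /* poll itself failed */

    char buf[64];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n > 0)
        return HB_OK;
    if (n == 0)
        return HB_PEER_CLOSED;    /* end of file: peer closed its socket */
    return hb_classify(errno);
}
```

You would call hb_send() on every heartbeat tick and then hb_wait_reply() with whatever timeout suits your application. Keep in mind that with TCP the first write after the remote app dies may still succeed, so HB_PEER_CLOSED can show up one heartbeat late.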
Regards,
Loic.