Specifying the ID when creating a message queue
02-03-05 12:52 PM
Hi all,
Is it possible to create a message queue with a specific ID in C? I
want to do this because I'm trying to write a piece of software which
restores communicating processes (which communicate through message
queues) when there is a machine failure. When restarting the machine I
need to set up the message queues as they were originally.
I find that given the same key, when creating a queue, the queue ID is
not always the same value. Is there any way to specify this value? The
only solution I can think of at the moment is to store the original
queue ID as a variable and repeatedly create queues until the original
message queue ID is used.
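For illustration, here is a minimal sketch of why the ID cannot be chosen through the portable System V interface: msgget() accepts only a key and flags, and the kernel assigns the ID. The helper names queue_open and queue_recreate are hypothetical, and the 0600 permission is an assumption.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Create (or open) the queue for `key`. The kernel picks the ID;
 * there is no argument through which a caller could request one. */
int queue_open(key_t key)
{
    return msgget(key, IPC_CREAT | 0600);
}

/* Remove the queue for `key` and create it again, as would happen
 * across a reboot. Stores the ID before and after.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
int queue_recreate(key_t key, int *old_id, int *new_id)
{
    *old_id = queue_open(key);
    if (*old_id == -1)
        return -1;
    if (msgctl(*old_id, IPC_RMID, NULL) == -1)
        return -1;
    *new_id = queue_open(key);
    return *new_id == -1 ? -1 : 0;
}
```

After queue_recreate() succeeds, the two IDs may or may not be equal; the only thing that stays stable is that msgget() with the same key returns the current ID.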
Thanks in advance.
Re: Specifying the ID when creating a message queue
02-05-05 10:49 PM
Thanks Jens.
I think I need the queue ID because when a process is restored whilst
it was still running, it will have no knowledge that the queue ID has
changed. So how does the process access the queue if it is not aware of
the queue ID change? When restoring foreign applications, there is no
way to force them to use ftok to get the new queue ID.
I understand that if the ID is already in use, then a new queue cannot
be created with that ID, so I think this just might have to be a
limitation of the software.
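For a process that does cooperate, looking the queue up again after a restore could look roughly like the sketch below. The function name queue_id_from_path is hypothetical, and the path and project ID are whatever the application originally passed to ftok().

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Derive the key from (path, proj_id) with ftok() and return the ID of
 * the queue currently bound to that key, creating the queue if needed.
 * Returns -1 on error (for example if `path` does not exist). */
int queue_id_from_path(const char *path, int proj_id)
{
    key_t key = ftok(path, proj_id);
    if (key == (key_t)-1)
        return -1;
    return msgget(key, IPC_CREAT | 0600);
}
```

A process that calls this on every start never needs to remember an ID across a restore. A foreign application that cached the ID in its own memory gets no benefit from it, which is the limitation described above.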
Re: Specifying the ID when creating a message queue
02-09-05 10:57 PM
Jens.Toerring@physik.fu-berlin.de wrote:
> Sorry, but I don't understand what "a process is restored whilst it
> was still running" is supposed to mean...
I'm sorry if I haven't explained things clearly. I'm trying to write a
program that takes checkpoints of other running (communicating)
processes. At the moment only message queues are being considered. The
idea is that, particularly for long-running applications, the user can
save intermediate states of his running programs, to be restored at
some later time. Incidentally, this should cater for failure by allowing
the user to restore the last successful checkpoint. What I meant to say
in the last post is that "processes are periodically checkpointed as
they run", i.e. the state of the communicating processes is saved,
including the queues. The queues need to be saved as well, because a
user should be able to stop their computation (and subsequently clean
up any queues) and restore from the checkpoint. This could be useful in
the case where a system needs to be shut down for maintenance.
> When one or all of your processes using the message queue die and get
> restarted the message queue is still there, and getting at it with the
> common key should work just fine. The message queue only vanishes when
> it gets either actively deleted (using msgctl() with IPC_RMID or the
> ipcrm utility) or when the machine is rebooted - if none of this
> happens it will still be there, even when all the processes that used
> it are dead (message queues like shared memory or semaphores are stored
> in the kernel, they don't belong to a specific process).
The software being written tries to cater for machine failure, so an
instance where the machine requires rebooting would not be unusual. It
is very likely that the message queues would be destroyed.
> Well, a process waiting on a queue that got deleted will return with
> an error and errno set to EIDRM. All further accesses to the deleted
> queue should result in an error with errno set to EINVAL. That way
> the application can figure out that the queue got removed behind its
> back. But what should change the queue ID in the first place?
> How do these foreign applications get at the message queue at all?
> They use either a well-known key to specify which queue they want or
> IPC_PRIVATE if they always create new ones. No non-braindead
> application will ever care about the ID of the queue.
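The error behaviour described in the quoted text can be observed with a small sketch like the one below. The struct text_msg layout and the try_send name are hypothetical; the exact errno after removal is reported by the kernel, so the caller should treat both EINVAL and EIDRM as "the queue is gone".

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Message layout assumed for this sketch: a type followed by text. */
struct text_msg {
    long mtype;        /* must be greater than 0 */
    char mtext[32];
};

/* Try to send one message without blocking.
 * Returns 0 on success, otherwise the errno value set by msgsnd(). */
int try_send(int qid, const char *text)
{
    struct text_msg m = { .mtype = 1 };

    /* mtext was zero-initialised, so the copy stays NUL-terminated. */
    strncpy(m.mtext, text, sizeof m.mtext - 1);
    if (msgsnd(qid, &m, sizeof m.mtext, IPC_NOWAIT) == -1)
        return errno;
    return 0;
}
```

Calling try_send() on an ID whose queue has been removed with msgctl(..., IPC_RMID, ...) returns a non-zero errno value, which is how a cooperating application could notice that it must look the queue up again by key.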
A checkpoint represents a program's state, recorded at some point as it
ran. If a program has created a queue with a key, it needs the ID to
send to or receive from that queue. When the program is restored from a
checkpoint, the newly restored process is going to use the same ID as it
did when the checkpoint was taken. Therefore the recreated queue
requires the same ID for the given key. The alternative would be to
change the value of the ID in the restored process to the ID of the
replacement message queue; however, this isn't a direction I want to
pursue.
> If the foreign application is creating a new message queue your programs
> are accessing and the foreign application dies and you want to restart
> it while you keep your programs running then you must delete the old
> message queue after the foreign application got restarted and created
> a new one and handle failure to read from or write to the old queue in
> your programs - have them access the new one instead. But I probably
> still do not understand what exactly the problem is you are trying to
> solve...
> Regards, Jens
> --
> \ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
> \__________________________ http://www.toerring.de
Re: Specifying the ID when creating a message queue
Jens.Toerring@physik.fu-berlin.de
02-10-05 10:58 PM
Tingo <ting.hau@gmail.com> wrote:
> I'm sorry if I haven't explained things clearly. I'm trying to write a
> program that takes checkpoints of other running (communicating)
> processes.
I was already fearing something like this. Checkpointing can be
extremely difficult...
> At the moment only message queues are being considered. The
> idea is that, particularly
> for long-running applications the user can save intermediate states of
> his running programs, to be restored at some later time. Incidentally
> this should cater for failure by allowing the user to restore the last
> successful checkpoint. What I meant to say in the last post is that
> "processes are periodically checkpointed as they run", i.e. the state
> of the communicating processes are saved, including the queues. The
> queues need to be saved as well, because a user should be able to stop
> their computation (and subsequently clean up any queues) and restore
> from the checkpoint. This could be useful in the case where a system
> needs to be shutdown for maintenance.
Here the problem is that the message queue doesn't belong to one of
the processes. It's in the kernel and is independent of any of the
processes once it has been created. And as far as I can see, saving
a message queue would involve reading all the messages, which get
destroyed while doing that, so they won't be available to the processes
anymore. To save a message queue you probably have to put all processes
that use it to sleep (otherwise some of them might get in the way and
change the queue while you try to save it) and read all messages from
it. Then recreate the queue by resending the messages in the same
sequence - but there's still the problem that you can't set all the
fields of the structure associated with the message queue to the
original values, and if one of the programs relies on these fields it
may not work correctly. Finally you have to wake up all the programs,
probably saving their current state at that moment (that seems to be
the only moment when it can be done).
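The read-then-resend step could be sketched as follows. This is only an outline under stated assumptions: all users of the queue are already stopped, no message is longer than SAVED_TEXT_MAX bytes, and the names wire_msg, saved_msg, queue_drain and queue_refill are made up for the example. It does not restore the bookkeeping fields of the queue's msqid_ds structure.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define SAVED_TEXT_MAX 256   /* assumed upper bound on message size */
#define SAVED_MSGS_MAX 64    /* assumed upper bound on queue depth */

struct wire_msg {            /* layout expected by msgsnd()/msgrcv() */
    long mtype;
    char mtext[SAVED_TEXT_MAX];
};

struct saved_msg {           /* one checkpointed message */
    size_t len;              /* number of bytes used in msg.mtext */
    struct wire_msg msg;
};

/* Read messages from `qid` in arrival order into out[], up to `max`.
 * Returns the number saved, or -1 on error. Draining empties the queue;
 * if `max` is reached, remaining messages stay in the queue. */
int queue_drain(int qid, struct saved_msg *out, int max)
{
    int n = 0;
    while (n < max) {
        /* msgtyp 0 means "first message in the queue", whatever its type */
        ssize_t len = msgrcv(qid, &out[n].msg, SAVED_TEXT_MAX, 0, IPC_NOWAIT);
        if (len == -1) {
            if (errno == ENOMSG)
                break;       /* queue is empty: finished */
            return -1;
        }
        out[n].len = (size_t)len;
        n++;
    }
    return n;
}

/* Send the saved messages back in their original order.
 * Returns 0 on success, -1 on the first failure. */
int queue_refill(int qid, const struct saved_msg *in, int count)
{
    for (int i = 0; i < count; i++)
        if (msgsnd(qid, &in[i].msg, in[i].len, IPC_NOWAIT) == -1)
            return -1;
    return 0;
}
```

A checkpointer would call queue_drain(), write the array to stable storage, and then call queue_refill() on the same queue (to let the processes continue) or on a newly created one (after a reboot).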
Since the timing here is non-deterministic, you can't guarantee that
things work exactly the same way after all programs are restarted from
one of the checkpoints. You never know when a process is going to read
from the message queue, and there might be situations where two
processes want to read the same message at the same time, so what
happens afterwards in the competing processes depends on which of them
comes first. Did you consider what happens when you restart from the
same checkpoint twice and in the first case process A gets the message
it's competing for with process B, but in the second case, due to
slight timing differences, process B gets it instead? Is that
acceptable?
I guess there's a lot of headache coming your way to get that right
under all possible circumstances...
> The software being written tries to cater for machine failure, so an
> instance where the machine requires rebooting would not be unusual. It
> is very likely that the message queues would be destroyed.
If the machine gets rebooted the message queue doesn't exist anymore.
Definitely.
> A checkpoint represents a program's state, which was recorded at some
> point as it ran. If a program has created a queue with a key, to send
> or receive from that queue, the ID is needed. When the program is
> restored from a checkpoint the newly restored process is going to use
> the same ID as it did when the checkpoint was taken. Therefore the
> recreated queue requires the same ID for the given key. The alternative
> to this would be to change the value of the ID in the restored process
> to the ID of the newly replaced message queue, however this isn't a
> direction I want to pursue.
Why? It can easily use the ID it gets when it uses the same key instead.
And as far as I can see, that's the only sane approach. You have to do
a lot of work on restart anyway (you've got to reopen files and put the
position to the correct places, you have to reallocate all the memory
needed etc. etc.), so that would be only a small additional task.
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Re: Specifying the ID when creating a message queue
02-16-05 10:57 PM
Thank you for taking the time to reply to my posts. I've taken on board
the issues you've raised and I think I have a good idea for an
implementation.
Thanks again.
All times are GMT.