Unix Programming - file offsets across fork()


Author file offsets across fork()
anoop.vijayan@gmail.com

2006-06-09, 7:23 pm

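#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */

int main(void)
{
    pid_t pid = fork();
    if (pid == 0)
    {
        /* child: raw write to descriptor 1, then a stdio flush */
        write(1, "I am Child", 10);
        fflush(stdout);
        _exit(0);
    }
    else if (pid > 0)
    {
        /* parent: same pattern on the same inherited descriptor */
        write(1, "I am Parent", 11);
        fflush(stdout);
        exit(0);
    }
    else
    {
        printf("fork failed\n");
        return 1;
    }
}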

Now, if I run ./a.out > op on a multiprocessor system, the content of
the file is
I am Childt

Note the 't' at the end of the string. It's obvious that the parent's data was
overwritten by the child.
I want to know if this behaviour is as per the UNIX standards.

Thanks
anoop

davids@webmaster.com

2006-06-09, 7:23 pm


anoop.vijayan@gmail.com wrote:

> write(1,"I am Child",10);
> fflush(stdout);


> Now, if I run ./a.out > op on a multiprocessor system, the content of
> the file is
> I am Childt
>
> Note the 't' at the end of the string. It's obvious that the parent's data was
> overwritten by the child.
> I want to know if this behaviour is as per the UNIX standards.


No, the result is totally arbitrary. You are combining buffered and
unbuffered I/O on the same file descriptor. This causes undefined behavior.
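
To illustrate what mixing the two layers can do in general, here is a minimal sketch. It is not the original poster's program: the helper name mixed_io_demo and the temporary file path are made up, and it assumes a POSIX system with a writable /tmp. Output sent through stdio sits in a user-space buffer until it is flushed, so a later write() on the same file can reach the kernel first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: sends "A" through stdio and "B" through write(2)
 * to one temporary file, then copies the file content into out.
 * Returns 0 on success, -1 on error. */
int mixed_io_demo(int flush_between, char *out, size_t outsz)
{
    char path[] = "/tmp/mixdemoXXXXXX";
    assert(out != NULL && outsz > 0);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* file lives until fd is closed */

    int dupfd = dup(fd);               /* same open file description, same offset */
    FILE *fp = (dupfd < 0) ? NULL : fdopen(dupfd, "w");
    if (fp == NULL) {
        if (dupfd >= 0)
            close(dupfd);
        close(fd);
        return -1;
    }

    fputs("A", fp);                    /* stays in the stdio buffer for now */
    if (flush_between)
        fflush(fp);                    /* hand "A" to the kernel first */
    ssize_t w = write(fd, "B", 1);     /* bypasses the stdio buffer */
    fclose(fp);                        /* flushes whatever is still buffered */

    ssize_t n = (w == 1) ? pread(fd, out, outsz - 1, 0) : -1;
    if (n >= 0)
        out[n] = '\0';
    close(fd);
    return (n < 0) ? -1 : 0;
}
```

Called with flush_between set to 0, the file should end up as "BA"; with 1 it should be "AB". In the program quoted above nothing is ever written through stdout, so the fflush() calls there have nothing to flush.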

DS

ed

2006-06-09, 7:23 pm

On 9 Jun 2006 14:37:17 -0700
anoop.vijayan@gmail.com wrote:

> Now, if I run ./a.out > op on a multiprocessor system, the content of
> the file is
> I am Childt
>
> Note the 't' at the end of the string. It's obvious that the parent's data
> was overwritten by the child.
> I want to know if this behaviour is as per the UNIX standards.


What were you expecting to happen? Both the child and parent open the
*same* file for writing at the *same* time. To solve this, you must use
semaphores to control the write access.

--
Regards, Ed :: http://www.bsdwarez.net
just another bash hacker
Say NO to LONGHORN/VISTA -- google this: "how microsoft is selling
out the public to please hollywood"

anoop.vijayan@gmail.com

2006-06-09, 7:23 pm

ed wrote:

> What were you expecting to happen? Both the child and parent open the
> *same* file for writing at the *same* time. To solve this, you must use
> semaphores to control the write access.
>


As per "Design of Unix OS" by Bach, the file offsets across fork are
shared by parent and child, and these processes never read or write the
same file offset values. I wanted to know if this holds for a
multiprocessor system.

Fletcher Glenn

2006-06-09, 7:23 pm


<anoop.vijayan@gmail.com> wrote in message
news:1149890038.433064.68910@j55g2000cwa.googlegroups.com...
> ed wrote:
>
>
> As per "Design of Unix OS" by Bach, the file offsets across fork are
> shared by parent and child, and these processes never read or write the
> same file offset values. I wanted to know if this holds for a
> multiprocessor system.
>


What you don't seem to understand is that the parent and child have the
same offset at the time of the fork, but this offset is handled independently
by parent and child. One does not update the other's offset. You are only
partly safe by fseek()ing or lseek()ing to the end of the file before
writing.
You are still exposed to one process overwriting the other process's output
unless you use something like file locking to prevent simultaneous writes.
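
A minimal sketch of the file-locking approach, assuming POSIX advisory record locks via fcntl() and a writable /tmp. The names locked_append and locked_append_demo are hypothetical. The sketch assumes each process has opened the file itself and therefore has a private offset; the lock makes the "seek to end, then write" pair exclusive between cooperating processes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Append len bytes at end of file while holding an advisory write lock. */
int locked_append(int fd, const char *buf, size_t len)
{
    struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };   /* l_len 0: whole file */
    assert(buf != NULL);
    if (fcntl(fd, F_SETLKW, &lk) == -1)               /* wait for the lock */
        return -1;
    int ok = lseek(fd, 0, SEEK_END) != (off_t)-1 &&
             write(fd, buf, len) == (ssize_t)len;
    lk.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &lk);                          /* release the lock */
    return ok ? 0 : -1;
}

/* Hypothetical demo: parent and child use separately opened descriptors
 * and append `rounds` records each. Returns the final file size or -1. */
long locked_append_demo(int rounds)
{
    char path[] = "/tmp/lockdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        int cfd = open(path, O_WRONLY);   /* independent open: private offset */
        if (cfd < 0)
            _exit(1);
        for (int i = 0; i < rounds; i++)
            if (locked_append(cfd, "child\n", 6) != 0)
                _exit(1);
        _exit(0);
    }

    int ok = pid > 0, status = 0;
    for (int i = 0; ok && i < rounds; i++)
        ok = locked_append(fd, "parent\n", 7) == 0;

    long size = -1;
    struct stat st;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && ok &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 && fstat(fd, &st) == 0)
        size = (long)st.st_size;
    unlink(path);
    close(fd);
    return size;
}
```

With 200 rounds the file should be exactly 200 * (6 + 7) = 2600 bytes, because no record can land on top of another. The locks are advisory: a process that writes without taking the lock is not stopped.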

--

Fletcher Glenn


Gordon Burditt

2006-06-10, 1:25 am

>>> What were you expecting to happen? Both the child and parent open the
>
>What you don't seem to understand is that the parent and child have the
>same offset at the time of the fork, but this offset is handled independently
>by parent and child.


Why? This is a shared file descriptor. The parent and child did
not both open the same file in the original posting. They share
the same already-open file descriptor.

Now, it could be that the fflush(stdout) messed with the file
pointer. You aren't supposed to mix buffered and unbuffered output
on the same file descriptor.

>One does not update the other's offset. You are only
>partly safe by fseek()ing or lseek()ing to the end of the file before
>writing.
>You are still exposed to one process overwriting the other process's output
>unless you use something like file locking to prevent simultaneous writes.


On a shared file descriptor, I don't see why this is necessary if neither
process is doing any seeking and both are doing write()s.
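
The sharing can be checked directly. The following is a small sketch, assuming a POSIX system with a writable /tmp; the function name parent_offset_after_child_write is made up for illustration. The child writes through the descriptor it inherited, and the parent, which never writes or seeks, then asks for the current offset.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the offset the parent sees on a descriptor
 * after a forked child has written msg through it, or -1 on error. */
long parent_offset_after_child_write(const char *msg)
{
    char path[] = "/tmp/offdemoXXXXXX";
    assert(msg != NULL);
    size_t len = strlen(msg);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives until fd is closed */

    pid_t pid = fork();
    if (pid == 0)                        /* child: write via the inherited fd */
        _exit(write(fd, msg, len) == (ssize_t)len ? 0 : 1);

    long off = -1;
    int status = 0;
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        off = (long)lseek(fd, 0, SEEK_CUR);   /* parent never wrote or seeked */
    close(fd);
    return off;
}
```

For the message "I am Child" the parent should see offset 10, since fork() duplicates the descriptor but both copies refer to one open file description with a single offset.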

Gordon L. Burditt

Nils O. Selåsdal

2006-06-10, 1:21 pm

anoop.vijayan@gmail.com wrote:
> ed wrote:
>
>
> As per "Design of Unix OS" by Bach, the file offsets across fork are
> shared by parent and child, and these processes never read or write the
> same file offset values. I wanted to know if this holds for a
> multiprocessor system.

POSIX guarantees atomic write(2)s on a file descriptor. You do
have some fflush() calls there, which work on a FILE*, not on
file descriptors (though underlying a FILE* is a descriptor).
Mixing these is dangerous.

It should be noted that Linux had (has?) a bug regarding just this:
http://lwn.net/Articles/180388/ (the same applies to fds shared via fork())

Jolting

2006-06-10, 1:21 pm

The file offset isn't incremented till just before write() returns.

anoop.vijayan@gmail.com wrote:
> #include <sys/types.h>
> #include <unistd.h>
> #include <stdio.h>
>
> int main()
> {
> pid_t pid=fork();
> if ( pid == 0 )
> {
> write(1,"I am Child",10);
> fflush(stdout);
> _exit(0);
> }
> else if (pid > 0)
> {
> write(1,"I am Parent",11);
> fflush(stdout);
> exit(0);
> }
> else
> {
> printf("fork failed");
> }
> }
>
> Now, if I run ./a.out > op on a multiprocessor system, the content of
> the file is
> I am Childt
>
> Note the 't' at the end of the string. It's obvious that the parent's data was
> overwritten by the child.
> I want to know if this behaviour is as per the UNIX standards.
>
> Thanks
> anoop


Daniel Rock

2006-06-10, 7:21 pm

Jolting <hunterlaux@gmail.com> wrote:
> The file offset isn't incremented till just before write() returns.


write() is an atomic function:

http://www.opengroup.org/onlinepubs...ions/write.html
http://www.opengroup.org/onlinepubs...tions/read.html

I/O is intended to be atomic to ordinary files and pipes and FIFOs. Atomic
means that all the bytes from a single operation that started out together
end up together, without interleaving from other I/O operations.

Two write()s of the same file description should never write at the same
offsets. But in the fork() example which process comes first is undefined.
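
If a fixed order is wanted, the parent can simply wait for the child before writing. A minimal sketch, assuming a POSIX system with a writable /tmp; ordered_write_demo is a hypothetical name. Because the child has exited before the parent writes, the two writes cannot overlap and the shared offset already points past the child's output.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: child writes first, parent writes after waitpid().
 * Copies the resulting file content into out. Returns 0 or -1. */
int ordered_write_demo(char *out, size_t outsz)
{
    const char *child_msg = "I am Child";
    const char *parent_msg = "I am Parent";
    char path[] = "/tmp/orddemoXXXXXX";
    assert(out != NULL && outsz > 0);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                         /* file lives until fd is closed */

    pid_t pid = fork();
    if (pid == 0) {
        size_t len = strlen(child_msg);
        _exit(write(fd, child_msg, len) == (ssize_t)len ? 0 : 1);
    }

    int status = 0;
    ssize_t n = -1;
    size_t plen = strlen(parent_msg);
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&   /* child is done */
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        write(fd, parent_msg, plen) == (ssize_t)plen) {
        n = pread(fd, out, outsz - 1, 0);               /* read back from 0 */
        if (n >= 0)
            out[n] = '\0';
    }
    close(fd);
    return (n < 0) ? -1 : 0;
}
```

The file content should then be "I am ChildI am Parent" on every run.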

--
Daniel

davids@webmaster.com

2006-06-10, 7:21 pm


Daniel Rock wrote:

> I/O is intended to be atomic to ordinary files and pipes and FIFOs. Atomic
> means that all the bytes from a single operation that started out together
> end up together, without interleaving from other I/O operations.


Where do you see anything that suggests that I/O should be atomic to
ordinary files? That is 100% untrue.

DS

Daniel Rock

2006-06-11, 1:23 am

davids@webmaster.com wrote:
>
> Daniel Rock wrote:
>
>
> Where do you see anything that suggests that I/O should be atomic to
> ordinary files? That is 100% untrue.


It seems 100% true that you don't actually read the standards.

It is a requirement of the standard. The three lines above are cited from
the standard.

--
Daniel

Brian Raiter

2006-06-11, 1:23 am

> write() is an atomic function:
>
> http://www.opengroup.org/onlinepubs...ions/write.html
> http://www.opengroup.org/onlinepubs...tions/read.html
>
> I/O is intended to be atomic to ordinary files and pipes and FIFOs.


I looked at the web page for write quoted above. It says:

An attempt to write to a pipe or FIFO has several major
characteristics: [...] This volume of IEEE Std 1003.1-2001 does
not say whether write requests for more than {PIPE_BUF} bytes are
atomic, but requires that writes of {PIPE_BUF} or fewer bytes
shall be atomic.

The text refers to pipes and FIFOs, but not regular files. I also
found this text, near the bottom:

This volume of IEEE Std 1003.1-2001 does not specify behavior of
concurrent writes to a file from multiple processes.

This seems to be consistent with my own experience. (Namely, that
multiple processes + single file - locking = mish-mash.)
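
The pipe guarantee quoted above can be exercised with a small sketch. It assumes a POSIX system where PIPE_BUF is defined in <limits.h>; the record size of 64 bytes and the name count_intact_pipe_records are arbitrary choices for illustration. Several writers push fixed-size records into one pipe, and the reader checks that no record contains bytes from two writers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

enum { REC = 64 };   /* record size, chosen well below PIPE_BUF */

/* Read exactly n bytes unless EOF or an error intervenes. */
static ssize_t read_full(int fd, char *buf, size_t n)
{
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);
        if (r < 0)
            return -1;
        if (r == 0)
            break;
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Returns the number of intact records read, or -1 on error or if any
 * record was torn. Writer w fills its records with the letter 'a' + w. */
int count_intact_pipe_records(int writers, int per_writer)
{
    int p[2];
    assert(REC <= PIPE_BUF && writers > 0 && writers <= 26);
    if (pipe(p) != 0)
        return -1;

    int failed = 0;
    for (int w = 0; w < writers && !failed; w++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
        } else if (pid == 0) {
            char rec[REC];
            close(p[0]);
            memset(rec, 'a' + w, REC);
            for (int i = 0; i < per_writer; i++)
                if (write(p[1], rec, REC) != REC)   /* one atomic record */
                    _exit(1);
            _exit(0);
        }
    }
    close(p[1]);                 /* reader sees EOF once all writers exit */

    int intact = 0, torn = 0;
    char rec[REC];
    ssize_t n = 0;
    while (!failed && (n = read_full(p[0], rec, REC)) == REC) {
        for (int i = 1; i < REC; i++)
            if (rec[i] != rec[0])
                torn = 1;        /* bytes from two writers in one record */
        intact++;
    }
    close(p[0]);
    while (wait(NULL) > 0)
        ;                        /* reap the writers */
    return (failed || n != 0 || torn) ? -1 : intact;
}
```

With 4 writers and 500 records each, the function should return 2000. This says nothing about regular files, which is exactly the distinction drawn above.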

b

davids@webmaster.com

2006-06-11, 1:23 am


Daniel Rock wrote:

> It seems 100% true that you don't actually read the standards.


On the contrary, I read them with extreme care.

> It is a requirement of the standard. The three lines above are cited from
> the standard.


This is contained in a rationale section and as such cannot state a
requirement. If you read the preface, it explains that the rationale
sections are purely informative.

You would not want file reads and writes to be atomic. Imagine if one
process or thread does a huge read from a file and another thread tries
to modify a single byte at the same time. You do not want the single
byte write having to wait until the huge read is finished.

DS

Nils O. Selåsdal

2006-06-11, 7:24 am

davids@webmaster.com wrote:
> Daniel Rock wrote:
>
>
> On the contrary, I read them with extreme care.
>
>
> This is contained in a rationale section and as such cannot state a
> requirement. If you read the preface, it explains that the rationale
> sections are purely informative.
>
> You would not want file reads and writes to be atomic. Imagine if one
> process or thread does a huge read from a file and another thread tries
> to modify a single byte at the same time. You do not want the single
> byte write having to wait until the huge read is finished.


Yet that is how most[1] Unixes implement it.

[1] At least those supporting the POSIX threads extension, as that has
a hard requirement on atomic and thread-safe read/write functions,
which Unix kernels rarely handle differently whether the descriptor
is shared by threads or processes.

anoop.vijayan@gmail.com

2006-06-11, 1:27 pm

What I observe is that the two streams of bytes were written one after the
other, but the offset was the same for both, which caused data loss.
This implies that the critical section in the write() call does
not cover the offset. One process had already entered the call with
a known offset value and waited for another write to complete, which
modified the offset. The former remains unaware of the modified offset and
writes to the old offset. Now, the question is whether that can be called
atomic.
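
One way to sidestep a stale offset is append mode, where the position is set to end of file as part of each write. For the original test this roughly corresponds to running ./a.out >> op instead of ./a.out > op. The sketch below assumes a POSIX system and a local filesystem under /tmp (network filesystems may behave differently); append_demo is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: parent and child write `rounds` records each through
 * one shared descriptor in O_APPEND mode. Returns the file size or -1. */
long append_demo(int rounds)
{
    char path[] = "/tmp/appdemoXXXXXX";
    assert(rounds >= 0);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                         /* file lives until fd is closed */

    /* Turn on append mode for the shared open file description. */
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_APPEND) == -1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        for (int i = 0; i < rounds; i++)
            if (write(fd, "I am Child\n", 11) != 11)
                _exit(1);
        _exit(0);
    }

    int ok = pid > 0, status = 0;
    for (int i = 0; ok && i < rounds; i++)
        ok = write(fd, "I am Parent\n", 12) == 12;

    long size = -1;
    struct stat st;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && ok &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 && fstat(fd, &st) == 0)
        size = (long)st.st_size;          /* every byte written is counted */
    close(fd);
    return size;
}
```

With 500 rounds the size should be 500 * (11 + 12) = 11500 bytes. Any write that landed on an old offset would overwrite existing data and leave the file shorter than that.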

- anoop

davids@webmaster.com

2006-06-12, 1:30 am


Nils O. Selåsdal wrote:

> davids@webmaster.com wrote:


> Yet that is how most[1] Unixes implement it.


Not in my experience. In my experience, a large write to a file does
not block concurrent reads, even to parts of the file that the write
will cover or has covered. The UNIX way is cooperative locking.

> [1] At least those supporting the POSIX threads extension, as that has
> a hard requirement on atomic and thread-safe read/write functions,
> which Unix kernels rarely handle differently whether the descriptor
> is shared by threads or processes.


Where do you find this requirement (for ordinary files)?

DS

Nils O. Selåsdal

2006-06-12, 7:26 am

davids@webmaster.com wrote:
> Nils O. Selåsdal wrote:
>
>
>
>
> Not in my experience. In my experience, a large write to a file does
> not block concurrent reads, even to parts of the file that the write
> will cover or has covered. The UNIX way is cooperative locking.

Thanks to the caches, large writes are fast - it's not like
they have to wait until it's committed to the disk.

>
> Where do you find this requirement (for ordinary files)?


http://www.opengroup.org/onlinepubs..._chap02_09.html

(Read "write(), thread safety, and POSIX" -
http://lwn.net/Articles/179829/ if you feel like it, too.)

Thomas Maier-Komor

2006-06-12, 7:26 am

Gordon Burditt wrote:
>
> Now, it could be that the fflush(stdout) messed with the file
> pointer. You aren't supposed to mix buffered and unbuffered output
> on the same file descriptor.
>


Mixing stream and raw file descriptor operations is dangerous on some
implementations, because they are not synchronized, although they
should be. But in this specific example the stream objects are only used
within fflush(), which merely causes any buffered data to be written to the
associated files. And as pointed out before, write operations must be
atomic as specified in IEEE 1003.1, so the flush may not cause the
independent write()s to be overwritten.

In consequence the observed behavior only shows the very well known
Linux bug that some people seem to be unwilling to fix.

Tom

Copyright 2003 - 2008 webservertalk.com