Unix Programming - how to write to a file without race condition


googler

2004-07-15, 2:50 am

how to make a parent and all its children write to a file without race
condition and data inconsistency

Lev Walkin

2004-07-15, 2:50 am

googler wrote:
> how to make a parent and all its children write to a file without race
> condition and data inconsistency
>


man flock, man lockf, man fcntl

--
Lev Walkin
vlm@lionet.info
Frank Cusack

2004-07-15, 2:50 am

On 14 Jul 2004 22:32:50 -0700 "googler" <arun_ccjl@yahoo.co.in> wrote:
> how to make a parent and all its children write to a file without race
> condition and data inconsistency


choose /dev/null as the file
Alex Colvin

2004-07-16, 5:53 pm

>how to make a parent and all its children write to a file without race
>condition and data inconsistency


On at least some UNIX systems (Solaris),
lseek(...,SEEK_END)
followed by
write(...)

writes the the end of the file at the time of the write, not at the time
of the lseek. Several such writes will not clobber each other.
--
mac the naïf
Dan Mercer

2004-07-16, 5:53 pm


"googler" <arun_ccjl@yahoo.co.in> wrote in message news:cd54u2$scd@odbk17.prod.google.com...
: how to make a parent and all its children write to a file without race
: condition and data inconsistency
:

Ignore the other answers. Open the file with the O_APPEND flag:

from man write(2):

If the O_APPEND flag of the file status flags is set, the
file offset will be set to the end of the file prior to each
write and no intervening file modification operation will
occur between changing the file offset and the write opera-
tion.


Dan Mercer


Jens.Toerring@physik.fu-berlin.de

2004-07-16, 5:53 pm

Dan Mercer <dmercer@mn.rr.com> wrote:

> "googler" <arun_ccjl@yahoo.co.in> wrote in message news:cd54u2$scd@odbk17.prod.google.com...
> : how to make a parent and all its children write to a file without race
> : condition and data inconsistency
> :


> Ignore the other answers. Open the file with the O_APPEND flag:


> from man write(2):


or open(2) (at least on my machine)

> If the O_APPEND flag of the file status flags is set, the
> file offset will be set to the end of the file prior to each
> write and no intervening file modification operation will
> occur between changing the file offset and the write opera-
> tion.


Some man pages warn that this may not work with NFS-mounted files
since at least some NFS implementations do not support appending
and the client kernel has to fake it by lseek/write calls which
are then open to the race condition again. Then a lock with fcntl()
might be the only safe thing (given that the NFS implementation
supports fcntl()-locks).
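
For reference, a rough sketch of what the fcntl() fallback could look like: take an exclusive advisory lock on the whole file, seek to the end, write, unlock. The helper names (lock_whole_file, locked_append) are invented here. It only helps if every writer follows the same protocol, and whether the lock is honoured over NFS depends on the implementation, as noted above.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply or release an advisory lock covering the whole file. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;         /* F_WRLCK or F_UNLCK */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);   /* blocks until the lock is granted */
}

/* Append len bytes while holding an exclusive lock on the file. */
int locked_append(int fd, const void *buf, size_t len)
{
    int rc = -1;

    if (lock_whole_file(fd, F_WRLCK) == -1)
        return -1;
    /* The seek and the write are now protected from other lockers. */
    if (lseek(fd, 0, SEEK_END) != (off_t)-1 &&
        write(fd, buf, len) == (ssize_t)len)
        rc = 0;
    if (lock_whole_file(fd, F_UNLCK) == -1)
        rc = -1;
    return rc;
}
```

Note that fcntl() locks belong to a process and are not inherited across fork(), so each child has to call locked_append() (and thereby take the lock) itself.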
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Mohun Biswas

2004-07-16, 5:53 pm

Dan Mercer wrote:
> "googler" <arun_ccjl@yahoo.co.in> wrote in message news:cd54u2$scd@odbk17.prod.google.com...
> : how to make a parent and all its children write to a file without race
> : condition and data inconsistency
> :
>
> Ignore the other answers. Open the file with the O_APPEND flag:
>
> from man write(2):
>
> If the O_APPEND flag of the file status flags is set, the
> file offset will be set to the end of the file prior to each
> write and no intervening file modification operation will
> occur between changing the file offset and the write opera-
> tion.


But note that this leaves room (as I read it) for multiple processes to
compete for non-atomic writes. I.e. if one process opens the file for
append and then makes two write() system calls, each is guaranteed
atomic but the two together are not; another process could manage to
insert some data between them. This is particularly a concern when using
stdio since buffering is imposed under the covers and thus what looks
like a single fwrite/fprintf to the user may in fact result in multiple
write() system calls.

In my app I use a combination of O_APPEND and fcntl locking and it seems
to work fine (but my app only writes to local files).

--
Thanks,
M.Biswas
James Antill

2004-07-16, 8:50 pm

On Fri, 16 Jul 2004 15:48:25 +0000, Alex Colvin wrote:

>
> On at least some UNIX systems (Solaris),
> lseek(...,SEEK_END)
> followed by
> write(...)
>
> writes the the end of the file at the time of the write, not at the time
> of the lseek.


This is true.

> Several such writes will not clobber each other.


This is not true.
The "end of file" will be relative to the end of the file at the time
lseek was called. So multiple processes might run the lseek call, then
each will start the write at the same position.

--
James Antill -- james@and.org
Need an efficient and powerful string library for C?
http://www.and.org/vstr/

James

2004-07-17, 5:52 pm

On Fri, 16 Jul 2004 17:02:17 -0400, James Antill
<james-netnews@and.org> wrote:

>On Fri, 16 Jul 2004 15:48:25 +0000, Alex Colvin wrote:
>
>
> This is true.
>
>
> This is not true.
> The "end of file" will be relative to the end of the file at the time
>lseek was called.


No - as above, the end of file is that at the time write() is called.

>So multiple processes might run the lseek call, then
>each will start the write at the same position.


No, each of them will atomically seek to the current EOF, append their
data, then the next one will write at the new EOF. They won't
overwrite each other unless they're running over NFS (which introduces
a race condition) or they used lseek with an explicit offset rather
than SEEK_END.


James.
James Antill

2004-07-17, 5:52 pm

On Sat, 17 Jul 2004 16:34:32 +0100, James wrote:

> On Fri, 16 Jul 2004 17:02:17 -0400, James Antill
> <james-netnews@and.org> wrote:
>
>
> No - as above, the end of file is that at the time write() is called.


No, it writes the end of the file ... so if you extend the file, the hole
isn't created by the seek. But the return value is defined as:

Upon successful completion, the resulting offset, as meas-
ured in bytes from the beginning of the file, is returned.

...this means the seek to END can't happen in an atomic transaction with
the write call. O_APPEND is there for a reason.

>
> No, each of them will atomically seek to the current EOF, append their
> data, then the next one will write at the new EOF.


While I don't currently have access to try it myself on a Solaris
box, feel free to run a few instances of the following. I'm 99% sure
Solaris isn't completely broken in the way you suggest, and the file will
only grow by 50 bytes for the entire interval...

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <err.h>

#define CONF_FNAME "abcd"
#define CONF_DATA "123456789 123456789 123456789 123456789 1234567890"
#define CONF_INTERVAL 10

int main(void)
{
    /* Deliberately no O_APPEND: the offset set by lseek() is used as-is. */
    int fd = open(CONF_FNAME, O_WRONLY);

    if (fd == -1)
        err(EXIT_FAILURE, "open");
    /* Position at the current end of file... */
    if (lseek(fd, 0, SEEK_END) == -1)
        err(EXIT_FAILURE, "seek");
    /* ...then wait, so other instances can seek to the same offset. */
    sleep(CONF_INTERVAL);
    if (write(fd, CONF_DATA, sizeof(CONF_DATA) - 1) == -1)
        err(EXIT_FAILURE, "write");

    return EXIT_SUCCESS;
}

--
James Antill -- james@and.org
Need an efficient and powerful string library for C?
http://www.and.org/vstr/

James

2004-07-17, 5:52 pm

On Sat, 17 Jul 2004 13:40:28 -0400, James Antill
<james-netnews@and.org> wrote:

>On Sat, 17 Jul 2004 16:34:32 +0100, James wrote:
>
>
> No, it writes the end of the file ... so if you extend the file, the hole
>isn't created by the seek. But the return value is defined as:
>
> Upon successful completion, the resulting offset, as meas-
> ured in bytes from the beginning of the file, is returned.
>
>...this means the seek to END can't happen in an atomic transaction with
>the write call.


Actually it doesn't necessarily mean that - lseek(SEEK_END) *could* be
implemented as setting the O_APPEND flag, as the previous poster
appeared to believe, while still returning the current size of the
file. According to the man pages (for both Solaris and Linux) and
experimentation on Linux, however, this isn't the case.

>
> While I don't currently have access to try it myself on a Solaris
>box, feel free to run a few instances of the following. I'm 99% sure
>Solaris isn't completely broken in the way you suggest, and the file will
>only grow by 50 bytes for the entire interval...


You're right this time round: when Alex Colvin said that
lseek(SEEK_END)...write() would write "the end of the file at the time
of the write, not at the time of the lseek", which you said was right,
you were both wrong, and you were right when you then said "This is
not true. The "end of file" will be relative to the end of the file at
the time lseek was called. So multiple processes might run the lseek
call, then each will start the write at the same position."


James.
Casper H.S. Dik

2004-07-17, 5:52 pm

James <jas@spamcop.net> writes:

>No, each of them will atomically seek to the current EOF, append their
>data, then the next one will write at the new EOF. They won't
>overwrite each other unless they're running over NFS (which introduces
>a race condition) or they used lseek with an explicit offset rather
>than SEEK_END.



Why do you believe that lseek(..., SEEK_END) has this magic
property? lseek sets the file pointer and the only thing "SEEK_END"
does is to set the file pointer relative to the current end.
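
The difference can be shown inside a single process by opening the file twice, which gives two independent file offsets much like two unrelated processes would have. This is only an illustrative sketch: the function name size_after_two_writers and the "AAAAA"/"BBBBB" payloads are made up, and it assumes a local filesystem.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Truncate path, let two independent descriptors each seek to the end
   and then write 5 bytes, and return the resulting file size.
   extra_flags is 0 for plain lseek()/write(), or O_APPEND. */
long size_after_two_writers(const char *path, int extra_flags)
{
    struct stat st;
    long result = -1;
    int a = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0644);
    int b = open(path, O_WRONLY | extra_flags);

    if (a != -1 && b != -1 &&
        lseek(a, 0, SEEK_END) != (off_t)-1 &&
        lseek(b, 0, SEEK_END) != (off_t)-1 &&   /* both offsets are now 0 */
        write(a, "AAAAA", 5) == 5 &&
        write(b, "BBBBB", 5) == 5 &&
        fstat(a, &st) == 0)
        result = (long)st.st_size;

    if (a != -1)
        close(a);
    if (b != -1)
        close(b);
    return result;
}
```

With extra_flags set to 0 the second write lands on top of the first and the file ends up 5 bytes long; with O_APPEND each write goes to the end as it is at write time and the file ends up 10 bytes long.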

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
James

2004-07-18, 7:50 am

On 17 Jul 2004 23:08:30 GMT, Casper H.S. Dik <Casper.Dik@Sun.COM>
wrote:

>James <jas@spamcop.net> writes:
>
>
>Why do you believe that lseek(..., SEEK_END) has this magic
>property?


That's what Alex Colvin said earlier in the thread. It *doesn't* have
this property (which is hardly "magic", since it amounts to setting
the O_APPEND flag!), but until I checked for myself his explanation
seemed reasonable, particularly in the absence of any challenge until
yours. (James Antill *agreed* with it, then changed his mind
mid-post...)

> lseek sets the file pointer and the only thing "SEEK_END"
>does is to set the file pointer relative to the current end.


Correct. As I said earlier, to achieve the result described you need
to set O_APPEND rather than using lseek().


James.
James

2004-07-18, 7:50 am

On Fri, 16 Jul 2004 19:33:18 GMT, Mohun Biswas <m.biswas@invalid.addr>
wrote:

>Dan Mercer wrote:
>
>But note that this leaves room (as I read it) for multiple processes to
>compete for non-atomic writes. I.e. if one process opens the file for
>append and then makes two write() system calls, each is guaranteed
>atomic but the two together are not; another process could manage to
>insert some data between them. This is particularly a concern when using
>stdio since buffering is imposed under the covers and thus what looks
>like a single fwrite/fprintf to the user may in fact result in multiple
>write() system calls.


Yes: if you want atomicity of a *set* of write() calls, rather than
each individual call, you either need locking of some kind, or
buffering to combine all the writes into a single call to write() or
writev(). Of course, this rules out using stdio unless you use a
modified version...

>In my app I use a combination of O_APPEND and fcntl locking and it seems
>to work fine (but my app only writes to local files).


If you're trying to append a couple of separate blocks of data
atomically, I'd go for writev() if possible. What's your app?


James.
James Antill

2004-07-18, 5:55 pm

On Sat, 17 Jul 2004 23:18:11 +0100, James wrote:

> Actually it doesn't necessarily mean that - lseek(SEEK_END) *could* be
> implemented as setting the O_APPEND flag, as the previous poster


No it couldn't ... read the definition again. If it was implemented this
way the return value would have to be defined differently.

> You're right this time round: when Alex Colvin said that
> lseek(SEEK_END)...write() would write "the end of the file at the time
> of the write, not at the time of the lseek", which you said was right,


Because it is right, if you do...

lseek(fd, 1024 * 1024, SEEK_END);
write(fd, "a", 1);

...then the "end of file" marker is only changed (and 1 MB hole created)
on disk at the time of the write, I assumed this is what Alex was thinking
about.
However, as I said, this doesn't mean that several of the above won't
clobber each other.
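
The point about the hole can be checked with fstat(): the size reported after the lseek() is unchanged, and only the write() moves the end of file. A sketch, with a made-up function name (seek_past_end_then_write) and out-parameters chosen for the example:

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek gap bytes past the end of file, then write one byte.
   *before receives the file size after the seek but before the write,
   *after receives the size once the write has completed. */
int seek_past_end_then_write(int fd, off_t gap, off_t *before, off_t *after)
{
    struct stat st;

    if (lseek(fd, gap, SEEK_END) == (off_t)-1)
        return -1;
    if (fstat(fd, &st) == -1)
        return -1;
    *before = st.st_size;   /* the seek alone has not grown the file */

    if (write(fd, "a", 1) != 1)
        return -1;
    if (fstat(fd, &st) == -1)
        return -1;
    *after = st.st_size;    /* old size + gap + 1 */
    return 0;
}
```

On an empty file with a gap of 1024 * 1024, before comes back as 0 and after as 1048577. None of this makes the seek-then-write pair atomic with respect to other writers.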

--
James Antill -- james@and.org
Need an efficient and powerful string library for C?
http://www.and.org/vstr/

James

2004-07-18, 5:55 pm

On Sun, 18 Jul 2004 11:10:48 -0400, James Antill
<james-netnews@and.org> wrote:

>On Sat, 17 Jul 2004 23:18:11 +0100, James wrote:
>
>
> No it couldn't ... read the definition again. If it was implemented this
>way the return value would have to be defined differently.


No, as I explained, the return value would still be the offset at the
time it is returned, precisely as documented. (Obviously, this is the
only possibility: lseek() cannot possibly be expected to return the
future offset instead!) This would not preclude the offset changing
subsequently, since of course lseek's return value can have no bearing
on future changes. If you're claiming that lseek()'s semantics somehow
preclude subsequent syscalls changing the offset, there's something
wrong.

>
> Because it is right, if you do...
>
> lseek(fd, 1024 * 1024, SEEK_END);
> write(fd, "a", 1);
>
>...then the "end of file" marker is only changed (and 1 MB hole created)
>on disk at the time of the write, I assumed this is what Alex was thinking
>about.


I took the statement that lseek(SEEK_END)...write() "writes the the
end of the file at the time of the write, not at the time
of the lseek. Several such writes will not clobber each other.
-- " to mean that the write() writes TO the end of the file, rather
than that the write() call CHANGES the end of the file. The former is
the only interpretation which fits with his subsequent sentence that
this prevents conflicts between multiple write() calls.

I think the most likely explanation is that AC had confused SEEK_END
with O_APPEND. This seems to be the only explanation which fits with
his final sentence, unless the "not" was added by mistake...

> However, as I said, this doesn't mean that several of the above won't
>clobber each other.


Which is also what I said, since you need O_APPEND rather than
lseek(SEEK_END) to get the semantics described earlier - but you'll
see Alex Colvin's original post specifically claimed "Several such
writes will not clobber each other."


James.