Unix Programming - can I ensure data integrity with mmap?


Henry Townsend

2006-08-15, 7:20 pm

I've RTFM-ed but cannot be sure I correctly understood so ... what I
want to do is simple. There's a set of files being written to. As soon
as each file is "done" it needs to be uploaded to a server. This has
begun to cause serious performance degradation because (a) the files may
be large, (b) the server can be slow and, most importantly, (c) files
must be uploaded *immediately* upon being "done" because another process
may overwrite them at any time.

So my new idea is to mmap() the file as soon as it's ready, then stack
up a bunch of them and deliver them all at a better time. The idea is
that by having a file descriptor open to each file and with the right
set of flags to mmap, I can ensure that my view of the data is fixed as
of open/mmap time even if another process writes to it. I cannot find
such a guarantee in the documentation but I think/hope I'm just confused
about nomenclature. Is there a way to use mmap to guarantee an
unchanging view of file contents? I have no plan to write to the file
myself (so O_RDONLY and PROT_READ are fine), I just need to be protected
from writes by others.

MAP_PRIVATE says I can write to my mapping without disturbing anyone
else's view, but how do I get the reverse scenario?
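For reference, POSIX leaves it unspecified whether changes made to the file after a MAP_PRIVATE mapping is established are visible through that mapping, so the only way to know what a given system does is to try it. The following is a minimal sketch of such an experiment; the function name and the scratch path passed to it are made up for illustration, and a single process stands in for the "other" writer.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write made to the file shows up in a
 * read-only MAP_PRIVATE mapping created earlier, 0 if it does not, and -1 on
 * error. POSIX allows either 0 or 1. */
int private_mapping_sees_write(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "old data", 8) != 8) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *view = mmap(NULL, 8, PROT_READ, MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Stand-in for another process overwriting the file. */
    int result = -1;
    if (pwrite(fd, "new data", 8, 0) == 8)
        result = (memcmp(view, "new data", 8) == 0);

    munmap(view, 8);
    close(fd);
    unlink(path);
    return result;
}
```

A result of 1 means the mapping is not a snapshot on that system; a result of 0 on one run is still not a guarantee, since nothing in the specification promises it.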

Thanks,
HT

Rich Teer

2006-08-15, 7:20 pm

On Tue, 15 Aug 2006, Henry Townsend wrote:

> I've RTFM-ed but cannot be sure I correctly understood so ... what I want to
> do is simple. There's a set of files being written to. As soon as each file is
> "done" it needs to be uploaded to a server. This has begun to cause serious
> performance degradation because (a) the files may be large, (b) the server can
> be slow and, most importantly, (c) files must be uploaded *immediately* upon
> being "done" because another process may overwrite them at any time.


Not sure if mmap will work as you want, but how about renaming the file, and
stack up a bunch of those for later uploading? Provided the renamed file is
on the same file system as the original one, the rename will be atomic.

HTH,

--
Rich Teer, SCNA, SCSA, OpenSolaris CAB member

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich

Logan Shaw

2006-08-15, 7:20 pm

Henry Townsend wrote:
> I've RTFM-ed but cannot be sure I correctly understood so ... what I
> want to do is simple. There's a set of files being written to. As soon
> as each file is "done" it needs to be uploaded to a server. This has
> begun to cause serious performance degradation because (a) the files may
> be large, (b) the server can be slow and, most importantly, (c) files
> must be uploaded *immediately* upon being "done" because another process
> may overwrite them at any time.


(c) sounds like your fundamental problem here, and I can't see how
mmap() or anything else is going to help you with a situation where
your file can be overwritten before you can get around to trying to
save it.

> So my new idea is to mmap() the file as soon as it's ready, then stack
> up a bunch of them and deliver them all at a better time. The idea is
> that by having a file descriptor open to each file and with the right
> set of flags to mmap, I can ensure that my view of the data is fixed as
> of open/mmap time even if another process writes to it. I cannot find
> such a guarantee in the documentation but I think/hope I'm just confused
> about nomenclature. Is there a way to use mmap to guarantee an
> unchanging view of file contents?


I can't imagine how there would be. The computer only has a finite
amount of storage, and everything must either be in RAM or on disk
(or both). Having two separate views of a file would, in the general
case, require having two separate copies: one that will remain the
same, and one that can be changed. And as far as I know, no Unix
operating system does this copying, either by copying up front or
by reserving the space for a copy and then doing a copy-on-write.
To do so would require potentially a huge commitment of resources
when you do a mmap(). For example, you could have a 500 MB file, then
mmap() it read-only, then in another process modify the file, then
mmap() it read-only again in yet another process, then repeat 1000
times. The system would have to reserve 500 GB of space to ensure
that the writes could succeed. Alternatively, I guess it could fail
writes even to pieces of files that already exist (which shouldn't
fail due to lack of disk space, normally, since they are overwriting
N bytes with another N bytes, and thus not using up extra space),
but that would probably cause problems.

So, one possible solution that I can think of is to consider
unlinking your files from the directory that contains them. If
the problem is that some software you can't control (like an FTP
server) looks in a certain directory and likes to overwrite things
in there, you could rename() your file into some other directory
that the software doesn't know about. Since rename() only changes
directories and doesn't actually touch the file, as long as the
other process hadn't opened it prior to it moving, that would do
the trick. And rename() should run pretty fast since it only
changes directory entries and doesn't have to copy the contents
of the file.
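A rough sketch of the rename() approach, assuming the holding directory sits on the same file system as the original file. The function name, the directory and the unique name are all hypothetical; generating a name that really is unique is left to the caller.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: move `path` to `holding_dir/unique_name`.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int stash_by_rename(const char *path, const char *holding_dir,
                    const char *unique_name)
{
    assert(path && holding_dir && unique_name);

    /* Create the holding directory if it is not there yet. */
    if (mkdir(holding_dir, 0700) != 0 && errno != EEXIST)
        return -1;

    char dest[4096];
    int n = snprintf(dest, sizeof dest, "%s/%s", holding_dir, unique_name);
    if (n < 0 || (size_t)n >= sizeof dest)
        return -1;                  /* destination path too long */

    /* Only directory entries change; the file contents are not copied.
     * Note that rename() silently replaces an existing `dest`. */
    return rename(path, dest);
}
```

As noted above, this does not help against a process that already has the file open: its descriptor follows the file to the new name.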

- Logan

Henry Townsend

2006-08-16, 1:24 am

Logan Shaw wrote:
> (c) sounds like your fundamental problem here, and I can't see how
> mmap() or anything else is going to help you with a situation where
> your file can be overwritten before you can get around to trying to
> save it.


Sorry, I had a brain cramp and confused the guarantee that a file will
remain available even if *unlinked* if a file descriptor is open to it,
with a guarantee that the data seen via that file descriptor would not
change if it was subsequently *written to*. But clearly that's wrong.
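The difference between the two guarantees can be seen in a small experiment. This is only a sketch: the function name and scratch path are invented, and one process plays both the reader and the writer.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a descriptor opened before unlink() can
 * still read the file AND sees data written afterwards through another
 * descriptor, 0 if it reads something else, -1 on error. */
int unlinked_fd_sees_later_write(const char *path)
{
    assert(path != NULL);

    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0)
        return -1;
    int rfd = open(path, O_RDONLY);
    if (rfd < 0) {
        close(wfd);
        unlink(path);
        return -1;
    }

    char buf[8] = {0};
    int result = -1;
    if (write(wfd, "version1", 8) == 8 &&
        unlink(path) == 0 &&                   /* name gone, data kept alive */
        pwrite(wfd, "version2", 8, 0) == 8 &&  /* "someone else" overwrites  */
        pread(rfd, buf, 8, 0) == 8)
        result = (memcmp(buf, "version2", 8) == 0);

    close(rfd);
    close(wfd);
    unlink(path);   /* harmless no-op if the name is already gone */
    return result;
}
```

The open descriptor keeps the file readable after the unlink, but what it reads is the current contents, not the contents as of open() time.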

> I can't imagine how there would be. The computer only has a finite
> amount of storage, and everything must either be in RAM or on disk
> (or both). Having two separate views of a file would, in the general
> case, require having two separate copies: one that will remain the
> same, and one that can be changed. And as far as I know, no Unix
> operating system does this copying, either by copying up front or
> by reserving the space for a copy and then doing a copy-on-write.
> To do so would require potentially a huge commitment of resources
> when you do a mmap(). For example, you could have a 500 MB file, then
> mmap() it read-only, then in another process modify the file, then
> mmap() it read-only again in yet another process, then repeat 1000
> times. The system would have to reserve 500 GB of space to ensure
> that the writes could succeed.


I'm sure you're right but not so sure I see your reasoning here. No
system can protect against attempts to use up capacity. By the same
token I could write a program to copy a 500 MB file over and over until
disk space is used up. Or, closer to the point, I could have a program
that mmaps a file MAP_PRIVATE, modifies it, then forks and repeats. The
mere fact that a resource is limited shouldn't make it illegal to use
it; you just throw an error when it's all gone.

Basically, I still don't understand why it's OK for process A to have a
private copy (MAP_PRIVATE) as long as process A makes the modifications,
but not OK when process B does so. There's two copies in play either
way, no?
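The half of this that is guaranteed, process A's own writes staying private, can be sketched as below. The function name and scratch path are invented for illustration. The copy is made only when the mapping process itself writes to a page; a write by some other process to the file does not trigger any copy on A's behalf, which is why the reverse direction is not covered.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a store into a MAP_PRIVATE mapping left
 * the underlying file untouched, 1 if the file changed, -1 on error. */
int private_write_reaches_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "original", 8) != 8) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *view = mmap(NULL, 8, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* This store goes to a private copy of the page, not to the file. */
    memcpy(view, "scribble", 8);

    char buf[8];
    int result = -1;
    if (pread(fd, buf, 8, 0) == 8)
        result = (memcmp(buf, "original", 8) != 0);

    munmap(view, 8);
    close(fd);
    unlink(path);
    return result;
}
```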

> So, one possible solution that I can think of is to consider
> unlinking your files from the directory that contains them. If
> the problem is that some software you can't control (like an FTP
> server) looks in a certain directory and likes to overwrite things
> in there, you could rename() your file into some other directory
> that the software doesn't know about. Since rename() only changes
> directories and doesn't actually touch the file, as long as the
> other process hadn't opened it prior to it moving, that would do
> the trick. And rename() should run pretty fast since it only
> changes directory entries and doesn't have to copy the contents
> of the file.


The thing I didn't make sufficiently clear is that these are not temp
files; they have a legitimate right to exist under the name of their
choice. They're not "my" files; I simply need to upload copies of them
as of a particular time. And if a second process comes along and writes
to a file, that's again presumed legitimate. My program's job is not to
judge but only to record. So the classic FTP/rename hack is not an option.

Back to the drawing board...

Thanks,
HT

David Schwartz

2006-08-16, 1:24 am


Logan Shaw wrote:

> I can't imagine how there would be. The computer only has a finite
> amount of storage, and everything must either be in RAM or on disk
> (or both). Having two separate views of a file would, in the general
> case, require having two separate copies: one that will remain the
> same, and one that can be changed. And as far as I know, no Unix
> operating system does this copying, either by copying up front or
> by reserving the space for a copy and then doing a copy-on-write.


Nonsense. Copy-on-write is the canonical way to implement private mmaps
of files.

> To do so would require potentially a huge commitment of resources
> when you do a mmap().


Right, same as when a large process forks.

> For example, you could have a 500 MB file, then
> mmap() it read-only, then in another process modify the file, then
> mmap() it read-only again in yet another process, then repeat 1000
> times. The system would have to reserve 500 GB of space to ensure
> that the writes could succeed. Alternatively, I guess it could fail
> writes even to pieces of files that already exist (which shouldn't
> fail due to lack of disk space, normally, since they are overwriting
> N bytes with another N bytes, and thus not using up extra space),
> but that would probably cause problems.


Most modern operating systems provide you with some way to configure
what level of overcommitment to allow.

A private writable mmap of a file doesn't take any more memory than a
private anonymous mmap of the same size. Both can be backed up by swap.

DS

James Antill

2006-08-16, 1:24 am

On Tue, 15 Aug 2006 19:41:43 -0400, Henry Townsend wrote:

> I've RTFM-ed but cannot be sure I correctly understood so ... what I
> want to do is simple. There's a set of files being written to. As soon
> as each file is "done" it needs to be uploaded to a server. This has
> begun to cause serious performance degradation because (a) the files may
> be large, (b) the server can be slow and, most importantly, (c) files
> must be uploaded *immediately* upon being "done" because another process
> may overwrite them at any time.
>
> So my new idea is to mmap() the file as soon as it's ready, then stack
> up a bunch of them and deliver them all at a better time. The idea is
> that by having a file descriptor open to each file and with the right
> set of flags to mmap, I can ensure that my view of the data is fixed as
> of open/mmap time even if another process writes to it. I cannot find
> such a guarantee in the documentation but I think/hope I'm just confused
> about nomenclature. Is there a way to use mmap to guarantee an
> unchanging view of file contents? I have no plan to write to the file
> myself (so O_RDONLY and PROT_READ are fine), I just need to be protected
> from writes by others.
>
> MAP_PRIVATE says I can write to my mapping without disturbing anyone
> else's view, but how do I get the reverse scenario?


What you want is the MAP_COPY flag ... your problem is that almost no one
implements it (hurd does), and Linus has gone on record saying he'll never
allow it because it's "stupid"[1].

On the upside, what you want is almost certainly much easier to do by
"fixing" your requirements. Drop mmap altogether, and either copy the
files somewhere (so the data isn't overwritten) or just move them ... then
sendfile + unlink later. Or fix the other processes that overwrite data
that is in use, so they don't do that.
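A bare-bones version of the "copy the files somewhere" option might look like the sketch below. The function name and paths are hypothetical, and a plain read/write loop is used instead of sendfile, whose behaviour differs between systems. Note that the copy is not atomic with respect to a concurrent writer; it only shortens the window compared with a slow upload.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy `src` to a new file `dst`.
 * Returns 0 on success, -1 on failure. Refuses to overwrite `dst`. */
int snapshot_copy(const char *src, const char *dst)
{
    assert(src && dst);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }

    char buf[65536];
    ssize_t n;
    int rc = 0;
    while (rc == 0 && (n = read(in, buf, sizeof buf)) > 0) {
        char *p = buf;
        while (n > 0) {                 /* write() may be partial */
            ssize_t w = write(out, p, (size_t)n);
            if (w < 0) {
                rc = -1;
                break;
            }
            p += w;
            n -= w;
        }
    }
    if (rc == 0 && n < 0)
        rc = -1;                        /* read() failed */

    close(in);
    if (close(out) != 0)
        rc = -1;
    if (rc != 0)
        unlink(dst);                    /* do not leave a partial copy */
    return rc;
}
```

The copy in the holding area can then be uploaded and unlinked whenever the server is ready.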


[1] http://www.ussg.iu.edu/hypermail/li...110.1/1506.html

--
James Antill -- james@and.org
http://www.and.org/and-httpd

David Schwartz

2006-08-16, 7:30 am


David Schwartz wrote:

> Nonsense. Copy-on-write is the canonical way to implement private mmaps
> of files.


This was a misleading statement in the context in which I said it.
While private mmaps are implemented as copy-on-write, a page that is
not resident cannot be copy-on-write.

DS

Nils O. Selåsdal

2006-08-16, 7:30 am

Henry Townsend wrote:
> I've RTFM-ed but cannot be sure I correctly understood so ... what I
> want to do is simple. There's a set of files being written to. As soon
> as each file is "done" it needs to be uploaded to a server. This has
> begun to cause serious performance degradation because (a) the files may
> be large, (b) the server can be slow and, most importantly, (c) files
> must be uploaded *immediately* upon being "done" because another process
> may overwrite them at any time.


How about just renaming the file to a unique name when it is done.
link("currentname","outgoing/someuniquename");
unlink("currentname");
Later on you send the files in outgoing/.
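With error checking added, the two calls above could be wrapped roughly as follows. This is a sketch; the function name is made up and the destination name is assumed to be unique and on the same file system. Unlike rename(), link() fails with EEXIST rather than replacing an existing destination, so a name clash cannot clobber a file that is still waiting to be sent.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: give the file a second name, then drop the first.
 * Returns 0 on success, -1 on failure. */
int stash_by_link(const char *current_name, const char *unique_dest)
{
    assert(current_name && unique_dest);

    if (link(current_name, unique_dest) != 0) {
        perror("link");     /* e.g. EEXIST, or EXDEV across file systems */
        return -1;
    }
    if (unlink(current_name) != 0) {
        perror("unlink");   /* file is now reachable under both names */
        return -1;
    }
    return 0;
}
```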

You need some way to ensure that "currentname" isn't
overwritten until you get around to doing the link/unlink.

It's unclear how your system works - who indicates to whom
that the file is done being written to?

You might be able to do the above link/unlink before the file
is finished, provided both hold the file open for the duration
of the entire operation (and don't try to reopen it using its
original name, thinking it's still the same).

Barry Margolin

2006-08-16, 7:30 am

In article <1155716600.991534.221700@m79g2000cwm.googlegroups.com>,
"David Schwartz" <davids@webmaster.com> wrote:

> David Schwartz wrote:
>
>
> This was a misleading statement in the context in which I said it.
> While private mmaps are implemented as copy-on-write, a page that is
> not resident cannot be copy-on-write.


Sure it can -- COW applies to virtual memory, not physical memory.

However, as someone else pointed out, almost no implementations support
the MAP_COPY option, which is necessary to get this type of COW for
mapped files.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Ulrich Eckhardt

2006-08-17, 1:28 am

Henry Townsend wrote:
> There's a set of files being written to. As soon
> as each file is "done" it needs to be uploaded to a server. This has
> begun to cause serious performance degradation because (a) the files may
> be large, (b) the server can be slow and, most importantly, (c) files
> must be uploaded *immediately* upon being "done" because another process
> may overwrite them at any time.


It's not a direct answer to what you are asking, but I would seriously
question your design. In fact the case you describe rather seems a case
for a database instead of manual file handling - after all, DBs are
designed to guarantee data integrity and allow reasonably fast editing of
single records in large datasets.

If you need it quick'n'dirty, you could also use Subversion (which is a
version control system, more or less a DB specialised for files). Using
it, each remote host has a "working copy", makes modifications there and
atomically (SVN guarantees that) uploads ("commit") them to the server. In
addition it already includes binary diff algorithms that preserve
bandwidth. On the server, you install a "post-commit hook" that then
handles the case whenever some change was uploaded.

Uli

--
http://www.erlenstar.demon.co.uk/unix/

Henry Townsend

2006-08-17, 1:19 pm

Ulrich Eckhardt wrote:
> Henry Townsend wrote:
>
> It's not a direct answer to what you are asking, but I would seriously
> question your design.


For the record, I don't think "my design" is in the picture here. My
product is a library which needs to track activities by programs which
link to the library. I have absolutely no control over how files are
created nor where they are placed, any more than a census worker gets to
tell people where to live. This is why I used the passive voice above. I
have control over the uploading algorithm but that's it.

Thank you all for clarifying my understanding of mmap. I continue to
look for alternatives but may need to look for optimizations elsewhere
(server side).

Thanks,
HT

Ulrich Eckhardt

2006-08-17, 1:19 pm

Henry Townsend wrote:
> Ulrich Eckhardt wrote:
>
> For the record, I don't think "my design" is in the picture here. My
> product is a library which needs to track activities by programs which
> link to the library. I have absolutely no control over how files are
> created nor where they are placed, any more than a census worker gets to
> tell people where to live. This is why I used the passive voice above. I
> have control over the uploading algorithm but that's it.


I didn't want to sound offensive, though on rereading it does sound a bit
arrogant - sorry for that. My impression was that you only needed some
data synced between computers, but since your requirements say you need to
use files there is little an alternative solution can do.

sorry

Uli

--
http://www.erlenstar.demon.co.uk/unix/