|
Home > Archive > Unix Programming > May 2004 > Design question for memory access by processes
| Author |
Design question for memory access by processes
|
|
|
| I have 4 C programs that I have written that access reference files
very frequently. These reference files essentially are ordered records
in a simple ascii structure. The access method is via binary search.
Some of these reference files are common to all 4 programs and some
are of the order of 100 MB. Assuming I have enough memory, what are my
options in generating a Unix style "parent" process which places all
these common reference files into memory so that the heavy I/O to the
reference files is replaced by heavy IPC transactions to reference
data in memory?
The scenario is that (pre-processed) data comes into the system
continuously and has to go through the 4 C programs for processing and
interact with the reference data. So instead of having to malloc/free
a large chunk of reference data each time (pre-processed) data
comes into the system, I can place that reference data into memory and let
the incoming (pre-processed) data be processed with access to the large
reference data in memory.
I am not looking for complete answers, just hints to options of
subject areas I might want to look at.
Thanks in advance
Mike
| |
| Artie Gold 2004-05-11, 5:40 pm |
| Mike wrote:
> I have 4 C programs that I have written that access reference files
> very frequently. These reference files essentially are ordered records
> in a simple ascii structure. The access method is via binary search.
>
> Some of these reference files are common to all 4 programs and some
> are of the order of 100 MB. Assuming I have enough memory, what are my
> options in generating a Unix style "parent" process which places all
> these common reference files into memory so that the heavy I/O to the
> reference files is replaced by heavy IPC transactions to reference
> data in memory?
>
> The scenario is that (pre-processed) data comes into the system
> continuously and has to go through the 4 C programs for processing and
> interact with the reference data. So instead of having to malloc/free
> a large chunk of reference data for each time (pre-processed) data
> comes into the system, I can place that reference data into memory let
> the incoming (pre-processed) data process with access to the large
> reference data in memory.
>
> I am not looking for complete answers, just hints to options of
> subject areas I might want to look at.
>
> Thanks in advance
>
> Mike
If you mmap() the files with read-only access, the fact that you're
doing it in each process should introduce no additional overhead -- and
would be a *much* simpler solution than trying to do anything fancy with
multiple processes and IPC.
Of course, you should check how that works on your particular
implementation.
HTH,
--ag
--
Artie Gold -- Austin, Texas
| |
| Rich Gibbs 2004-05-11, 5:40 pm |
| Mike said the following, on 05/10/04 23:14:
> I have 4 C programs that I have written that access reference files
> very frequently. These reference files essentially are ordered records
> in a simple ascii structure. The access method is via binary search.
>
> Some of these reference files are common to all 4 programs and some
> are of the order of 100 MB. Assuming I have enough memory, what are my
> options in generating a Unix style "parent" process which places all
> these common reference files into memory so that the heavy I/O to the
> reference files is replaced by heavy IPC transactions to reference
> data in memory?
>
> The scenario is that (pre-processed) data comes into the system
> continuously and has to go through the 4 C programs for processing and
> interact with the reference data. So instead of having to malloc/free
> a large chunk of reference data for each time (pre-processed) data
> comes into the system, I can place that reference data into memory let
> the incoming (pre-processed) data process with access to the large
> reference data in memory.
>
> I am not looking for complete answers, just hints to options of
> subject areas I might want to look at.
>
Have a look at mmap(2). This system call allows you to 'map' a file, or
portion of a file, into the process's virtual (memory) address space.
If you have enough physical memory, then of course the data will be
resident there. Even if not all the data will fit into physical
memory, the program will still work as long as there is enough virtual
memory (i.e., swap space). The option flags to mmap give you a fair
degree of control over how this all works.
If the data is read-only, then only one copy of it is needed regardless
of how many processes are active at any given time. The kernel takes
care of all the bookkeeping, and if any I/O is required, it will be done
through the demand-paging mechanism, which is presumably at least as
efficient as application-level I/O. And your programs will be much
simpler than they would be if you try to synch it all yourself using IPC.
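A minimal sketch of the read-only mapping described above, assuming a POSIX system. The helper name map_readonly() is hypothetical; open(), fstat(), mmap() and close() are the standard calls. Each of the four programs could do this independently on the same file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map an entire file read-only.
 * On success, returns the base address and stores the size in *len_out.
 * On failure, returns NULL.
 * map_readonly() is a hypothetical helper name, not a library function. */
const char *map_readonly(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        /* mmap() rejects a zero-length mapping, so treat empty as failure. */
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    *len_out = (size_t)st.st_size;
    return p;
}
```

The caller releases the mapping with munmap() when it is done; nothing is copied into the process at map time, pages are brought in as they are touched.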
--
Rich Gibbs
rgibbs@his.com
| |
| Chuck Dillon 2004-05-11, 5:40 pm |
| Mike wrote:
> I have 4 C programs that I have written that access reference files
> very frequently. These reference files essentially are ordered records
> in a simple ascii structure. The access method is via binary search.
>
> Some of these reference files are common to all 4 programs and some
> are of the order of 100 MB. Assuming I have enough memory, what are my
> options in generating a Unix style "parent" process which places all
> these common reference files into memory so that the heavy I/O to the
> reference files is replaced by heavy IPC transactions to reference
> data in memory?
In addition to the earlier references to mmap() you could use shared
memory (which on some systems is implemented via mmap). See the man
pages for shmget, shmctl, shmat ...
With shared memory you can use an initialization process to load the
data into persistent shared memory then the processing programs can
simply attach (shmat) to that shared memory to use the data. By
persistent I mean the shared memory remains after the creator process
exits, until some process removes it or the system reboots.
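A rough sketch of that loader/worker split using the System V calls named above. The function names load_segment() and attach_segment() and the 0644 permissions are assumptions for illustration; how the loader and workers agree on a key (for example via ftok()) is left out.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Loader side: create a segment for `key` and copy `len` bytes into it.
 * Returns the segment id, or -1 on failure. The segment outlives the
 * calling process until it is removed with shmctl(IPC_RMID) or the
 * system reboots. (Hypothetical helper name.) */
int load_segment(key_t key, const void *data, size_t len)
{
    int id = shmget(key, len, IPC_CREAT | IPC_EXCL | 0644);
    if (id == -1) {
        perror("shmget");
        return -1;
    }

    void *p = shmat(id, NULL, 0);
    if (p == (void *)-1) {
        perror("shmat");
        shmctl(id, IPC_RMID, NULL);  /* do not leave a half-built segment */
        return -1;
    }

    memcpy(p, data, len);  /* this is the extra copy mmap() would avoid */
    shmdt(p);
    return id;
}

/* Worker side: find the existing segment and attach it read-only.
 * Returns the attached address, or NULL on failure. */
const void *attach_segment(key_t key)
{
    int id = shmget(key, 0, 0);
    if (id == -1)
        return NULL;

    void *p = shmat(id, NULL, SHM_RDONLY);
    return p == (void *)-1 ? NULL : p;
}
```

A worker detaches with shmdt() when finished; the data stays in place for the next run until some process removes the segment.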
HTH,
-- ced
>
> The scenario is that (pre-processed) data comes into the system
> continuously and has to go through the 4 C programs for processing and
> interact with the reference data. So instead of having to malloc/free
> a large chunk of reference data for each time (pre-processed) data
> comes into the system, I can place that reference data into memory let
> the incoming (pre-processed) data process with access to the large
> reference data in memory.
>
> I am not looking for complete answers, just hints to options of
> subject areas I might want to look at.
>
> Thanks in advance
>
> Mike
--
Chuck Dillon
Senior Software Engineer
NimbleGen Systems Inc.
| |
| Barry Margolin 2004-05-11, 5:40 pm |
| In article <c7ql3b$t8q$1@grandcanyon.binc.net>,
Chuck Dillon <spam@nimblegen.com> wrote:
> Mike wrote:
>
>
> In addition to the earlier references to mmap() you could use shared
> memory (which on some systems is implemented via mmap). See the man
> pages for shmget, shmctl, shmat ...
>
> With shared memory you can use an initialization process to load the
> data into persistent shared memory then the processing programs can
> simply attach (shmat) to that shared memory to use the data. By
> persistent I mean the shared memory remains after the creator process
> exits, until some process removes it or the system reboots.
Since the data is coming from a file, it's usually preferable to use
mmap() rather than shared memory. The latter will require a needless
copy and wastes swap space.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
|
| Rich Gibbs <rgibbs@REMOVEhis.com> wrote in message news:<40a0527a@news101.his.com>...
> Mike said the following, on 05/10/04 23:14:
>
> Have a look at mmap(2). This system call allows you to 'map' a file, or
> portion of a file, into the process's virtual (memory) address space.
> If you have enough physical memory, then of course the data will be
> resident there. Even if not all the data will fit into physical
> memory, the program will still work as long as there is enough virtual
> memory (i.e., swap space). The option flags to mmap give you a fair
> degree of control over how this all works.
>
> If the data is read-only, then only one copy of it is needed regardless
> of how many processes are active at any given time. The kernel takes
> care of all the bookkeeping, and if any I/O is required, it will be done
> through the demand-paging mechanism, which is presumably at least as
> efficient as application-level I/O. And your programs will be much
> simpler than they would be if you try to synch it all yourself using IPC.
Thanks. Maybe I am reading this wrong but I have a question on the
following. The reference files remain static indefinitely, so let's
assume I mmap them with their file descriptors and this Unix program
is running in the background and all it is doing is keeping these
reference files in memory/swap.
Are you saying at any time I can run another unrelated program that
communicates with the file descriptors for these memory mapped
reference files? If so, how do I reference this memory mapped data in
another unrelated process?
I have seen examples of mmaping of data for a process, but not for
processes that are unrelated and run independently. I must be missing
something simple, but I can't figure it out. Any help is appreciated.
Mike
| |
| Barry Margolin 2004-05-17, 6:35 pm |
| In article <80214b41.0405171123.45531fe0@posting.google.com>,
mikesta@hotmail.com (Mike) wrote:
> Thanks. Maybe I am reading this wrong but I have a question on the
> following. The reference files remain static indefinitely, so let's
> assume I mmap them with their file descriptors and this Unix program
> is running in the background and all it is doing is keeping these
> reference files in memory/swap.
> Are you saying at any time I can run another unrelated program that
> communicates with the file descriptors for these memory mapped
> reference files? If so, how do I reference this memory mapped data in
> another unrelated process?
Any process can open the file and map it into their memory. When you
call mmap(), it makes the contents of the file appear to be part of your
process's memory. And all the processes that map the same file share
that memory (unless they use the MAP_PRIVATE option), so if one process
makes a change it will be seen by all the others (this doesn't seem
relevant to you, since you said the file remains static).
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Rich Gibbs 2004-05-17, 7:34 pm |
| Mike said the following, on 05/17/04 15:23:
> Rich Gibbs <rgibbs@REMOVEhis.com> wrote in message news:<40a0527a@news101.his.com>...
>
[snip]
>
>
> Thanks. Maybe I am reading this wrong but I have a question on the
> following. The reference files remain static indefinitely, so let's
> assume I mmap them with their file descriptors and this Unix program
> is running in the background and all it is doing is keeping these
> reference files in memory/swap.
> Are you saying at any time I can run another unrelated program that
> communicates with the file descriptors for these memory mapped
> reference files? If so, how do I reference this memory mapped data in
> another unrelated process?
>
> I have seen examples of mmaping of data for a process, but not for
> processes that are unrelated and run independently. I must be missing
> something simple, but I can't figure it out. Any help is appreciated.
>
You just haven't seen the right examples. ;-)
More than one process can 'mmap' a file. That was what I was alluding
to (but not sufficiently clearly -- sorry!) when I said that "the kernel
takes care of all the bookkeeping." Write your 4 (?) programs to mmap
the files they use. Assuming they are, as you've said, read only in
ordinary use, let the kernel figure out how to keep them in memory most
effectively. (Note that it does this already with respect to executable
images, when you run more than one instance of an application.)
(Also note that this does NOT mean that the file will be mapped at the
same virtual address in every process using it -- but that doesn't matter.)
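Once a file is mapped, the binary search runs directly on the mapped bytes. A minimal sketch, assuming fixed-width records that each begin with a fixed-width key and are sorted in memcmp() order; the thread does not say what the record layout actually is, and find_record() is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Binary search over `nrec` fixed-width records starting at `base`.
 * Assumes each record is `rec_len` bytes, begins with a `key_len`-byte
 * key, and that records are sorted ascending in memcmp() order of that
 * key. Returns a pointer to the matching record, or NULL if absent. */
const char *find_record(const char *base, size_t nrec, size_t rec_len,
                        const char *key, size_t key_len)
{
    size_t lo = 0, hi = nrec;  /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const char *rec = base + mid * rec_len;
        int c = memcmp(key, rec, key_len);

        if (c == 0)
            return rec;
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}
```

Because the function only works with offsets from `base`, it does not matter that each process sees the mapping at a different virtual address.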
--
Rich Gibbs
rgibbs AT alumni DOT princeton DOT edu
| |
| Chuck Dillon 2004-05-18, 12:42 pm |
| Barry Margolin wrote:
>
> Since the data is coming from a file, it's usually preferable to use
> mmap() rather than shared memory. The latter will require a needless
> copy and wastes swap space.
>
I'll take your word on "usually"; you are probably right. But it's up
to the OP to decide if the copy is a needless waste in the context of
his design. As I understand it, the reference data will not be accessed
sequentially (a binary search was mentioned). So if the incoming data
stream essentially accesses the reference data randomly, the I/O
overhead of even mmap() may cost the OP more.
-- ced
--
Chuck Dillon
Senior Software Engineer
NimbleGen Systems Inc.
| |
| Barry Margolin 2004-05-18, 2:50 pm |
| In article <c8d686$rhc$1@grandcanyon.binc.net>,
Chuck Dillon <spam@nimblegen.com> wrote:
> Barry Margolin wrote:
>
> I'll take your word on "usually", you are probably right. But it's up
> to the OP to decide if the copy is a needless waste in the context of
> his design. As I understand it the reference data will not be accessed
> sequentially (a binary search was mentioned). So if the incoming data
> stream essentially accesses the reference data randomly the I/O
> overhead of even mmap() may cost the OP more.
Whether you use read() or mmap(), the disk accesses will have to take
place. But if you read the entire file, you have the following types of
waste:
1. The disk blocks are first read into kernel buffers, and then copied
into the user memory.
2. The entire file is read, even if the applications only need to access
parts of it.
3. The shared memory is paged out to swap space rather than using the
original file as backing store. This means that swap space is wasted,
and also some writes have to be done even though the data was never
modified.
I suppose it's possible that there are scenarios where reading the
entire file into a shared memory segment would have an advantage, but I
think it would have to be a very special situation.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Chuck Dillon 2004-05-18, 4:47 pm |
| Barry Margolin wrote:
> In article <c8d686$rhc$1@grandcanyon.binc.net>,
> Chuck Dillon <spam@nimblegen.com> wrote:
>
>
>
>
> Whether you use read() or mmap(), the disk accesses will have to take
> place. But if you read the entire file, you have the following types of
> waste:
The original post describes a case where there is a continuous high
throughput stream of data that contains references to data in a set of
reference files. The implication is that one could expect that all of
the reference data will be referenced many times as the data streams
through.
The original post explicitly addresses I/O as a limiting factor and
suggests that loading the reference data into memory is a viable thing
to do.
If he loads the reference data into shared memory he reads the data
exactly once. This doesn't waste I/O cycles; it saves them. If the
system has sufficient physical memory he can avoid/minimize paging.
This is a scenario where shared memory can be a better solution than
mmap. I don't know that it is in this case but it's certainly worth
consideration, particularly if an mmap solution doesn't provide the
performance improvement he's looking for.
Also, the OP could consider a hybrid approach where he generates
indexes into the reference data and loads them into shared memory and
uses mmap on the bulk of the reference data. By loading the indexes
into memory they can be shared and the binary searches occur in memory.
That is if there are relatively small keys that can be indexed in the
reference data.
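One way the index half of that hybrid might look, as a sketch only. It assumes newline-terminated records that each start with a 4-byte key and are already sorted by that key; KEY_LEN, struct idx_entry, build_index() and lookup_offset() are all hypothetical names. The array is built with malloc() here for brevity; in the scheme described it would be placed in a shared memory segment instead.

```c
#include <stdlib.h>
#include <string.h>

#define KEY_LEN 4  /* assumed key width, for illustration only */

struct idx_entry {
    char   key[KEY_LEN];
    size_t offset;  /* byte offset of the record in the reference data */
};

/* Scan newline-terminated records and build one index entry per record.
 * Returns a malloc()ed array (caller frees) and stores the count in
 * *n_out, or returns NULL on allocation failure. */
struct idx_entry *build_index(const char *data, size_t len, size_t *n_out)
{
    size_t cap = 16, n = 0, start = 0;
    struct idx_entry *idx = malloc(cap * sizeof *idx);
    if (idx == NULL)
        return NULL;

    while (start + KEY_LEN <= len) {
        if (n == cap) {
            cap *= 2;
            struct idx_entry *tmp = realloc(idx, cap * sizeof *idx);
            if (tmp == NULL) {
                free(idx);
                return NULL;
            }
            idx = tmp;
        }
        memcpy(idx[n].key, data + start, KEY_LEN);
        idx[n].offset = start;
        n++;

        const char *nl = memchr(data + start, '\n', len - start);
        if (nl == NULL)
            break;  /* last record has no trailing newline */
        start = (size_t)(nl - data) + 1;
    }

    *n_out = n;
    return idx;
}

/* bsearch() comparator: first argument is the key, second an entry. */
static int cmp_key(const void *key, const void *entry)
{
    return memcmp(key, ((const struct idx_entry *)entry)->key, KEY_LEN);
}

/* Returns the record's offset in the reference data, or (size_t)-1. */
size_t lookup_offset(const struct idx_entry *idx, size_t n, const char *key)
{
    const struct idx_entry *e = bsearch(key, idx, n, sizeof *idx, cmp_key);
    return e ? e->offset : (size_t)-1;
}
```

The binary search then touches only the compact index, and the returned offset is added to the base of the mmap'd reference file to reach the full record.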
-- ced
>
> 1. The disk blocks are first read into kernel buffers, and then copied
> into the user memory.
>
> 2. The entire file is read, even if the applications only need to access
> parts of it.
>
> 3. The shared memory is paged out to swap space rather than using the
> original file as backing store. This means that swap space is wasted,
> and also some writes have to be done even though the data was never
> modified.
>
> I suppose it's possible that there are scenarios where reading the
> entire file into a shared memory segment would have an advantage, but I
> think it would have to be a very special situation.
>
--
Chuck Dillon
Senior Software Engineer
NimbleGen Systems Inc.
| |
| Barry Margolin 2004-05-18, 5:39 pm |
| In article <c8dmud$vr0$1@grandcanyon.binc.net>,
Chuck Dillon <spam@nimblegen.com> wrote:
> The original post describes a case where there is a continuous high
> throughput stream of data that contains references to data in a set of
> reference files. The implication is that one could expect that all of
> the reference data will be referenced many times as the data streams
> through.
>
> The original post explicitly addresses I/O as a limiting factor and
> suggests that loading the reference data into memory is a viable thing
> to do.
>
> If he loads the reference data into shared memory he reads the data
> exactly once. This doesn't waste I/O cycles; it saves them. If the
> system has sufficient physical memory he can avoid/minimize paging.
I don't see the distinction you're making. If he mmaps the file, he
also reads each block of the data exactly once -- the first time a
particular page of the reference file is accessed. And any pages that
aren't referenced will not be read at all, so there's a net savings.
I don't think there's any way that mmap() can do *more* disk accesses
than reading the entire file into memory. However, it's possible that
there can be some benefit to your scheme by shifting when the delays
occur -- perhaps a long pause at the beginning to load the data into
memory is better than occasional delays when an as-yet-unreferenced page
is being brought into memory for the first time. Of course, this
assumes that there's enough RAM that the reference file will never get
paged out, which can be a *big* if when there's lots of other data
streaming through the application. If it doesn't all stay in RAM, then
the LRU portions of the reference file will get paged out to the swap
partition, and they'll have to be paged back in just as they would have
to be read back from the reference file.
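If the up-front pause is preferred with mmap() as well, a process can touch every page of the mapping at startup. A minimal sketch, assuming the region is already mapped; prefault() is a hypothetical name, and the 4096 fallback is a guess used only if sysconf() fails.

```c
#include <stddef.h>
#include <unistd.h>

/* Read one byte from every page of [base, base + len) so that the page
 * faults happen now rather than during later lookups. Returns the sum of
 * the bytes read; the volatile pointer keeps the compiler from dropping
 * the loads. */
unsigned long prefault(const volatile unsigned char *base, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;  /* fallback guess, not a guaranteed page size */

    unsigned long sum = 0;
    for (size_t off = 0; off < len; off += (size_t)page)
        sum += base[off];
    return sum;
}
```

This only moves the delay; it does not keep the pages resident afterwards. posix_madvise() with POSIX_MADV_WILLNEED is an advisory alternative on systems that provide it.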
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
|
| Barry Margolin <barmar@alum.mit.edu> wrote in message news:<barmar-DE9468.14240818052004@comcast.dca.giganews.com>...
> In article <c8d686$rhc$1@grandcanyon.binc.net>,
> Chuck Dillon <spam@nimblegen.com> wrote:
>
>
> Whether you use read() or mmap(), the disk accesses will have to take
> place. But if you read the entire file, you have the following types of
> waste:
>
> 1. The disk blocks are first read into kernel buffers, and then copied
> into the user memory.
>
> 2. The entire file is read, even if the applications only need to access
> parts of it.
>
> 3. The shared memory is paged out to swap space rather than using the
> original file as backing store. This means that swap space is wasted,
> and also some writes have to be done even though the data was never
> modified.
>
> I suppose it's possible that there are scenarios where reading the
> entire file into a shared memory segment would have an advantage, but I
> think it would have to be a very special situation.
Thanks for the info, all. It has made me have more questions than
answers, which is a good thing. My original idea was to place the
static reference data in memory/swap and leave it there indefinitely.
The C/UNIX programs would be run every x mins and would access the
data and do processing against it. My thought was that these C/UNIX
programs would not access I/O to get to the reference data but access
the reference data that was already pulled into memory at a startup
point, i.e. via some pointers. I may have things totally wacked in
terms of my thinking, but I only wanted to do the I/O once and just
reference it with programs that were run every x mins. The rationale
for this is that I will be receiving data to be processed every x mins
and will need to compare against the reference data that is static
for long periods of time.
Are you saying that I have to do I/O every time I run my C/UNIX
programs that do binary searches on the reference data? If so, is
there another way to do the I/O once and be done with it? I was hoping
not to do I/O on all the reference data every time I ran my
processing programs, given these reference data files are in the 100's
of MBs. That is, I was hoping to bring all reference data into memory
just once and run programs to read that data at frequent intervals.
Sorry if I have missed the point, but just checking I am on the right
track.
Thanks
Mike
| |
| Barry Margolin 2004-05-19, 2:36 am |
| In article <80214b41.0405181737.b8c0332@posting.google.com>,
mikesta@hotmail.com (Mike) wrote:
> Thanks for the info, all. It has made me have more questions than
> answers, which is a good thing. My original idea was to place the
> static reference data in memory/swap and leave it there indefinitely.
> The C/UNIX programs would be run every x mins and would access the
> data and do processing against it. My thought was that these C/UNIX
> programs would not access I/O to get to the reference data but access
> the reference data that was already pulled into memory at a startup
> point, i.e. via some pointers. I may have things totally wacked in
> terms of my thinking, but I only wanted to do the I/O once and just
> reference it with programs that were run every x mins. The rationale
> for this is that I will be receiving data to be processed every x mins
> and will need to compare against the reference data that is static
> (for long periods) of time.
>
> Are you saying that I have to do I/O every time I run my C/UNIX
> programs that do binary searches on the reference data? If so, is
> there another way to do I/O once and be done with that. I was hoping
> to not do I/O of all the reference data at every time I ran my
> processing programs, given these reference data files are in the 100's
> of MBs. That is, I was hoping to bring all reference data into memory
> just once and run programs to read that data at frequent intervals.
> Sorry if I have missed the point, but just checking I am on the right
> track.
The memory that application processes see is *virtual* memory, not real
memory. The OS automatically moves it back and forth from RAM to disk
as needed. This I/O is not programmed explicitly in the application, it
happens behind the scenes without you being able to tell (unless you
monitor execution time very closely). Normally, recently-used portions
of virtual memory will be in real memory, while portions that have not
been accessed in a while will have migrated to disk to make room for
other things (unless you have enough RAM that it doesn't fill up).
One exception is that privileged processes can "lock" parts of their
virtual memory into RAM if they want, by using the mlock() system
call.
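The POSIX call that pins pages in RAM is mlock(); mprotect() only changes access permissions. A minimal sketch, with pin_region() as a hypothetical wrapper name. Whether an unprivileged process may lock anything, and how much, depends on the system (on many systems a resource limit such as RLIMIT_MEMLOCK applies), so the call should be expected to fail sometimes.

```c
#include <stdio.h>
#include <sys/mman.h>

/* Try to pin [addr, addr + len) in physical memory.
 * Returns 0 on success, -1 on failure. Some systems require `addr` to be
 * page-aligned; an address returned by mmap() always is. */
int pin_region(const void *addr, size_t len)
{
    if (mlock(addr, len) == -1) {
        perror("mlock");  /* typically a privilege or limit problem */
        return -1;
    }
    return 0;
}
```

The lock lasts only as long as the process and the mapping do, so this suits a long-running process rather than programs started every few minutes.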
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Chuck Dillon 2004-05-19, 5:40 pm |
| Barry Margolin wrote:
> In article <c8dmud$vr0$1@grandcanyon.binc.net>,
> Chuck Dillon <spam@nimblegen.com> wrote:
>
>
>
>
> I don't see the distinction you're making. If he mmaps the file, he
> also reads each block of the data exactly once -- the first time a
> particular page of the reference file is accessed. And any pages that
> aren't referenced will not be read at all, so there's a net savings.
>
> I don't think there's any way that mmap() can do *more* disk accesses
> than reading the entire file into memory. ...
I will freely admit that my understanding of mmap may be overly
conservative. But I know of no information that gives me a basis for
assuming that the underlying mmap implementation will read each block
exactly once even if there is sufficient memory in the system to hold
the entire file. I just read the mmap page at opengroup.org again and
still see no such basis. The documented behavior is that with mmap I
can address the contents of a file via a pointer. It doesn't assert
that I will get the best possible I/O performance but I know that to be
a widely held assumption of modern implementations. I also see no basis
for assuming exactly one read of each page even if sufficient memory
exists and I know of no widely held assumption on that point.
I also admit that I make few assumptions about the underlying
implementation of shared memory. But I know that there is no inherent
I/O component to it so I assume that all things being equal at best
mmap will match the performance of shared memory in the kind of
application that I think the OP is describing.
Can you please give me a reference that supports your description of
mmap behavior?
-- ced
>
> I don't think there's any way that mmap() can do *more* disk accesses
> than reading the entire file into memory. However, it's possible that
> there can be some benefit to your scheme by shifting when the delays
> occur -- perhaps a long pause at the beginning to load the data into
> memory is better than occasional delays when an as-yet-unreferenced page
> is being brought into memory for the first time. Of course, this
> assumes that there's enough RAM that the reference file will never get
> paged out, which can be a *big* if if there's lots of other data
> streaming through the application. If it doesn't all stay in RAM, then
> the LRU portions of the reference file will get paged out to the swap
> partition, and they'll have to be paged back in just as it would to get
> them from the reference file.
>
--
Chuck Dillon
Senior Software Engineer
NimbleGen Systems Inc.
| |
| David Schwartz 2004-05-19, 8:36 pm |
| Chuck Dillon wrote:
> I will freely admit that my understanding of mmap may be overly
> conservative. But I know of no information that gives me a basis for
> assuming that the underlying mmap implementation will read each block
> exactly once even if there is sufficient memory in the system to hold
> the entire file.
You would have to have a pretty good understanding of how page caching
on modern UNIXes works to understand why this is almost certain to be the
case. You'd have to work to make this not happen.
> I just read the mmap page at opengroup.org again and
> still see no such basis.
Of course. The documentation doesn't require any particular virtual
memory or caching implementation. You will almost never find performance
information in standard manual pages because standards don't usually talk
about performance. It's up to each implementation to find the most efficient
way to implement each function.
> The documented behavior is that with mmap I
> can address the contents of a file via a pointer. It doesn't assert
> that I will get the best possible I/O performance but I know that to
> be a widely held assumption of modern implementations.
How could it ever? That would, for example, prevent an implementation
from providing its own super-fast direct I/O if it ever exceeded mmap's
performance under any circumstances.
> I also see no
> basis for assuming exactly one read of each page even if sufficient
> memory exists and I know of no widely held assumption on that point.
The basis is an understanding of how disk caching and virtual memory
work on modern operating systems.
> I also admit that I make few assumptions about the underlying
> implementation of shared memory.
Well, don't make assumptions; learn how it actually works. Think about
what happens when you execute a file, for example. The file is simply mapped
into the process's memory space. The pages fault into the process's memory
space and the disk cache at the same time.
> Can you please give me a reference that supports your description of
> mmap behavior?
The source code of Linux, FreeBSD, any books on the internals of these
operating systems or Solaris.
DS
| |
| Barry Margolin 2004-05-19, 8:36 pm |
| In article <c8gds3$m7c$1@grandcanyon.binc.net>,
Chuck Dillon <spam@nimblegen.com> wrote:
> Barry Margolin wrote:
>
> I will freely admit that my understanding of mmap may be overly
> conservative. But I know of no information that gives me a basis for
> assuming that the underlying mmap implementation will read each block
> exactly once even if there is sufficient memory in the system to hold
> the entire file. I just read the mmap page at opengroup.org again and
> still see no such basis. The documented behavior is that with mmap I
> can address the contents of a file via a pointer. It doesn't assert
> that I will get the best possible I/O performance but I know that to be
> a widely held assumption of modern implementations. I also see no basis
> for assuming exactly one read of each page even if sufficient memory
> exists and I know of no widely held assumption on that point.
OK, so it doesn't say this about mmap'ed files. But I'll bet it doesn't
say it about the shared memory created with shmget(), either.
>
> I also admit that I make few assumptions about the underlying
> implementation of shared memory. But I know that there is no inherent
> I/O component to it so I assume that all things being equal at best
> mmap will match the performance of shared memory in the kind of
> application that I think the OP is describing.
Sure there's an I/O component to shared memory -- it uses the swap
partition as its backing store, while mmap() uses the file as its
backing store.
Except on a very unusual virtual memory implementation, I'd expect them
both to be treated exactly the same in terms of page replacement
strategy.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Rich Gibbs 2004-05-19, 11:40 pm |
| Chuck Dillon said the following, on 05/19/04 15:52:
> Barry Margolin wrote:
>
>
>
> I will freely admit that my understanding of mmap may be overly
> conservative. But I know of no information that gives me a basis for
> assuming that the underlying mmap implementation will read each block
> exactly once even if there is sufficient memory in the system to hold
> the entire file. I just read the mmap page at opengroup.org again and
> still see no such basis. The documented behavior is that with mmap I
> can address the contents of a file via a pointer. It doesn't assert
> that I will get the best possible I/O performance but I know that to be
> a widely held assumption of modern implementations. I also see no basis
> for assuming exactly one read of each page even if sufficient memory
> exists and I know of no widely held assumption on that point.
>
> I also admit that I make few assumptions about the underlying
> implementation of shared memory. But I know that there is no inherent
> I/O component to it so I assume that all things being equal at best mmap
> will match the performance of shared memory in the kind of application
> that I think the OP is describing.
>
> Can you please give me a reference that supports your description of
> mmap behavior?
>
In general, you will not see performance data or implementation detail
in the manual pages, any more than you will see information about math
library algorithmic details in the standard for the C language (for
example). That level is meant to describe the architecture of the
system as it is visible to the programmer, without constraining the
implementation by over-specifying it. I guess the underlying assumption
is that the programmer can learn of performance trade-offs either by
experiment or experience (his or someone else's).
The key thing here is that virtual memory in Linux is demand paged. (I
am pretty sure this is true of BSD Unices, too. It was certainly true
of SunOS back in the 4.1.x days, and that was derived from BSD.) Doing
an 'mmap' reserves the virtual address space needed to hold the file's
contents, but physical memory is not actually allocated (apart from
control information) until the page is actually referenced. (It's
similar to how 'fork' works: the new process has a copy of the forking
process's virtual address space, but the kernel doesn't make an extra
physical copy unless the child process modifies the contents of a page.)
So I would expect that, if there is enough memory, each pageful of
data from the reference file(s) will be read once, when it is first
used. If the files are read-only, then the original file is the
backing store, so no extra swap space is needed. Swapping may, of
course, occur if physical memory is insufficient in the whole system
scheme of things.
Apart from the source code itself, there are a couple of documents that
describe how Linux VM works. One is a paper by Mel Gorman, which you
can obtain here:
http://www.skynet.ie/~mel/projects/vm/
I gather that a book-length version is in the works.
There is also a Kernel Trap interview with Andrea Arcangeli, who was the
principal developer of the current Linux VM management code:
http://kerneltrap.org/node/view/3148
HTH,
/Rich
--
Rich Gibbs
rgibbs AT alumni DOT princeton DOT edu
| |
| Chuck Dillon 2004-05-20, 5:37 pm |
| Barry Margolin wrote:
>
> Except on a very unusual virtual memory implementation, I'd expect them
> both to be treated exactly the same in terms of page replacement
> strategy.
>
OK, I'm convinced. If the OP is using a modern OS (that includes Linux,
right? ;-) and either puts the reference data on a storage device that has
performance characteristics comparable to the system swap areas or
ensures that there is sufficient physical memory to cache the entire
file, then mmap is the most efficient way to go, because none of the
reference data ever gets copied into the swap area and the OS has no
prejudices in favor of shared memory segments versus mmap'd segments.
Have I got that right?
-- ced
--
Chuck Dillon
Senior Software Engineer
NimbleGen Systems Inc.
| |
| Chuck Dillon 2004-05-20, 5:37 pm |
| David Schwartz wrote:
> Chuck Dillon wrote:
>
>
>
> Well don't make assumptions, learn how it actually works. Think about
> what happens when you execute a file, for example. The file is simply mapped
> into the process's memory space. The pages fault into the process' memory
> space and the disk cache at the same time.
The OP makes no mention of any specific system, so I have no way to
experiment and measure. The OP was asking for suggestions so that he
might do such experiments.
Thanks for confirming what Barry was saying.
-- ced
--
Chuck Dillon
Senior Software Engineer
NimbleGen Systems Inc.