sharing mmap()'d address space between threads
| mARK bLOORE 2006-04-27, 7:56 am |
| i would like to use mmap() to let several threads map a file into the
address space once, ie share the mapping. i had expected MAP_SHARED to
do that, but it does not. MAP_FIXED appears to be able to force it,
but the mmap() call happens in a library which i would like to avoid
modifying.
can anyone tell me just what MAP_SHARED means to mmap()? does it
really share a mapping between processes without using the same memory
in each? that implies that it must copy updates, to all mappings,
which seems strange.
below is some code i've used to test mmap() effects. without MAP_FIXED
the second mmap() call returns a different address than the first. i
get the same thing if the mmap() calls are done in different threads or
different processes.
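#include <cerrno>       // for errno.
#include <iostream>
#include <sys/types.h>  // for off_t.
#include <sys/stat.h>   // for stat().
#include <fcntl.h>      // for open(), O_RDONLY.
#include <sys/mman.h>   // for mmap().
using namespace std;

int main (int argc, char* argv[])
{
    if (argc < 2)
    {
        cerr << "usage: " << argv[0] << " <file>" << endl;
        return 1;
    }
    int fd1 = open (argv[1], O_RDONLY);
    int fd2 = open (argv[1], O_RDONLY);
    if (fd1 == -1 || fd2 == -1)
    {
        cerr << "open (" << argv[1] << ") failed: " << errno << endl;
        return 1;
    }
    struct stat info;
    if (stat (argv[1], &info) == -1)
    {
        cerr << "stat (" << argv[1] << ") failed: " << errno << endl;
        return 1;
    }
    off_t size = info.st_size;
    // first mapping: let the kernel choose the address.
    void* start = mmap (0, size, PROT_READ, MAP_SHARED, fd1, 0);
    if (start == MAP_FAILED)
    {
        cerr << "first mmap() failed: " << errno << endl;
        return 1;
    }
    cout << "Mapped " << size << " bytes at " << hex << start << dec << endl;
    // with MAP_FIXED the second mapping lands on the same address:
    // start = mmap (start, size, PROT_READ, MAP_SHARED | MAP_FIXED, fd2, 0);
    // without MAP_FIXED the kernel picks a different address.
    start = mmap (0, size, PROT_READ, MAP_SHARED, fd2, 0);
    if (start == MAP_FAILED)
    {
        cerr << "second mmap() failed: " << errno << endl;
        return 1;
    }
    cout << "Mapped " << size << " bytes at " << hex << start << dec << endl;
    return 0;
}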
| |
| Gordon Burditt 2006-04-27, 7:56 am |
| >i would like to use mmap() to let several threads map a file into the
>address space once, ie share the mapping. i had expected MAP_SHARED to
>do that, but it does not. MAP_FIXED appears to be able to force it,
>but the mmap() call happens in a library which i would like to avoid
>modifying.
>
>can anyone tell me just what MAP_SHARED means to mmap()? does it
>really share a mapping between processes without using the same memory
>in each?
My interpretation from the manual pages: it shares a mapping
between processes without using the same *VIRTUAL MEMORY ADDRESSES*
in each.
>that implies that it must copy updates, to all mappings,
>which seems strange.
No, it simply maps the memory into different address ranges in
different processes. Or, at least that's the sensible way
to do it.
Gordon L. Burditt
| |
| William Ahern 2006-04-27, 7:56 am |
| On Thu, 20 Apr 2006 14:45:27 -0700, mARK bLOORE wrote:
> i would like to use mmap() to let several threads map a file into the
> address space once, ie share the mapping. i had expected MAP_SHARED to
> do that, but it does not. MAP_FIXED appears to be able to force it,
> but the mmap() call happens in a library which i would like to avoid
> modifying.
Using MAP_FIXED you're not guaranteed the address you specify can be used
to map the region. Changing your compiler, linker, kernel, system clock or
just staring at the wall could make it start to fail.
> can anyone tell me just what MAP_SHARED means to mmap()? does it
> really share a mapping between processes without using the same memory
> in each? that implies that it must copy updates, to all mappings,
> which seems strange.
MAP_SHARED means just that, that updates can be seen by other processes.
No copying is done, though. The virtual memory subsystem takes care of the
magic. In a typical implementation all processes update the same backing
store, even if the update happens through different pointer values in each
process.
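A minimal sketch of that behaviour, assuming a POSIX system with a writable /tmp. The function name shared_write_is_visible is made up for illustration. It maps one file twice, stores a byte through the first mapping, and reads it back through the second, which sits at a different address:

```cpp
#include <cstddef>
#include <cstdlib>      // mkstemp()
#include <fcntl.h>
#include <sys/mman.h>   // mmap(), munmap()
#include <unistd.h>     // ftruncate(), unlink(), close()

// Hypothetical helper: maps one temporary file twice with MAP_SHARED, writes
// through the first mapping and checks the byte through the second.
// Returns 1 if the two addresses differ and the write is visible,
// 0 if not, -1 if a system call failed.
int shared_write_is_visible()
{
    char path[] = "/tmp/mmap_shared_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    unlink(path);                       // the open descriptor keeps the file alive

    const std::size_t len = 4096;       // arbitrary file length for the demo
    if (ftruncate(fd, len) == -1) { close(fd); return -1; }

    void* a = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                          // mappings stay valid after close()
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
        return -1;
    }

    static_cast<char*>(a)[0] = 'x';     // store through the first mapping
    const bool visible = (a != b) && static_cast<const char*>(b)[0] == 'x';

    munmap(a, len);
    munmap(b, len);
    return visible ? 1 : 0;
}
```

Nothing is copied between the two ranges here; both are views of the same file pages.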
>
> below is some code i've used to test mmap() effects. without MAP_FIXED
> the second mmap() call returns a different address than the first. i
> get the same thing if the mmap() calls are done in different threads or
> different processes.
>
Yep, which is why you typically index into the shared region using
offsets*. Direct pointers into shared memory mapped regions are
tantalizing but dangerous and best avoided (unless the pointer is
ultimately derived in the process from an offset calculation).
* The exception is when child processes access the region: since fork()
preserves the address layout, memory-mapped regions opened before forking
are useful for sharing data between related processes.
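As a rough illustration of indexing by offset, here is a sketch of a chain of records that link to each other by byte offset from the start of the region instead of by pointer. The Node layout and the names read_node and sum_chain are assumptions made for this example, not anything from the thread:

```cpp
#include <cstdint>
#include <cstring>

// A record stored inside the shared region. `next` is a byte offset from the
// start of the region, not a pointer; 0 means "end of chain".
struct Node {
    std::int32_t  value;
    std::uint32_t next;
};

// Reads the record at `offset` bytes past `base`, where `base` is whatever
// address this process (or this mapping) happens to have for the region.
Node read_node(const void* base, std::uint32_t offset)
{
    Node n;
    std::memcpy(&n, static_cast<const char*>(base) + offset, sizeof n);
    return n;
}

// Sums the values along the chain that starts at offset `first`.
long sum_chain(const void* base, std::uint32_t first)
{
    long total = 0;
    for (std::uint32_t off = first; off != 0; ) {
        const Node n = read_node(base, off);
        total += n.value;
        off = n.next;
    }
    return total;
}
```

Because only offsets are stored in the region, the same bytes can be walked from any base address, so each process passes in its own mmap() result.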
| |
| Paul Pluzhnikov 2006-04-27, 7:56 am |
| "mARK bLOORE" <mbloore@gmail.com> writes:
> i would like to use mmap() to let several threads map a file into the
> address space once, ie share the mapping.
The threads (assuming POSIX threads) *already* share all mappings.
Just map the file once, store the address returned by mmap() in a
global, and you can use it from any thread.
Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
| |
| Barry Margolin 2006-04-27, 7:56 am |
| In article <pan.2006.04.20.22.19.28.541888@25thandClement.com>,
William Ahern <william@25thandClement.com> wrote:
> On Thu, 20 Apr 2006 14:45:27 -0700, mARK bLOORE wrote:
>
....
>
> Yep, which is why you typically index into the shared region using
> offsets*. Direct pointers into shared memory mapped regions are
> tantalizing but dangerous and best avoided (unless the pointer is
> ultimately derived in the process from an offset calculation).
>
> * The exception is when child processes access the region; since fork()
> preserves the address layout memory mapped regions opened before fork'ing
> are useful for sharing data between related processes.
Another exception is when using threads, since all threads of a process
share the same address space.
The first line of the OP's message said he's using threads, not separate
processes, so I wonder why he's having any issues with this.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
| |
| mbloore 2006-04-27, 7:56 am |
| thanks for the information.
i was using MAP_FIXED to force re-use of the address space already
allocated for the mapping in the previous call. would that use be
independent of the phase of the moon?
my big problems are first that i don't want to modify the library that
does the mmap(), and second that i want to map a large file that
quickly exhausts the virtual address space if mapped to different
areas.
because the mmap() is done by a library, and the library produces
objects which are not thread safe, each thread must create its own
object, which makes its own mmap() call. the file is always read-only,
so the use of it is safe.
| |
| Barry Margolin 2006-04-27, 7:56 am |
| In article <1145653674.868091.148710@e56g2000cwe.googlegroups.com>,
"mbloore" <mbloore@gmail.com> wrote:
> thanks for the information.
>
> i was using MAP_FIXED to force re-use of the address space already
> allocated for the mapping in the previous call. would that use be
> independent of the phase of the moon?
I don't think you can map the same address multiple times at the same
time.
> my big problems are first that i don't want to modify the library that
> does the mmap(), and second that i want to map a large file that
> quickly exhausts the virtual address space if mapped to different
> areas.
>
> because the mmap() is done by a library, and the library produces
> objects which are not thread safe, each thread must create its own
> object, which makes its own mmap() call. the file is always read-only,
> so the use of it is safe.
How is it that you're able to change the library to use MAP_FIXED, but
can't change it to simply reuse an existing mapping instead of calling
mmap() multiple times?
| |
| Paul Pluzhnikov 2006-04-27, 7:56 am |
| Barry Margolin <barmar@alum.mit.edu> writes:
>
> I don't think you can map the same address multiple times at the same
> time.
You can, kind of. From the Solaris 10 "man 2 mmap":
If a MAP_FIXED request is successful, the mapping established
by mmap() replaces any previous mappings for the process's
pages in the range [pa, pa + len).
So forcing mmap(..., MAP_FIXED, ...) for a read-only mapping at
the same address/offset/size is in effect exactly equivalent to
doing mmap once -- you replace the previous mapping with one that
shows the exact same data.
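A small sketch of that replacement, assuming a POSIX system with a writable /tmp. The function name remap_fixed_at_same_address is invented for the example. It maps a file, maps it again with MAP_FIXED at the address the first call returned, and checks that the address and the contents are unchanged:

```cpp
#include <cstddef>
#include <cstdlib>      // mkstemp()
#include <cstring>      // memcmp()
#include <sys/mman.h>
#include <unistd.h>

// Returns 1 if the MAP_FIXED mapping lands on the first mapping's address
// and shows the same bytes, 0 if not, -1 if a system call failed.
int remap_fixed_at_same_address()
{
    char path[] = "/tmp/mmap_fixed_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    unlink(path);                       // the open descriptor keeps the file alive

    const char text[] = "same bytes";
    const std::size_t len = sizeof text;
    if (write(fd, text, len) != static_cast<ssize_t>(len)) { close(fd); return -1; }

    void* first = mmap(nullptr, len, PROT_READ, MAP_SHARED, fd, 0);
    if (first == MAP_FAILED) { close(fd); return -1; }

    // `first` came from mmap(), so it is page-aligned and usable as a
    // MAP_FIXED address. The new mapping replaces the old one in that range.
    void* second = mmap(first, len, PROT_READ, MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);
    if (second == MAP_FAILED) { munmap(first, len); return -1; }

    const bool same = (second == first) && std::memcmp(second, text, len) == 0;
    munmap(second, len);                // only one mapping exists in the range
    return same ? 1 : 0;
}
```

Note that a single munmap() then removes the mapping for every user of that address, so this trick gives no reference counting.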
> How is it that you're able to change the library to use MAP_FIXED, but
> can't change it to simply reuse an existing mapping instead of calling
> mmap() multiple times?
Perhaps it is easier to binary-patch the instruction that loads the
constant into the mmap flags than it is to modify the entire code path?
Cheers,
| |
| mbloore 2006-04-27, 7:56 am |
| > How is it that you're able to change the library to use MAP_FIXED, but
> can't change it to simply reuse an existing mapping instead of calling
> mmap() multiple times?
my app is a server, which handles requests as they arrive. each one
maps the file it needs (usually, but not always, the same file). to
have them share a single mapping explicitly will mean a lot of extra
code to keep track of the extant mappings. since the OS already knows
what's mapped where, this seems like duplication that should be
unnecessary.
is there any way to get a list of mapped files from the OS? i'll still
have to hack the library, but at least it would be localized.
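For a sense of how much bookkeeping explicit sharing would take, here is a sketch of a small mapping registry. The class name MappingCache and its interface are purely hypothetical. It keys mappings by the file's device and inode numbers from fstat(), so two paths to the same file share one mapping, and it assumes the files are read-only and do not change size while mapped:

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <utility>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: at most one read-only mapping per file, handed out to
// any number of callers and unmapped when the last caller releases it.
class MappingCache {
public:
    struct View { const void* data; std::size_t size; };

    // Returns the existing mapping of the file if there is one, otherwise
    // maps it. Returns {nullptr, 0} on failure or for an empty file.
    View acquire(const char* path)
    {
        const View failed = {nullptr, 0};
        int fd = open(path, O_RDONLY);
        if (fd == -1) return failed;
        struct stat info;
        if (fstat(fd, &info) == -1 || info.st_size <= 0) { close(fd); return failed; }
        const Key key(info.st_dev, info.st_ino);

        std::lock_guard<std::mutex> lock(mutex_);
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            void* p = mmap(nullptr, info.st_size, PROT_READ, MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) { close(fd); return failed; }
            const Entry fresh = {p, static_cast<std::size_t>(info.st_size), 0};
            it = entries_.emplace(key, fresh).first;
        }
        close(fd);                      // the mapping outlives the descriptor
        ++it->second.users;
        const View view = {it->second.addr, it->second.size};
        return view;
    }

    // Drops one reference; unmaps the file when nobody uses it any more.
    void release(const View& view)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto it = entries_.begin(); it != entries_.end(); ++it) {
            if (it->second.addr != view.data) continue;
            if (--it->second.users == 0) {
                munmap(it->second.addr, it->second.size);
                entries_.erase(it);
            }
            return;
        }
    }

    // Number of files currently mapped through this cache.
    std::size_t mapped_files() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return entries_.size();
    }

private:
    typedef std::pair<dev_t, ino_t> Key;
    struct Entry { void* addr; std::size_t size; int users; };

    mutable std::mutex mutex_;
    std::map<Key, Entry> entries_;
};
```

Each request would call acquire() in place of its own mmap() and release() in place of munmap(), so the change stays local to the spot where the library maps the file.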
| |
|
| mbloore wrote:
>
>
> my app is a server, which handles requests as they arrive. each one
> maps the file it needs (usually, but not always, the same file). to
> have them share a single mapping explicitly will mean a lot of extra
> code to keep track of the extant mappings. since the OS already knows
> what's mapped where, this seems like duplication that should be
> unnecessary.
>
> is there any way to get a list of mapped files from the OS? i'll still
> have to hack the library, but at least it would be localized.
IMHO there is no real reason to do this.
It would only duplicate the administrative overhead which is already
done by the OS. Just { open() mmap() close() do_stuff() munmap() }
It is very hard to beat the OS's LRU cache (+name cache + inode cache).
For MAP_SHARED mappings, processes (even threads within a process,
though that is ugly) see the same RAM, backed by the same file,
even if they mapped it at different addresses.
For writers, there is of course the lost-update problem (plus other
concurrency problems), but that is a different matter.
HTH,
AvK