mmap() a file into three pieces
|
|
| tuantran167@gmail.com 2005-03-22, 2:51 am |
| Hi all,
I am new to this list and also a newbie at Unix programming. I have a
question concerning mmap(). I have a large file (~1GB) that I would like
to access using mmap(). However, I want to mmap it in three separate
pieces, and I have not been able to do it. Could anyone provide an
example of how to do it?
TIA,
TAT
| |
| Artie Gold 2005-03-23, 2:52 am |
| tuantran167@gmail.com wrote:
> Hi all,
>
> I am new on this list and also a newbie on unix programming. I have a
> question concerning mmap(). I have a large file (~1GB). I would like to
> access this file using mmap(). However, I want to mmap it into three
> pieces. However, I was not able to do it. I wonder if anyone can
> provide an example of how to do it.
>
> TIA,
> TAT
>
What have you tried (post code)? It looks pretty straightforward to me
-- three mmap() calls on the same fd with different (page aligned)
offsets, the first one likely zero.
HTH,
--ag
--
Artie Gold -- Austin, Texas
http://it-matters.blogspot.com (new post 12/5)
http://www.cafepress.com/goldsays
| |
| Barry Margolin 2005-03-23, 2:52 am |
| In article <3ac7rlF652jlbU1@individual.net>,
Artie Gold <artiegold@austin.rr.com> wrote:
> tuantran167@gmail.com wrote:
> What have you tried (post code)? It looks pretty straightforward to me
> -- three mmap() calls on the same fd with different (page aligned)
> offsets, the first one likely zero.
My guess is that the section of the address space used for mmap'ed files
isn't big enough for all three pieces to be mapped at once.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Artie Gold 2005-03-23, 2:52 am |
| Barry Margolin wrote:
> In article <3ac7rlF652jlbU1@individual.net>,
> Artie Gold <artiegold@austin.rr.com> wrote:
>
> My guess is that the section of the address space used for mmap'ed files
> isn't big enough for all three pieces to be mapped at once.
>
From an address space perspective, it shouldn't be a problem (depending
on whatever else is going on, of course). Perhaps "what do you *really*
want to do?" is the most appropriate question. ;-)
And perror() isn't bad either.
Cheers,
--ag
--
Artie Gold -- Austin, Texas
http://it-matters.blogspot.com (new post 12/5)
http://www.cafepress.com/goldsays
| |
| tuantran167@gmail.com 2005-03-23, 6:09 pm |
| Thanks to all of you for trying to answer my very vague question.
I wrote my code based on what is pasted below (I think I got it from
somewhere). I think that I didn't understand off_t correctly, and that
is why my code is not working. I would appreciate it if you could help
me understand. Thanks very much.
// my_mmap.h
#include <cstdlib>   // exit()
#include <cstring>   // memcpy()
#include <string>
#include <iostream>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/time.h>
using namespace std;
class MmapError {};
#define MMAP_ASSERT(ok) \
do { \
if(!(ok)) \
{ \
cerr << "Error: " << __FILE__ << " (" << __LINE__ << ")" << endl; \
throw MmapError(); \
} \
} while (0)
class My_Mmap {
public:
static const int invalid_fd = -1;
static const int invalid_pos= -1;
My_Mmap();
~My_Mmap();
void Open(const string& fileName, off_t offset);
int Read(char* buffer, int size) const;
void Close();
bool IsEnd() const;
private:
int m_Fd; void* m_P; int m_Size; mutable int m_Idx;
};
// my_mmap.cpp
#include "my_mmap.h"
using namespace std;
My_Mmap::My_Mmap() : m_Fd(invalid_fd), m_P(0), m_Size(0), m_Idx(0) {}
My_Mmap::~My_Mmap()
{
try
{
Close();
}
catch(...) {}
};
void My_Mmap::Open(const string& fileName, off_t offset)
{
// Make sure that file name is not empty
MMAP_ASSERT(!fileName.empty());
// Make sure that file exists
struct stat stat_buf;
MMAP_ASSERT(!stat(fileName.c_str(), &stat_buf));
// Close existing file if any
Close();
m_Fd = open(fileName.c_str(), O_RDWR, 0666);
MMAP_ASSERT(m_Fd >= 0);  // open() returns -1 on failure; 0 is a valid descriptor
if(fstat(m_Fd, &stat_buf))
{
close(m_Fd);
m_Fd = invalid_fd;
MMAP_ASSERT(0);
}
m_Size = stat_buf.st_size;
m_P = mmap(0, m_Size, PROT_WRITE | PROT_READ, MAP_PRIVATE, m_Fd,
offset);
m_Idx = 0;
if(m_P == MAP_FAILED)
{
close(m_Fd);
m_Fd = invalid_fd;
m_P = 0;
MMAP_ASSERT(0);
}
};
void My_Mmap::Close()
{
if(m_Fd != invalid_fd && m_P)
{
MMAP_ASSERT(m_Size >= 0);
int err1 = munmap(m_P, m_Size);
int err2 = close(m_Fd);
m_P = 0;
m_Size = 0;
m_Fd = invalid_fd;
m_Idx = 0;
MMAP_ASSERT(!err1);
MMAP_ASSERT(!err2);
}
};
int My_Mmap::Read(char* buffer, int size) const
{
MMAP_ASSERT(m_P);
MMAP_ASSERT(m_Size > 0);
MMAP_ASSERT(size > 1);
MMAP_ASSERT(m_Idx <= m_Size);
int data_size = size - 1;
if(data_size > m_Size - m_Idx)
data_size = m_Size - m_Idx;
memcpy(buffer, static_cast<char*>(m_P) + m_Idx, data_size);
buffer[data_size] = 0;
m_Idx += data_size;
return data_size;
};
bool My_Mmap::IsEnd() const {
return m_Idx == m_Size;  // comparison, not assignment
}
// main
#include "my_mmap.h"
int main() {
My_Mmap mmapFile;
string filename = "anyname";
off_t offset = 2;
mmapFile.Open(filename, offset);
while (!mmapFile.IsEnd()) {
char buffer[30];
int size = mmapFile.Read(buffer, sizeof(buffer));  // never ask for more than the buffer holds
cout << buffer;
}
exit(0);
}
| |
| Barry Margolin 2005-03-23, 8:52 pm |
| In article <1111609571.612648.49500@g14g2000cwa.googlegroups.com>,
"tuantran167@gmail.com" <tuantran167@gmail.com> wrote:
> Thanks all of you for trying to answer my absolutely vague question.
> I create my code based on what I pasted below (I think I got it from
> somewhere). I think that I didn't understand off_t correctly. That's
> why my code is not working. I would appreciate if you can help me to
> understand. Thanks very much.
You haven't told us what "is not working". Are you getting an error?
Also, I don't see the "three pieces" in the code you provided. You just
mmap the file once.
The only actual error I noticed was that m_Size should be:
m_Size = stat_buf.st_size - offset;
Otherwise, you're trying to map beyond the end of the file.
BTW, after you call mmap() you don't need to keep the file open. I'd
probably close the fd in My_Mmap::Open rather than My_Mmap::Close.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| tat |
| Hi Barry,
Thanks for pointing out the error and for your suggestion. What I want
to do is the following:
Whenever the member function Open of My_Mmap is called, a portion of a
file will be mmapped. Suppose the file "testfile" is exactly 300 kB.
When the first Open(testfile,0) is called, the first portion of the
file, from byte 1 up to 100 kB, will be mmapped. When the second
Open(testfile,1) is called, the second portion, from byte 100001 up to
200 kB, will be mapped, and so on. I want to keep the option to either
mmap all three pieces at the same time or mmap only one or two pieces
at once, depending on the machine. I am confused about what offset
value I should use for each piece. For the first part, I fixed m_Size
at 100 kB and the offset at 0. For the other parts, the program just
gives me an error when it reaches MMAP_ASSERT(0). I think I don't know
how to calculate the offset value. I hope that you can help me.
TIA
| |
| Barry Margolin 2005-03-27, 5:54 pm |
| In article <1111941997.015068.173260@g14g2000cwa.googlegroups.com>,
"tat" <tuantran167@gmail.com> wrote:
> Hi Barry,
>
> Thanks for pointing an error and your suggestion. What I want to do is
> the following:
>
> Whenever, member function Open of My_Mmap is called, a portion of a
> file will be mmaped. Suppose the
> file "testfile" has exactly 300 kB. When the first Open(testfile,0) is
> called the first portion of the
> file from byte 1 upto 100kB will be mmapped. Then the second
> Open(testfile,1) is called, the second
> portion from byte 100001 to 200kB will be mapped and so on. I want to
> leave an option that I can
> either mmap all three pieces at the same time or mmap one or two pieces
> at once depending on a
> machine. I am kind of confused what offset value I should put for each.
> For the first part, what I did
> was to fix m_Size to be 100kB and off_t to be 0. For other part, the
> program just give me an error when it checks MMAP_ASSERT(0). I think I
> don't know how to calculate offset value. I hope that you can help me.
I didn't see anything in your code that fixed m_Size to be 100kB. All I
saw was m_Size=stat_buf.st_size, which sets the size to the total size
of the file.
The offset that you pass to mmap() should be 100000*offset.
If mmap() is reporting an error, you should look at errno to see what
it's complaining about.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| tat |
| Hi Barry,
Thanks for your reply.
I looked at the Linux man page for mmap(). It says that the offset
should be a multiple of the page size, which is returned by
getpagesize(). On my system this number is 4096; I don't know whether
it is the same on other systems. So I ran a test with the following
function:
void My_Mmap::Open(const string& fileName, int nbytes, off_t offset)
{
// Make sure that file name is not empty
MMAP_ASSERT(!fileName.empty());
// Make sure that file exists
struct stat stat_buf;
MMAP_ASSERT(!stat(fileName.c_str(), &stat_buf));
// Close existing file if any
Close();
m_Fd = open(fileName.c_str(), O_RDWR, 0666);
MMAP_ASSERT(m_Fd >= 0);  // open() returns -1 on failure; 0 is a valid descriptor
if(fstat(m_Fd, &stat_buf))
{
close(m_Fd);
m_Fd = invalid_fd;
MMAP_ASSERT(0);
}
m_Size = nbytes;
m_P = mmap(0, m_Size, PROT_WRITE | PROT_READ, MAP_PRIVATE, m_Fd,
offset);
m_Idx = 0;
if(m_P == MAP_FAILED)
{
close(m_Fd);
m_Fd = invalid_fd;
m_P = 0;
MMAP_ASSERT(0);
}
};
with nbytes = 54. If offset = 4096, I get exactly 54 bytes starting
from the 4097th byte of the file, which is expected. My question is: if
I want to mmap starting from the 4034th byte, how should I do it? If I
simply take offset = 4034, then I get a compile error. Basically, it
says something is wrong with m_P (the pointer to the memory location
where the file begins).
TAT
| |
| Barry Margolin 2005-04-08, 8:48 pm |
| In article <1112978133.133071.258030@f14g2000cwb.googlegroups.com>,
"tat" <tuantran167@gmail.com> wrote:
> Hi Barry,
>
> Thanks for your reply.
>
> I looked at man page of mmap() in linux. it says that offset should be
> a multiple of page size which is returned by getpagesize(). For my
> system, this number is 4096. I don't know whether it is the same for
> other systems. So run a test with the following function
....
> and nbytes = 54; If offset = 4096, I got exact 54 bytes starting from
> the 4097th byte of the file which is expected. My question is if I want
> to mmap starting from 4034th byte, how should I do it? If I simply take
> offset = 4034, then I got compile error. Basically, it says something
> wrong about m_P (pointer to the memory location where the file begins).
You get a compile error? How does the compiler know that the offset is
going to be 4034? I think you mean a runtime error.
Anyway, I think you know the answer -- the offset has to be a multiple
of the page size. To be sure, you really need to save errno after
mmap() returns (didn't I recommend this earlier in the thread?).
To make your code portable, you should use sysconf() to determine the
page size. Then you can round the offset down to a multiple of the page
size, and adjust the size by the amount you're rounding down by.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***