Unix Programming - Pipes and fd question. Large amounts of data.

Oded Shimon

2005-01-30, 7:50 am

I have a rather unique situation. I have 2 programs, neither of which I
have control over.
Program A writes into TWO fifos.
Program B reads from two fifos.

My program is the middle step.

The problem - neither program is aware of the other, and each writes into
or reads from either of the fifos at its own free will. Each will also
block until whatever data moving it did is complete.

Meaning, if I were to use the direct approach and have no middle step, the
programs would be thrown into a deadlock instantly, as one program will
write into fifo 1 while the other is reading from fifo 2.

The amount of data is very large, GBs of data in total, and at least 10 MB
a second or possibly as much as 300 MB a second. So efficiency in context
switching is very important.

Programs A & B both write and read using large chunks, usually 300k.

So far, my solution is using select() and non-blocking pipes. I also use
large buffers (20 MB). In my measurements, in the worst case the programs
write/read 6 MB before switching to the other fifo, so 20 MB is safe enough.

I have implemented this, but it has a major disadvantage - every write()
only writes 4k at a time, never more, because of how non-blocking pipes are
done. At 20,000 context switches a second, this method barely reaches
10 MB a second, if not less.
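
The select() approach described above can be sketched roughly as follows, in Python for brevity. This is not the poster's code, which was not included in the thread. The names relay, demo, program_a and program_b are hypothetical, anonymous pipes stand in for the real fifos, and only one of the two fifo pairs is shown. The 64 KiB read size is an arbitrary assumption.

```python
import os
import select
import threading


def relay(src_fd, dst_fd, read_size=65536, max_buffered=20 * 1024 * 1024):
    """Copy src_fd to dst_fd until EOF, holding up to max_buffered bytes.

    Hypothetical helper: one direction of the middle step, for one fifo.
    """
    os.set_blocking(src_fd, False)
    os.set_blocking(dst_fd, False)
    pending = bytearray()   # read from A, not yet accepted by B
    eof = False
    total = 0
    while not eof or pending:
        # Poll for input only while the buffer has room, and for
        # output only while there is something to send.
        rlist = [src_fd] if not eof and len(pending) < max_buffered else []
        wlist = [dst_fd] if pending else []
        readable, writable, _ = select.select(rlist, wlist, [])
        if readable:
            try:
                chunk = os.read(src_fd, read_size)
            except BlockingIOError:
                chunk = None
            if chunk == b"":
                eof = True              # the writer closed its end
            elif chunk:
                pending += chunk
        if writable:
            try:
                # A non-blocking write may accept only part of the data.
                sent = os.write(dst_fd, pending)
            except BlockingIOError:
                sent = 0
            del pending[:sent]
            total += sent
    return total


def demo(size=1_000_000):
    """Push `size` random bytes through relay() between two pipes."""
    a_r, a_w = os.pipe()    # stands in for the fifo program A writes
    b_r, b_w = os.pipe()    # stands in for the fifo program B reads
    data = os.urandom(size)
    received = bytearray()

    def program_a():
        view = memoryview(data)
        while view:
            view = view[os.write(a_w, view):]
        os.close(a_w)

    def program_b():
        while True:
            chunk = os.read(b_r, 300_000)
            if not chunk:
                break
            received.extend(chunk)

    threads = [threading.Thread(target=program_a),
               threading.Thread(target=program_b)]
    for t in threads:
        t.start()
    total = relay(a_r, b_w)
    os.close(b_w)           # lets program_b see EOF
    for t in threads:
        t.join()
    os.close(a_r)
    os.close(b_r)
    return total, bytes(received) == data
```

Each os.write() here moves at most what the pipe buffer can hold, so the number of system calls per megabyte depends on the pipe capacity of the system, which matches the complaint about small writes.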

Blocking pipes have an advantage - they can write large chunks at a time.
They have a more serious disadvantage though - the amount of data you ask
to be written/read IS the amount of data that will be written or read, and
the call will block until that much data is moved. I cannot know beforehand
exactly how much data the programs want, so this could easily fall into a
deadlock.

Ideally, I could do this:
my program: write(20mb);
program B: read(300k);
my program: write() returns with return value '300,000'

I was unable to find anything like this solution or similar.
No combination of blocking/non-blocking fds, nor any system call, will
give this.
I am looking for alternative/better suggestions.
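
For comparison with the ideal behaviour listed above, here is a small Python sketch of what a single non-blocking write() on a pipe does return. The function name nonblocking_write_once and the 8 MiB request size are assumptions for illustration. The count that comes back is limited by the free space in the pipe buffer, which is system-dependent, and not by what the reader later asks for.

```python
import os


def nonblocking_write_once(size=8 * 1024 * 1024):
    """Attempt one write of `size` bytes to an empty non-blocking pipe.

    Returns how many bytes the kernel accepted.
    """
    read_fd, write_fd = os.pipe()
    try:
        os.set_blocking(write_fd, False)
        # Nobody is reading, so the kernel can only take what fits in
        # the pipe's own buffer; write() returns that count, not `size`.
        return os.write(write_fd, bytes(size))
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

So a short count is available, but it reflects the pipe capacity rather than the 300k the reader wants.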

- ods15.

Barry Margolin

2005-01-30, 5:51 pm

In article <41fca6c2@news.012.net.il>,
Oded Shimon <ods15@ods15.dyndns.org> wrote:

> I have a rather unique situation. I have 2 programs, neither of which I
> have control over.
> Program A writes into TWO fifos.
> Program B reads from two fifos.
>
> My program is the middle step.
>
> The problem - neither program is aware of the other, and each writes into
> or reads from either of the fifos at its own free will. Each will also
> block until whatever data moving it did is complete.
>
> Meaning, if I were to use the direct approach and have no middle step, the
> programs would be thrown into a deadlock instantly, as one program will
> write into fifo 1 while the other is reading from fifo 2.


This is incredibly poor design of those programs. Why would they write
the programs in such a way that you have to write this intermediate
program to prevent deadlock? Are you supposed to be writing to files,
and have chosen to use fifos instead?

>
> The amount of data is very large, GBs of data in total, and at least 10 MB
> a second or possibly as much as 300 MB a second. So efficiency in context
> switching is very important.
>
> Programs A & B both write and read using large chunks, usually 300k.
>
> So far, my solution is using select() and non-blocking pipes. I also use
> large buffers (20 MB). In my measurements, in the worst case the programs
> write/read 6 MB before switching to the other fifo, so 20 MB is safe enough.
>
> I have implemented this, but it has a major disadvantage - every write()
> only writes 4k at a time, never more, because of how non-blocking pipes are
> done. At 20,000 context switches a second, this method barely reaches
> 10 MB a second, if not less.
>
> Blocking pipes have an advantage - they can write large chunks at a time.
> They have a more serious disadvantage though - the amount of data you ask
> to be written/read IS the amount of data that will be written or read, and
> the call will block until that much data is moved. I cannot know beforehand
> exactly how much data the programs want, so this could easily fall into a
> deadlock.


Blocking read shouldn't wait for all the data you ask for. It should
return whenever something is available.
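
A minimal Python sketch of that point: a blocking read() on a pipe returns as soon as any data is available, even when far more was requested. The function name blocking_read_demo is hypothetical, and an anonymous pipe stands in for a fifo.

```python
import os


def blocking_read_demo(requested=300_000):
    """Ask a blocking pipe for `requested` bytes when only 5 are there."""
    read_fd, write_fd = os.pipe()   # both ends are blocking by default
    try:
        os.write(write_fd, b"hello")        # only 5 bytes are available
        # The read does not wait for 300,000 bytes; it returns what
        # is already in the pipe.
        return os.read(read_fd, requested)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

The call returns b"hello" immediately. This covers the read side only; a blocking write() of a large chunk still waits until all of it has been accepted.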

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

DINH Viet Hoa

2005-01-30, 5:51 pm

Oded Shimon wrote :

> I was unable to find anything like this solution or similar.
> No combination of blocking/non-blocking fds, nor any system call, will
> give this.
> I am looking for alternative/better suggestions.


You could use a shared memory block and use the pipe only to signal
that a big chunk has been written and must be read by the other
program.

--
DINH V. Hoa,

"un joint tu vas pas avoir ton cerveau détruit à la longue" -- b.

Barry Margolin

2005-01-30, 5:51 pm

In article <etPan.41fcf014.251a40c0.5d9b@utopia>,
DINH Viet Hoa <dinh.viet.hoa@free.fr> wrote:

> Oded Shimon wrote :
>
>
> you could use a shared memory block and use the pipe only to signal
> that a big chunk has been written and must be read by the other
> program.


Did you miss the part where he said he couldn't modify the reader and
writer programs? All he can do is insert an intermediary.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

DINH Viet Hoa

2005-01-30, 5:51 pm

Barry Margolin wrote :

>
> Did you miss the part where he said he couldn't modify the reader and
> writer programs? All he can do is insert an intermediary.


It seems that I missed that part, sorry ;)

--
DINH V. Hoa,

"un joint tu vas pas avoir ton cerveau détruit à la longue" -- b.

Gianni Mariani

2005-01-30, 5:51 pm

Oded Shimon wrote:
> I have a rather unique situation. I have 2 programs, neither of which I
> have control over.
> Program A writes into TWO fifos.
> Program B reads from two fifos.
>
> My program is the middle step.


Use 4 threads: for each of the two pipes, a reader thread that reads into a
large memory buffer and a writer thread that writes from this buffer into
the outgoing pipe.
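
One possible reading of this suggestion, sketched in Python. The post gives no code, so the names pump, write_all, read_exact and demo are hypothetical, a bounded queue plays the role of the large memory buffer, and anonymous pipes stand in for the fifos. Calling pump() once per fifo pair gives the 4 threads.

```python
import os
import queue
import threading


def write_all(fd, data):
    """Blocking write that loops until every byte has been accepted."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def pump(src_fd, dst_fd, max_chunks=64, chunk_size=300_000):
    """Relay src_fd to dst_fd with one reader and one writer thread."""
    # At most 64 chunks of up to 300k each, so roughly 19 MB buffered.
    buf = queue.Queue(maxsize=max_chunks)

    def reader():
        while True:
            chunk = os.read(src_fd, chunk_size)   # blocking, may be short
            buf.put(chunk)
            if not chunk:       # b"" is EOF and also stops the writer
                return

    def writer():
        while True:
            chunk = buf.get()
            if not chunk:
                return
            write_all(dst_fd, chunk)

    threads = [threading.Thread(target=reader),
               threading.Thread(target=writer)]
    for t in threads:
        t.start()
    return threads


def read_exact(fd, n):
    """Read until n bytes have arrived or the pipe reports EOF."""
    got = bytearray()
    while len(got) < n:
        chunk = os.read(fd, n - len(got))
        if not chunk:
            break
        got += chunk
    return bytes(got)


def demo():
    """Relay two independent streams at once: 2 pumps, 4 threads."""
    def feed(fd, data):
        write_all(fd, data)
        os.close(fd)

    links = []
    for payload in (os.urandom(500_000), os.urandom(700_000)):
        a_r, a_w = os.pipe()    # stands in for a fifo program A writes
        b_r, b_w = os.pipe()    # stands in for a fifo program B reads
        workers = pump(a_r, b_w)
        feeder = threading.Thread(target=feed, args=(a_w, payload))
        feeder.start()
        links.append((payload, b_r, workers + [feeder], (a_r, b_r, b_w)))
    results = []
    # Drain the second stream first, to show neither pump stalls the other.
    for payload, b_r, workers, fds in reversed(links):
        results.append(read_exact(b_r, len(payload)) == payload)
        for t in workers:
            t.join()
        for fd in fds:
            os.close(fd)
    return results
```

Because every thread uses plain blocking calls, a stall on one fifo only parks the thread serving it, and the other fifo keeps moving.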

> I am looking for alternative/better suggestions.

Johan

2005-01-30, 5:51 pm

Why not use sockets?

John

"Oded Shimon" <ods15@ods15.dyndns.org> schreef in bericht
news:41fca6c2@news.012.net.il...
> I have a rather unique situation. [...]



Paul Sheer

2005-01-30, 5:51 pm

On Sun, 30 Jan 2005 11:20:02 +0200, Oded Shimon wrote:

> I have a rather unique situation. [...]


Type
man select_tut
on a Linux system, or look up this man page on Google.

It will explain all about non-blocking I/O.

-paul

Barry Margolin

2005-01-31, 2:55 am

In article <10vq0pkf5ugou99@corp.supernews.com>,
"Johan" <me@knoware.nl> wrote:

> why not use sockets


You can't change from pipes to sockets without modifying the programs,
since you have to use different system calls to open them: socket() and
connect() in the writer; socket(), bind(), listen(), and accept() in the
reader. The programs presumably currently use open().

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Claeys Marc

2005-01-31, 5:55 pm

Barry Margolin wrote:

> In article <10vq0pkf5ugou99@corp.supernews.com>,
> "Johan" <me@knoware.nl> wrote:
>
>
>
>
> You can't change from pipes to sockets without modifying the programs,
> since you have to use different system calls to open them: socket() and
> connect() in the writer; socket(), bind(), listen(), and accept() in the
> reader. The programs presumably currently use open().


What about a preload library that has a version of open() that checks
whether the 1st arg is a named pipe, translates the call if so, and
otherwise calls the dl.. set?


