Data Storage - writing simultaneously to 2 or more network filesystems


epanepucci@gmail.com

2006-06-20, 1:12 pm

Hello,

We have a computer (DAQ) that will generate 300GB/hour and we will
process the data (images of 18 - 30MB) using a cluster.

The data processing *must* occur at the same time as the "data
acquisition" so we do not want to use the same network or file-server
in order not to slow down either of the processes (data acquisition /
data processing).

My question: Is there any technology (either hardware or software
based) that will allow the DAQ computer to simultaneously write the
data to 2 or more different computers? Possibly using many network
cards, special switches...

At first I thought that maybe I could NFS mount two NFS resources to
the same mount point on the DAQ computer with some write-only option
but I could not find any docs on this scenario.

Your comments are very welcome.

Cheers,
E.Panepucci
Swiss Light Source

jpd

2006-06-20, 7:13 pm

Begin <1150807154.903426.171800@h76g2000cwa.googlegroups.com>
On 2006-06-20, epanepucci@gmail.com <epanepucci@gmail.com> wrote:
> The data processing *must* occur at the same time as the "data
> acquisition" so we do not want to use the same network or file-server
> in order not to slow down either of the processes (data acquisition /
> data processing).


You do need some form of transport between acquisition and processing.
If that isn't acceptable, you need to eliminate any intermediates, which
probably means putting both functions on the same hardware. Or perhaps
I'm misunderstanding what you mean here?


> My question: Is there any technology (either hardware or software
> based) that will allow the DAQ computer to simultaneously write the
> data to 2 or more different computers? Possibly using many network
> cards, special switches...


Of course. For example, you could write your own little program that
takes the image and sends it to two (or more) client machines. There
probably already exist solutions that do this on various levels and
in various ways. Which would suit your problem I can't say, as you
are pretty sparse on the rest of the requirements, especially WRT
reliability and failure modes.
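
A minimal sketch of such a "little program", assuming each image can be
read as a byte stream and each destination is an ordinary binary
file-like object (for example a socket wrapped with makefile("wb")).
The name fan_out and the 1 MiB chunk size are illustrative choices, not
something from this thread, and nothing here addresses reliability or
failure handling.

```python
import io
from typing import BinaryIO, Iterable

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def fan_out(source: BinaryIO, sinks: Iterable[BinaryIO],
            chunk_size: int = CHUNK_SIZE) -> int:
    """Copy all of source to every sink; return the bytes read from source."""
    sinks = list(sinks)
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        for sink in sinks:
            # Buffered file objects write the whole chunk; a raw socket
            # would need sendall() instead of write().
            sink.write(chunk)
        total += len(chunk)
    for sink in sinks:
        sink.flush()
    return total


# In-memory stand-ins for one image and two destination machines.
image = io.BytesIO(b"\x00" * 3_000_000)
dest_a, dest_b = io.BytesIO(), io.BytesIO()
copied = fan_out(image, [dest_a, dest_b])
```

Note that the slowest destination sets the pace here, since each chunk
is written to every sink before the next one is read.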

From what you have stated, your problem looks more like a data transfer
problem, yet you're asking in groups with more of a data storage slant
than a data transfer one.


> At first I thought that maybe I could NFS mount two NFS resources to
> the same mount point on the DAQ computer with some write-only option
> but I could not find any docs on this scenario.


Note that NFS implies that the data will be stored somewhere, where it
can then presumably be picked up again. If that is acceptable, and you
do want to use NFS, you're thinking the wrong way around: have the data
processors mount an NFS export from the DAQ instead, and pick their
data up from there. NFS simply doesn't work the other way around.
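
A rough sketch of the "pick up" side, assuming each processing node has
the DAQ's export mounted as a local directory and simply polls it for
files it has not seen yet. The function name find_new_images and the
".img" suffix are hypothetical, and the sketch assumes the DAQ only
gives a file its final name once it is completely written.

```python
import os


def find_new_images(directory, seen, suffix=".img"):
    """Return names in directory with the given suffix that are not yet
    in seen, in sorted order, and record them in seen."""
    current = {name for name in os.listdir(directory)
               if name.endswith(suffix)}
    fresh = sorted(current - seen)
    seen.update(fresh)
    return fresh
```

A processing node would call this in a loop with a short sleep, keeping
the same seen set between calls and handing each new name to whatever
does the actual image processing.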


--
j p d (at) d s b (dot) t u d e l f t (dot) n l .
This message was originally posted on Usenet in plain text.
Any other representation, additions, or changes do not have my
consent and may be a violation of international copyright law.

Faeandar

2006-06-21, 1:13 am

On 20 Jun 2006 05:39:14 -0700, epanepucci@gmail.com wrote:

>Hello,
>
>We have a computer (DAQ) that will generate 300GB/hour and we will
>process the data (images of 18 - 30MB) using a cluster.
>
>The data processing *must* occur at the same time as the "data
>acquisition" so we do not want to use the same network or file-server
>in order not to slow down either of the processes (data acquisition /
>data processing).
>
>My question: Is there any technology (either hardware or software
>based) that will allow the DAQ computer to simultaneously write the
>data to 2 or more different computers? Possibly using many network
>cards, special switches...
>
>At first I thought that maybe I could NFS mount two NFS resources to
>the same mount point on the DAQ computer with some write-only option
>but I could not find any docs on this scenario.
>
>Your comments are very welcome.
>
>Cheers,
> E.Panepucci
> Swiss Light Source


Sounds to me, just in passing, that a cluster file system would suit
your need. It's not exactly writing to two machines, but multiple file
servers can access the same data simultaneously and serve it out
somewhere else.

Maybe take a look at Polyserve, Isilon, or Panasas.

~F

Bill Todd

2006-06-21, 1:13 am

epanepucci@gmail.com wrote:
> Hello,
>
> We have a computer (DAQ) that will generate 300GB/hour and we will
> process the data (images of 18 - 30MB) using a cluster.
>
> The data processing *must* occur at the same time as the "data
> acquisition" so we do not want to use the same network or file-server
> in order not to slow down either of the processes (data acquisition /
> data processing).
>
> My question: Is there any technology (either hardware or software
> based) that will allow the DAQ computer to simultaneously write the
> data to 2 or more different computers?


Sure - but what good would it do?

In order to process the data, you've got to write at least one copy of
it to the hardware that will do the processing, so that hardware must
be capable of both accepting the incoming data and processing it. Once
you can do that, why do you need another computer? And if you can't do
that, it doesn't seem you can accomplish what you want to.

However, that's if you look at it as competing processes. If you
instead look at it from the storage level, all you need is two things.

1) Storage capable of handling the combined bandwidth of acquisition
and processing. This shouldn't present much of a challenge: a single
disk comes very close to providing enough streaming bandwidth for
acquisition, so two or more in a stripe group should provide sufficient
streaming bandwidth for acquisition plus as much processing as you care
to configure by extending the size of the stripe group.

2) A cluster that uses shared direct access to that storage (such as is
supported by shared-disk file systems like SANergy et al.), such that
one or more members can do the acquisition and the rest can do the
processing (perhaps at slightly lower disk-access priority, so that
acquisition is guaranteed), all using the same single copy of the data
(or the same redundant data if you protect it with mirroring or parity).

If you're hesitant to step up to shared-disk technology, you could do it
with a NAS box with the same kind of striped-storage array. It
shouldn't be too difficult to configure a NAS box that will handle
several hundred MB/sec (a bit under 100 MB/sec for acquisition plus
whatever you need for processing), and this nicely segregates the
acquisition machine (which may be important if acquisition consumes a
lot of CPU) from the processing cluster. Again, you might want to give
acquisition higher priority at the NAS box than processing to ensure
you don't lose any incoming data.
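
The "bit under 100 MB/sec" figure follows directly from the numbers in
the original post. A quick check, assuming decimal units (1 GB = 10^9
bytes, 1 MB = 10^6 bytes) and a perfectly steady acquisition rate:

```python
GB = 10**9  # decimal gigabyte, an assumption of this sketch
MB = 10**6  # decimal megabyte
SECONDS_PER_HOUR = 3600


def rate_mb_per_s(gb_per_hour):
    """Convert a sustained rate in GB/hour to MB/sec."""
    return gb_per_hour * GB / SECONDS_PER_HOUR / MB


acquisition = rate_mb_per_s(300)        # about 83.3 MB/sec
images_per_s_min = acquisition / 30     # 30 MB images: about 2.8 per second
images_per_s_max = acquisition / 18     # 18 MB images: about 4.6 per second

print(f"acquisition: {acquisition:.1f} MB/sec")
print(f"images: {images_per_s_min:.1f} to {images_per_s_max:.1f} per second")
```

So the storage has to absorb roughly 83 MB/sec of writes continuously,
with the read load from the processing cluster on top of that.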

- bill