Unix Programming - select( ) and its employment


jimjim

2005-06-12, 5:51 pm

Hello,

During an interview I attended, I was asked how I would implement
a system that distributes stock market data. My answer was that I
should:

- maintain a table (perhaps a hashtable) that maps each stock to the
addresses of the clients that registered interest in that stock
- spawn a number of long-lived worker threads (definitely more than the
number of processors in the server, so as to keep the processors busy even
when some threads block)
- queue the incoming stock market data updates in a mutex-protected linked
list
- extract an update from the head of the queue and distribute it, with the
aid of the threads, to the clients registered for the particular stock
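
A minimal sketch of the design listed above, under some loud assumptions: every name here (subscriptions, updates, deliver, worker, run_demo) is hypothetical, Python's queue.Queue stands in for the hand-rolled mutex-protected linked list, and deliver( ) only records what would be sent instead of doing a real blocking send. As a simplification, each worker handles one whole update rather than sharing the fan-out of a single update.

```python
import queue
import threading
from collections import defaultdict

# All names are illustrative, not taken from any real system.
subscriptions = defaultdict(set)   # stock symbol -> set of client addresses
updates = queue.Queue()            # thread-safe FIFO; its internal lock plays the mutex role
delivered = []                     # stand-in for the network: (client, symbol, price) records
delivered_lock = threading.Lock()


def deliver(client, symbol, price):
    """Stand-in for a (possibly blocking) send to one client."""
    with delivered_lock:
        delivered.append((client, symbol, price))


def worker():
    """Long-lived worker: take one update and send it to every registered client."""
    while True:
        item = updates.get()
        if item is None:           # sentinel: shut this worker down
            break
        symbol, price = item
        for client in subscriptions.get(symbol, ()):
            deliver(client, symbol, price)


def run_demo(num_workers=4):
    """Register two clients, push two updates, and return what was delivered."""
    delivered.clear()
    subscriptions["ACME"].update({("10.0.0.1", 9000), ("10.0.0.2", 9000)})
    subscriptions["INIT"].add(("10.0.0.1", 9000))
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    updates.put(("ACME", 101.5))
    updates.put(("INIT", 7.25))
    for _ in threads:
        updates.put(None)          # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(delivered)
```

With a real blocking send inside deliver( ), one slow client stalls the worker that is serving it, which is why the thread count has to exceed the processor count in this design.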

However, I've been told that the most efficient way to implement
such a system is with the aid of select( ), in order to combat the problem of
synchronous communication.

1. In production systems that aim to distribute stock market data, what
transport protocol is used (TCP,UDP) and why? (I would answer UDP (perhaps
build reliability if needed (?) on top) as stock updates are not frequent
and therefore there is no need to establish and maintain connections).
2. Why is the communication between the server and the clients synchronous?
(Should I assume it is because reliability is a requirement, so that after
write( )'ing a stock update to the socket you need to block and read( ) for
a reply? Or, if TCP is the protocol actually in use, is it because of
flow control?)
3. Is reliability a requirement?
4. Given that incoming stock market data updates are queued in a
mutex-protected linked list (?), should I consider having an "epoch" in which
I make sure to send an update to all clients, or is it more efficient to
somehow multiplex different updates? And how can I design my system to
cope with the situation where a new update for a particular stock has arrived,
so that I can stop sending the older update for that stock to clients?

Responsible answers, please. Thanks in advance.


David Schwartz

2005-06-13, 2:48 am


"jimjim" <netuser@blueyonder.co.uk> wrote in message
news:7N2re.49564$G8.20627@text.news.blueyonder.co.uk...

> 1. In production systems that aim to distribute stock market data, what
> transport protocol is used (TCP,UDP) and why? (I would answer UDP (perhaps
> build reliability if needed (?) on top) as stock updates are not frequent
> and therefore there is no need to establish and maintain connections).


Either TCP or UDP. With UDP, you need a layer on top of it to make it
reliable.

> 2. Why is the communication between the server and the clients
> synchronous? (Should I assume it is because reliability is a requirement,
> so that after write( )'ing a stock update to the socket you need to block
> and read( ) for a reply? Or, if TCP is the protocol actually in use, is
> it because of flow control?)


You have not made it clear what you mean by "synchronous".

> 3. Is reliability a requirement?


You are asking *us* what *your* requirements are?!

> 4. Given that incoming stock market data updates are queued in a
> mutex-protected linked list (?), should I consider having an "epoch" in
> which I make sure to send an update to all clients, or is it more
> efficient to somehow multiplex different updates? And how can I design my
> system to cope with the situation where a new update for a particular
> stock has arrived, so that I can stop sending the older update for that
> stock to clients?


Just keep a list of what updates each client needs. The update should
not contain the stock price, but the stock name. That way, the update will
automatically send the latest price. This assumes that history information
is not important. It well might be.
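
One way to read the suggestion above, as a rough sketch only: keep the latest price per stock in one table, and per client keep just the set of stock names that are still owed. The price is looked up at send time, so an older price is never sent once a newer one has arrived. The names latest_price, pending, on_update and drain are made up for illustration, and history is deliberately discarded.

```python
latest_price = {}   # stock name -> most recent price seen
pending = {}        # client id -> set of stock names the client still needs


def on_update(name, price, subscribers):
    """Record the new price and mark the stock as pending for each subscriber."""
    latest_price[name] = price
    for client in subscribers:
        pending.setdefault(client, set()).add(name)


def drain(client):
    """Return the (name, price) pairs to send now, reading prices at send time."""
    names = pending.pop(client, set())
    return sorted((name, latest_price[name]) for name in names)
```

Two updates for the same stock that arrive before a client is serviced collapse into one entry, so drain( ) yields only the newest price for that client.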

DS


jimjim

2005-06-14, 7:53 am

> Either TCP or UDP. With UDP, you need a layer on top of it to make it
> reliable.
>

What are the criteria for choosing one of these in the context of the
problem I have described in my original post? (Bear in mind that I already
know the relative differences between the two protocols, so a plain outline of
the differences won't help.)

> You have not made it clear what you mean by "synchronous".
>

Well, this is what I was told. So, is there somebody who can interpret the
use of this term in the context of the problem I have described in my
original post?

> You are asking *us* what *your* requirements are?!
>

In the context of the problem I have described in my original post, is
reliability a requirement?

> Just keep a list of what updates each client needs. The update should
> not contain the stock price, but the stock name. That way, the update will
> automatically send the latest price. This assumes that history information
> is not important. It well might be.
>

I think you disregard the essence of timeliness of delivery of a particular
update to all interested clients. This would have been apparent if you knew
about the problem I have described in my original post.

I hope someone who has worked on something similar will be able to help.

TIA


David Schwartz

2005-06-14, 5:57 pm


"jimjim" <netuser@blueyonder.co.uk> wrote in message
news:zqxre.50207$G8.20907@text.news.blueyonder.co.uk...

> What are the criteria for choosing one of these in the context of the
> problem I have described in my original post? (Bear in mind that I already
> know the relative differences between the two protocols, so a plain outline
> of the differences won't help.)


If you need to operate in environments where UDP communication may not
work (for example, because of firewalls and/or proxies), you should at least
support TCP as an option. If you don't have the inclination or
sophistication to layer a proper protocol on top of UDP, choose TCP. If you
need all the features TCP provides (congestion control, duplicate detection,
retransmission, and so on), use TCP.

> Well, this is what I was told. So, is there somebody who can interpret the
> use of this term in the context of the problem I have described in my
> original post?


The word could mean almost anything.

> In the context of the problem I have described in my original post, is
> reliability a requirement?


This is another vague question. Do you mean "reliability" in a technical
sense (like the way TCP is reliable), or do you mean it has to work as
designed?

> I think you disregard the essence of timeliness of delivery of a
> particular update to all interested clients. This would have been apparent
> if you knew about the problem I have described in my original post.
>
> I hope someone who has worked on something similar will be able to help.


Your description is *so* vague, nobody could tell if their application
is similar or not. I worked on a problem that I think is quite similar.

DS


jimjim

2005-06-14, 5:57 pm

Hello David,

> If you need to operate in environments where UDP communication may not
> work (for example, because of firewalls and/or proxies)
>

Yeah, didn't think of it! UDP may be filtered by firewalls. That's definitely
one selection criterion. Thanks.


What I am asking for here is reverse engineering. I am trying to figure
out how the company has gone about implementing the system in question.

As I wrote in my original post, it is given (I was told during my interview)
that the system employs select( ) in order to avoid blocking due to the
synchronous communication. My assessment is that TCP is used as the
transport protocol; because TCP is flow controlled, a write( ) to the
socket can block (which is what I believe they may call "synchronous
communication"). So, select( ) is used in order to address the problem of
"synchronous communication". Or, if UDP is actually used, we may be blocking
on the read( ) used for implementing reliability on top of UDP, and therefore
select( ) has to be used.
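
To make the TCP case above concrete, here is a hedged sketch of how select( ) lets one thread write to many flow-controlled stream sockets without blocking on a slow one. It is not a claim about the company's actual system. The helper name pump, the per-socket bytearray buffers, and the use of socket.socketpair( ) in place of real TCP connections are all assumptions made for illustration.

```python
import select
import socket


def pump(outgoing, timeout=1.0):
    """Send queued bytes to many sockets without blocking on any one of them.

    outgoing maps a non-blocking socket to a bytearray of unsent data.
    """
    while any(outgoing.values()):
        want_write = [s for s, buf in outgoing.items() if buf]
        _, writable, _ = select.select([], want_write, [], timeout)
        if not writable:
            break                      # nobody ready within the timeout; stop for now
        for s in writable:
            buf = outgoing[s]
            try:
                sent = s.send(buf)     # may accept only part of the buffer
            except BlockingIOError:
                continue               # buffer filled up again; retry after next select
            del buf[:sent]             # keep only the unsent tail


def demo():
    """Fan one update out to three stand-in clients and return what each received."""
    pairs = [socket.socketpair() for _ in range(3)]
    outgoing = {}
    for server_end, _ in pairs:
        server_end.setblocking(False)
        outgoing[server_end] = bytearray(b"ACME 101.5\n")
    pump(outgoing)
    received = [client_end.recv(1024) for _, client_end in pairs]
    for server_end, client_end in pairs:
        server_end.close()
        client_end.close()
    return received
```

A client that stops reading simply never shows up as writable, so the loop keeps serving the others instead of stalling inside write( ).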

So David, I am trying to reverse engineer their system on the basis of the
little info I have. I posted this question in the hope of receiving the
insight of someone who has either worked on something similar or knows what
I am asking.

TIA



David Schwartz

2005-06-15, 6:10 pm


"jimjim" <netuser@blueyonder.co.uk> wrote in message
news:7YFre.50438$G8.38894@text.news.blueyonder.co.uk...

> Hello David,


> Yeah, didn't think of it! UDP may be filtered by firewalls. That's
> definitely one selection criterion. Thanks.


If you use/prefer UDP, fallback to TCP is often required.

> As I wrote in my original post, it is given (I was told during my interview)
> that the system employs select( ) in order to avoid blocking due to the
> synchronous communication. My assessment is that TCP is used as the
> transport protocol; because TCP is flow controlled, a write( ) to the
> socket can block (which is what I believe they may call "synchronous
> communication"). So, select( ) is used in order to address the problem of
> "synchronous communication". Or, if UDP is actually used, we may be blocking
> on the read( ) used for implementing reliability on top of UDP, and therefore
> select( ) has to be used.


Either way, 'select' could well be used to tell when data has been
received.
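
For the UDP case, a rough sketch of using select( ) to tell when an acknowledgement has been received, with a retransmit on timeout. The wire format (a 4-byte sequence number in front of the payload, echoed back as the ACK) and the names send_reliable and ack_once are invented for this example. A real reliability layer would need much more, such as windowing and duplicate handling.

```python
import select
import socket
import struct
import threading


def send_reliable(sock, addr, seq, payload, retries=3, timeout=0.2):
    """Send one datagram and wait for a matching ACK, retransmitting on timeout."""
    packet = struct.pack("!I", seq) + payload
    for _ in range(retries):
        sock.sendto(packet, addr)
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            data, _ = sock.recvfrom(2048)
            if len(data) == 4 and struct.unpack("!I", data)[0] == seq:
                return True            # ACK carries our sequence number
    return False                       # gave up after the configured retries


def ack_once(sock):
    """Hypothetical client: receive one datagram and echo its sequence number back."""
    data, sender = sock.recvfrom(2048)
    sock.sendto(data[:4], sender)
    return data[4:]


def demo():
    """Run one send/ACK exchange over loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    client.bind(("127.0.0.1", 0))
    client.settimeout(2.0)
    got = []
    t = threading.Thread(target=lambda: got.append(ack_once(client)))
    t.start()
    ok = send_reliable(server, client.getsockname(), 7, b"ACME 101.5")
    t.join()
    server.close()
    client.close()
    return ok, got
```

Because select( ) takes a list of sockets, the same call could watch many clients at once instead of blocking in read( ) on one of them.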

> So David, I am trying to reverse engineer their system on the basis of the
> little info I have. I posted this question in the hope of receiving the
> insight of someone who has either worked on something similar or knows what
> I am asking.


Another reason UDP may be preferred is that it's easy to sustain 20,000
or greater "connections" with UDP because there really aren't connections.
With TCP, sustaining tens of thousands of connections requires juggling tens
of thousands of sockets.
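
A small illustration of the point above, as a sketch under assumptions: with UDP, a "connection" is just an address in a table, so a single socket can serve every client. The function name fan_out and the loopback demo are hypothetical. The demo uses three receivers, but the sender's side would look the same for 20,000 addresses.

```python
import socket


def fan_out(sock, clients, payload):
    """Send the same datagram to every client address through one socket."""
    for addr in clients:
        sock.sendto(payload, addr)
    return len(clients)


def demo(n=3):
    """Create n loopback receivers, fan one update out, return what each got."""
    receivers = []
    for _ in range(n):
        r = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        r.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
        r.settimeout(2.0)
        receivers.append(r)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    clients = [r.getsockname() for r in receivers]
    fan_out(sender, clients, b"ACME 101.5")
    got = [r.recvfrom(2048)[0] for r in receivers]
    for s in receivers + [sender]:
        s.close()
    return got
```

The TCP equivalent needs one socket per client, all of which have to be tracked and passed to select( ).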

DS


Copyright 2003 - 2008 webservertalk.com