Unix Programming - threads


puzzlecracker

2005-01-20, 5:58 pm

I have a few questions that I am confused about.

What is the point of threads? I am asking because they execute
in order, sequentially. Isn't that potential overhead compared to
running just one process? Could you give me a good example where
threads would be a good fit?

Thanks.

David Schwartz

2005-01-20, 5:58 pm


"puzzlecracker" <ironsel2000@gmail.com> wrote in message
news:1106256889.990501.24530@f14g2000cwb.googlegroups.com...

> What is the point of threads? I am asking because they execute
> in order, sequentially. Isn't that potential overhead compared to
> running just one process? Could you give me a good example where
> threads would be a good fit?


Suppose you have two processors and have a lot of computation to do.
Suppose you don't want an entire server to freeze just because of a page
fault.

DS


Fletcher Glenn

2005-01-20, 5:58 pm


"David Schwartz" <davids@webmaster.com> wrote in message
news:cspce6$a5n$1@nntp.webmaster.com...
>
> "puzzlecracker" <ironsel2000@gmail.com> wrote in message
> news:1106256889.990501.24530@f14g2000cwb.googlegroups.com...
>
>
> Suppose you have two processors and have a lot of computation to do.
> Suppose you don't want an entire server to freeze just because of a page
> fault.
>
> DS
>
>


You know, all of these questions smell like homework for a computer science
class. Most of us here know the answers because we did our homework
ourselves.

--

Fletcher Glenn


zentara

2005-01-21, 7:48 am

On 20 Jan 2005 13:34:50 -0800, "puzzlecracker" <ironsel2000@gmail.com>
wrote:

>I have a few questions that I am confused about.
>
>What is the point of threads? I am asking because they execute
>in order, sequentially. Isn't that potential overhead compared to
>running just one process? Could you give me a good example where
>threads would be a good fit?
>
>Thanks.


I use Perl mainly, and one of the uses for threads is as an easier way
to share data, rather than using IPC, in cases where you need to keep
multiple loops running concurrently. Like in GUI programming, where
you might want to download something, without locking up your GUI.

In IPC, you need to fork-and-exec, then share data through pipes or
sockets, which adds some complexity.

With threads, you just set up "shared variables", start a thread with
some code to run, and read the shared variable.
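
A minimal sketch of the shared-variable idea, written in Python rather than Perl purely for illustration. The names `download` and `run_ui_loop` are hypothetical, the sleep calls stand in for network waits and GUI redraws, and the lock around the shared dictionary is an assumption about how one would normally guard such data.

```python
import threading
import time


def download(shared, lock):
    """Stand-in for a slow download; publishes progress via a shared dict."""
    for pct in (25, 50, 75, 100):
        time.sleep(0.01)              # pretend to wait on the network
        with lock:
            shared["progress"] = pct
    with lock:
        shared["done"] = True


def run_ui_loop():
    """Keep a 'GUI' loop ticking while the download runs in another thread."""
    shared = {"progress": 0, "done": False}
    lock = threading.Lock()
    worker = threading.Thread(target=download, args=(shared, lock))
    worker.start()

    ticks = 0
    while True:
        with lock:                    # read the shared variable
            done = shared["done"]
        if done:
            break
        ticks += 1                    # the main loop is never blocked
        time.sleep(0.001)

    worker.join()
    return shared["progress"], ticks


if __name__ == "__main__":
    progress, ticks = run_ui_loop()
    print(f"download reached {progress}% while the loop ticked {ticks} times")
```

No pipes or sockets are needed here because both threads see the same dictionary; the price is that every access has to go through the lock.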

Of course, threads have a bit more overhead than "fork-and-exec",
but it's a design choice on your part.

Fork-and-exec is more efficient, is a standard method, but threads
have a place in certain kinds of programs.






--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html

Måns Rullgård

2005-01-21, 7:48 am

zentara <zentara@highstream.net> writes:

> Of course, threads have a bit more overhead than "fork-and-exec",
> but it's a design choice on your part.
>
> Fork-and-exec is more efficient, is a standard method, but threads
> have a place in certain kinds of programs.


Huh? Creating a thread involves basically the same work as forking.
While running, switching between threads is potentially much faster
than switching between processes, since the page tables remain
unchanged. This also has the advantage that cache and TLB data are
valid across thread switches, which can be a great advantage.

--
Måns Rullgård
mru@inprovide.com

David Schwartz

2005-01-21, 5:52 pm


"Måns Rullgård" <mru@inprovide.com> wrote in message
news:yw1xacr3q9qw.fsf@ford.inprovide.com...

> Huh? Creating a thread involves basically the same work as forking.
> While running, switching between threads is potentially much faster
> than switching between processes, since the page tables remain
> unchanged. This also has the advantage that cache and TLB data are
> valid across thread switches, which can be a great advantage.


I think this advantage is overblown. If you're switching contexts so
often that this advantage makes a difference, your performance is in the
toilet anyway. Your threads should be using up their full timeslices so long
as there is sufficient work to do, so there shouldn't be very many thread
context switches.

DS


Måns Rullgård

2005-01-21, 5:52 pm

"David Schwartz" <davids@webmaster.com> writes:

> "Måns Rullgård" <mru@inprovide.com> wrote in message
> news:yw1xacr3q9qw.fsf@ford.inprovide.com...
>
>
> I think this advantage is overblown. If you're switching
> contexts so often that this advantage makes a difference, your
> performance is in the toilet anyway. Your threads should be using up
> their full timeslices so long as there is sufficient work to do, so
> there shouldn't be very many thread context switches.


My point was that the only difference between threads and processes is
to the threads' advantage, so I find it strange to say that threads
impose a greater overhead.

--
Måns Rullgård
mru@inprovide.com

David Schwartz

2005-01-21, 5:52 pm


"Måns Rullgård" <mru@inprovide.com> wrote in message
news:yw1x651qr1fq.fsf@ford.inprovide.com...

> My point was that the only difference between threads and processes is
> to the threads' advantage, so I find it strange to say that threads
> impose a greater overhead.


They do though. Consider, for example, two servers. One using multiple
processes and one using multiple threads. They both have a lot of work to do
and are using up their full timeslices. In the one using multiple
threads, memory allocations require coordination, acquiring and releasing
mutexes with at least some serialization. In the one using processes, they
do not. The same goes for all sorts of internal details in libraries,
contention for the same file descriptor table, and so on. If this overhead
is not made up for by some other benefit of threads, then multiple processes
are superior.
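
A rough sketch of that contrast, in Python for brevity. The function names `bump_in_threads` and `bump_in_child` are invented for this illustration. Threads updating one counter must serialize on a lock, while a forked child works on its own private copy, so it needs no lock but its updates never reach the parent.

```python
import os
import threading


def bump_in_threads(n_threads=4, n_iter=10000):
    """All threads update one counter, so each update takes the lock."""
    counter = {"value": 0}
    lock = threading.Lock()

    def work():
        for _ in range(n_iter):
            with lock:                # the coordination cost
                counter["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


def bump_in_child(n_iter=10000):
    """A forked child updates its own copy; no lock, but nothing is shared."""
    counter = {"value": 0}
    pid = os.fork()
    if pid == 0:
        try:
            for _ in range(n_iter):
                counter["value"] += 1  # private to the child
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    return counter["value"]           # the parent's copy is untouched


if __name__ == "__main__":
    print("threads:", bump_in_threads())
    print("parent after child ran:", bump_in_child())
```
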

DS


Måns Rullgård

2005-01-21, 5:52 pm

"David Schwartz" <davids@webmaster.com> writes:

> "Måns Rullgård" <mru@inprovide.com> wrote in message
> news:yw1x651qr1fq.fsf@ford.inprovide.com...
>
>
> They do though. Consider, for example, two servers. One using multiple
> processes and one using multiple threads. They both have a lot of work to do
> and are using up their full timeslices. In the one using multiple
> threads, memory allocations require coordination, acquiring and releasing
> mutexes with at least some serialization. In the one using processes, they
> do not. The same goes for all sorts of internal details in libraries,
> contention for the same file descriptor table, and so on. If this overhead
> is not made up for by some other benefit of threads, then multiple processes
> are superior.


Fair enough. If the work performed by the different entities is
mostly independent, processes may be the better choice, since it also
limits the damage, should one of them decide to go crazy. On the
other hand, if a lot of (dynamic) data needs to be shared, threads may
be more efficient.

--
Måns Rullgård
mru@inprovide.com

David Schwartz

2005-01-21, 8:47 pm


"Måns Rullgård" <mru@inprovide.com> wrote in message
news:yw1x1xceqwhn.fsf@ford.inprovide.com...

> Fair enough. If the work performed by the different entities is
> mostly independent, processes may be the better choice, since it also
> limits the damage, should one of them decide to go crazy. On the
> other hand, if a lot of (dynamic) data needs to be shared, threads may
> be more efficient.


Yep. There are some jobs that are naturally suited for threads and some
are naturally suited for processes. One key issue is how much shared state
you have. The more there is, the better threads start to look.

It's too bad it's not really practical to use pools of processes the way
you use pools of threads. That would be another interesting option that
would give you some of the benefits of both and few of the disadvantages of
either. You'd need a library that simplified things like cross-process
synchronization, objects in shared memory, and file descriptor exchange.
Thesis project anyone?

DS


zentara

2005-01-22, 7:47 am

On Fri, 21 Jan 2005 17:08:25 -0800, "David Schwartz"
<davids@webmaster.com> wrote:

> It's too bad it's not really practical to use pools of processes the way
>you use pools of threads. That would be another interesting option that
>would give you some of the benefits of both and few of the disadvantages of
>either. You'd need a library that simplified things like cross-process
>synchronization, objects in shared memory, and file descriptor exchange.
>Thesis project anyone?


Perl has Parallel::ForkManager.

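#!/usr/bin/perl
use warnings;
use strict;
use Parallel::ForkManager;

my @collections = qw(col1 col2 col3 col4 col5);
my $max_tasks   = 3;    # at most this many children run at once
my $pm = Parallel::ForkManager->new($max_tasks);
$| = 1;                 # unbuffered output
my $start = time();

for my $collection (@collections) {
    # start() forks: it returns the child's pid in the parent (which
    # moves on to the next item) and 0 in the child.
    $pm->start and next;
    printf "Begin processing %s at %d secs.....\n", $collection, time() - $start;
    sleep rand(10) + 2;
    printf ".... %s done at %d secs!\n", $collection, time() - $start;
    $pm->finish;        # the child exits here
}
$pm->wait_all_children; # the parent waits for the remaining children

__END__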



--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html

Måns Rullgård

2005-01-22, 7:47 am

zentara <zentara@highstream.net> writes:

> On Fri, 21 Jan 2005 17:08:25 -0800, "David Schwartz"
> <davids@webmaster.com> wrote:
>
>
> Perl has Parallel::ForkManager.


Doesn't seem to be in a standard installation.

> #!/usr/bin/perl
> use warnings;
> use strict;
> use Parallel::ForkManager;
>
> my @collections = qw (col1 col2 col3 col4 col5);
> my $max_tasks = 3;
> my $pm = new Parallel::ForkManager($max_tasks);
> $|++;
> my $start = time();
>
> for my $collection (@collections) {
> my $pid = $pm->start and next;
> printf "Begin processing $collection at %d secs.....\n", time() -
> $start;
> sleep rand(10) + 2;
> printf ".... $collection done at %d secs!\n", time() - $start;
> $pm->finish;
> }


I haven't checked, but I'm pretty certain that thing does a fork each
time you call it. It's rather simple to keep a counter of how many
processes have been forked, and wait() for some to finish before
starting more when the limit has been reached.
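
A minimal sketch of that counter-and-wait() scheme, in Python. The names `run_limited` and `reap_one` are invented here, and each "job" is just a number of seconds to sleep, which is an assumption made to keep the example short.

```python
import os
import time


def reap_one(children):
    """Block until one of our own children exits, then forget its pid."""
    while True:
        pid, _status = os.wait()
        if pid in children:
            children.remove(pid)
            return pid


def run_limited(jobs, max_children=3):
    """Fork one child per job, never running more than max_children at once."""
    children = set()
    peak = finished = 0
    for seconds in jobs:
        if len(children) >= max_children:
            reap_one(children)        # limit reached: wait for a free slot
            finished += 1
        pid = os.fork()
        if pid == 0:
            try:
                time.sleep(seconds)   # the child's "work"
            finally:
                os._exit(0)
        children.add(pid)
        peak = max(peak, len(children))
    while children:                   # collect the stragglers
        reap_one(children)
        finished += 1
    return peak, finished


if __name__ == "__main__":
    print(run_limited([0.05] * 5, max_children=3))
```

Note that this still forks once per job; nothing is reused.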

I think David was wishing for an easy to use library that pre-forked a
number of processes, and reused the same processes, much like Apache
httpd does.
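
For comparison, a small sketch of the pre-forked style: a fixed set of workers is forked once and each worker handles many jobs pulled from a shared queue. This uses Python's multiprocessing module with the "fork" start method as a stand-in; it is not the library wished for above, and `worker` and `run_pool` are hypothetical names.

```python
import multiprocessing as mp
import os


def worker(jobs, results):
    """Loop over jobs until the None sentinel arrives; one process, many jobs."""
    for n in iter(jobs.get, None):
        results.put((os.getpid(), n, n * n))


def run_pool(numbers, n_workers=3):
    ctx = mp.get_context("fork")      # Unix only: children are plain forks
    jobs, results = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(jobs, results))
             for _ in range(n_workers)]
    for p in procs:                   # pre-fork the whole pool up front
        p.start()
    for n in numbers:
        jobs.put(n)
    for _ in procs:                   # one sentinel per worker
        jobs.put(None)
    out = [results.get() for _ in numbers]
    for p in procs:
        p.join()
    return out


if __name__ == "__main__":
    for pid, n, square in run_pool(list(range(10))):
        print(f"worker {pid}: {n}^2 = {square}")
```

The printed pids repeat across jobs, which is the reuse that the fork-per-job approach lacks.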

--
Måns Rullgård
mru@inprovide.com

Rich Teer

2005-01-22, 8:48 pm

On Fri, 21 Jan 2005, zentara wrote:

> Of course, threads have a bit more overhead than "fork-and-exec",
> but it's a design choice on your part.
>
> Fork-and-exec is more efficient, is a standard method, but threads
> have a place in certain kinds of programs.


Methinks you've been reading ESR's "The Art of UNIX Programming"
too much...

--
Rich Teer, SCNA, SCSA, author of "Solaris Systems Programming"

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich

zentara

2005-01-23, 7:47 am

On Sun, 23 Jan 2005 00:35:24 GMT, Rich Teer <rich.teer@rite-group.com>
wrote:

>On Fri, 21 Jan 2005, zentara wrote:
>
>
>Methinks you've been reading ESR's "The Art of UNIX Programming"
>too much...


Nope, (probably need to though :-) ) It's from learning to run threads
with Perl, which parallels the problems which C faces.

I see Perl as an easy-to-use front end to C. I'm not trying to draw
flames. :-)



--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html

zentara

2005-01-23, 7:47 am

On Sat, 22 Jan 2005 11:33:12 +0100, Måns Rullgård <mru@inprovide.com>
wrote:

>zentara <zentara@highstream.net> writes:
>
>
>Doesn't seem to be in a standard installation.


>I haven't checked, but I'm pretty certain that thing does a fork each
>time you call it. It's rather simple to keep a counter of how many
>processes have been forked, and wait() for some to finish before
>starting more when the limit has been reached.
>
>I think David was wishing for an easy to use library that pre-forked a
>number of processes, and reused the same processes, much like Apache
>httpd does.


Yeah, you are right. But Perl to the rescue. :-)

I would use IPC::Open3 to open a set of stdin, stdout and stderr
pipes to 'bash' or whatever shell you want, then just send
commands to it to run. The forked processes would be reusable
and under the control of the parent process (i.e. kill $pid).
This is just a simple example. (I've combined stderr and stdout
to avoid needing to use select. I could have forked more than one,
and I could change the command sent to the bash interpreter
on each refresh.)

I know you fellows prefer to discuss C, but Perl does make a lot
of difficult things easy.

#!/usr/bin/perl
use warnings;
use strict;
use IPC::Open3;
use Tk;

$| = 1;

# Start one long-lived bash child. A false third argument makes the
# child's stderr share the stdout handle (OUT).
my $pid = open3(\*IN, \*OUT, 0, '/bin/bash');

my $mw = MainWindow->new;
$mw->geometry("600x400");

my $t = $mw->Scrolled('Text',
    -width  => 80,
    -height => 80,
)->pack;

refresh();

# Let the Tk event loop call write_t whenever the shell has output.
$mw->fileevent(\*OUT, 'readable', \&write_t);
# Re-send the command every 2000 ms.
my $id = Tk::After->new($mw, 2000, 'repeat', \&refresh);

MainLoop;

sub refresh {
    print IN "top b n 1";
    print IN "\n";    # absolutely needed and on separate line
}

sub write_t {
    my $str = <OUT>;
    $t->insert("1.0", $str);
    # $t->see("0.0");
}
__END__





--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html

Måns Rullgård

2005-01-23, 7:47 am

zentara <zentara@highstream.net> writes:

> On Sat, 22 Jan 2005 11:33:12 +0100, Måns Rullgård <mru@inprovide.com>
> wrote:
>
>
>
> Yeah, you are right. But Perl to the rescue. :-)
>
> I would use IPC::Open3 to open a set of stdin,stdout,stderr
> pipes to 'bash' or whatever shell you want, then just send
> commands to it to run. The fork'd processes would be reusable
> and under the control of the parent process( i.e. kill $pid ).


It's the nitty-gritty details of doing the forking and signaling that
could be put away into a library presenting a nicer interface. You'd
simply ask for a handle of some kind to a process, and the library
would take one from the pool, either waiting or creating a new process
if the pool is empty. The library could then provide facilities to
simplify sending commands to the other process. I'm not aware of any
Perl module implementing this kind of functionality, but then I
haven't checked them all.

> I know you fellows prefer to discuss C, but Perl does make a lot
> of difficult things easy.


I use Perl a bit, and it certainly simplifies many things. Some tasks
are still best done in C, though.

--
Måns Rullgård
mru@inprovide.com

David Schwartz

2005-01-23, 5:50 pm


"zentara" <zentara@highstream.net> wrote in message
news:6f57v0tpamvnnbfqspumk1317nh008757o@4ax.com...


Yes, along with easy ways to pass file descriptors back and forth and
create objects in memory shared by the processes. You'd also need a way to
lock those objects.
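
As one illustration of the file descriptor part, here is a hedged sketch of handing an open descriptor to another process over a Unix domain socket. It relies on `socket.send_fds` and `socket.recv_fds`, which assumes Python 3.9 or later on a Unix system; the function name `pass_fd_to_child` is made up for this example.

```python
import os
import socket


def pass_fd_to_child(message=b"hello from the child\n"):
    """Send the write end of a pipe to a forked child, which writes to it."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Drop the inherited copies so only the passed descriptor is used.
            parent_sock.close()
            os.close(read_end)
            os.close(write_end)
            _data, fds, _flags, _addr = socket.recv_fds(child_sock, 1024, 1)
            os.write(fds[0], message)
        finally:
            os._exit(0)

    child_sock.close()
    socket.send_fds(parent_sock, [b"fd"], [write_end])  # SCM_RIGHTS underneath
    os.close(write_end)
    os.waitpid(pid, 0)
    data = os.read(read_end, 1024)    # what the child wrote via the passed fd
    os.close(read_end)
    parent_sock.close()
    return data


if __name__ == "__main__":
    print(pass_fd_to_child())
```

Shared-memory objects and cross-process locks would need separate machinery; this covers only the descriptor exchange.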
> I would use IPC::Open3 to open a set of stdin, stdout and stderr
> pipes to 'bash' or whatever shell you want, then just send
> commands to it to run. The forked processes would be reusable
> and under the control of the parent process (i.e. kill $pid).
> This is just a simple example. (I've combined stderr and stdout
> to avoid needing to use select. I could have forked more than one,
> and I could change the command sent to the bash interpreter
> on each refresh.)


It would be very hard to create a symmetric arrangement between the
processes. If you want a true process pool, that's what you need. Perhaps
you could have one 'manager process' that creates the child processes and
coordinates communication between them. I'd keep a job queue in the manager
process as well as information about what data has not been synced. Then
each process, when it finishes a job it's working on, syncs to the master,
then takes a new job. This guarantees that when a process takes a job, it at
least has everything that was in place before the job was queued.

The problem with this scheme would likely be a lot of process context
switches between the master process and one of the workers. It's not an easy
problem to solve.

DS


Copyright 2003 - 2008 webservertalk.com