Unix Programming - A question about read() from "Unix Network Programming", Vol 1


Author A question about read() from "Unix Network Programming", Vol 1
K-mart Cashier

2007-12-30, 1:27 pm

The following code snippet is taken from figure 3.15 on page 89 of
the book "Unix Network Programming: The Sockets Networking
API", Vol 1, third edition, by Stevens, Fenner, and Rudoff

 1  #include "unp.h"

 2  ssize_t                         /* Read "n" bytes from a descriptor. */
 3  readn(int fd, void *vptr, size_t n)
 4  {
 5      size_t  nleft;
 6      ssize_t nread;
 7      char    *ptr;

 8      ptr = vptr;
 9      nleft = n;
10      while (nleft > 0) {
11          if ( (nread = read(fd, ptr, nleft)) < 0) {
12              if (errno == EINTR)
13                  nread = 0;      /* and call read() again */
14              else
15                  return (-1);
16          } else if (nread == 0)
17              break;              /* EOF */

18          nleft -= nread;
19          ptr += nread;
20      }
21      return (n - nleft);         /* return >= 0 */
22  }


On line 13, why is nread set to zero? I.e., why not just use a goto
statement?


Chad
William Pursell

2007-12-30, 7:23 pm

On Dec 30, 7:16 pm, K-mart Cashier <cdal...@gmail.com> wrote:
> The following code snippet is taken from figure 3.15 on page 89 from
> the book "Advanced Unix Network Programming: The Sockets Networking
> API", Vol 1, third edition, by Stevens, Fenner, and Rudoff
>
> 1 #include "unp.h"
>
> 2 ssize_t /* Read "n" bytes from a
> descriptor. */
> 3 readn(int fd, void *vptr, size_t n)
> 4 {
> 5 size_t nleft;
> 6 ssize_t nread;
> 7 char *ptr;
>
> 8 ptr = vptr;
> 9 nleft = n;
> 10 while (nleft > 0) {
> 11 if ( (nread = read(fd, ptr, nleft)) < 0) {
> 12 if (errno == EINTR)
> 13 nread = 0; /* and call read() again */
> 14 else
> 15 return (-1);
> 16 } else if (nread == 0)
> 17 break; /* EOF */
>
> 18 nleft -= nread;
> 19 ptr += nread;
> 20 }
> 21 return (n - nleft); /* return >= 0 */
> 22 }
>
> On line 13, why is nread set to zero? Ie, why not just use a goto
> statement?


In this case, a goto would work but is not necessary.
It is probably avoided for stylistic reasons only.
K-mart Cashier

2007-12-30, 7:23 pm

On Dec 30, 12:40 pm, William Pursell <bill.purs...@gmail.com> wrote:
> On Dec 30, 7:16 pm, K-mart Cashier <cdal...@gmail.com> wrote:
>
>
>
>
>
>
>
>
>
> In this case, a goto would work but is not necessary.
> It is probably avoided for stylistic reasons only.



Okay, I was just curious because every once in a while when I ask
about a line of code, I'll get some kind of crazy response that is
written in academic jargon. Then when I ask for clarification, the
person will go off and cite some obscure part of the Unix Man pages.
rumplstiltzkin@gmail.com

2007-12-30, 7:24 pm

Because code with goto's gets pretty nasty and hard to read quickly.

K-mart Cashier wrote:
> The following code snippet is taken from figure 3.15 on page 89 from
> the book "Advanced Unix Network Programming: The Sockets Networking
> API", Vol 1, third edition, by Stevens, Fenner, and Rudoff
>
> 1 #include "unp.h"
>
> 2 ssize_t /* Read "n" bytes from a
> descriptor. */
> 3 readn(int fd, void *vptr, size_t n)
> 4 {
> 5 size_t nleft;
> 6 ssize_t nread;
> 7 char *ptr;
>
> 8 ptr = vptr;
> 9 nleft = n;
> 10 while (nleft > 0) {
> 11 if ( (nread = read(fd, ptr, nleft)) < 0) {
> 12 if (errno == EINTR)
> 13 nread = 0; /* and call read() again */
> 14 else
> 15 return (-1);
> 16 } else if (nread == 0)
> 17 break; /* EOF */
>
> 18 nleft -= nread;
> 19 ptr += nread;
> 20 }
> 21 return (n - nleft); /* return >= 0 */
> 22 }
>
>
> On line 13, why is nread set to zero? Ie, why not just use a goto
> statement?
>
>
> Chad

Casper H.S. Dik

2007-12-31, 7:34 am

K-mart Cashier <cdalten@gmail.com> writes:

>The following code snippet is taken from figure 3.15 on page 89 from
>the book "Advanced Unix Network Programming: The Sockets Networking
>API", Vol 1, third edition, by Stevens, Fenner, and Rudoff


>1 #include "unp.h"


> 2 ssize_t /* Read "n" bytes from a
>descriptor. */
> 3 readn(int fd, void *vptr, size_t n)
> 4 {
> 5 size_t nleft;
> 6 ssize_t nread;
> 7 char *ptr;


> 8 ptr = vptr;
> 9 nleft = n;
>10 while (nleft > 0) {
>11 if ( (nread = read(fd, ptr, nleft)) < 0) {
>12 if (errno == EINTR)
>13 nread = 0; /* and call read() again */
>14 else
>15 return (-1);
>16 } else if (nread == 0)
>17 break; /* EOF */


>18 nleft -= nread;
>19 ptr += nread;
>20 }
>21 return (n - nleft); /* return >= 0 */
>22 }



>On line 13, why is nread set to zero? Ie, why not just use a goto
>statement?


While some people bend over backwards to avoid gotos, the other option
here would be to use "continue" to re-enter the loop at the top.

(The one bug in the code here is that if initial reads succeed but a later
one fails, -1 is returned and not the number of bytes read.)

I think it should be something more like:

    if (errno == EINTR)
        continue;
    else if (nleft == n)
        return (-1);
    else
        return (n - nleft);

Casper
Rainer Weikusat

2007-12-31, 7:34 am

K-mart Cashier <cdalten@gmail.com> writes:
> The following code snippet is taken from figure 3.15 on page 89 from
> the book "Advanced Unix Network Programming: The Sockets Networking
> API", Vol 1, third edition, by Stevens, Fenner, and Rudoff
>
> 1 #include "unp.h"
>
> 2 ssize_t /* Read "n" bytes from a
> descriptor. */
> 3 readn(int fd, void *vptr, size_t n)
> 4 {
> 5 size_t nleft;
> 6 ssize_t nread;
> 7 char *ptr;
>
> 8 ptr = vptr;
> 9 nleft = n;
> 10 while (nleft > 0) {
> 11 if ( (nread = read(fd, ptr, nleft)) < 0) {
> 12 if (errno == EINTR)
> 13 nread = 0; /* and call read() again */
> 14 else
> 15 return (-1);
> 16 } else if (nread == 0)
> 17 break; /* EOF */
>
> 18 nleft -= nread;
> 19 ptr += nread;
> 20 }
> 21 return (n - nleft); /* return >= 0 */
> 22 }
>
>
> On line 13, why is nread set to zero? Ie, why not just use a goto
> statement?


Everything can be expressed in a multiplicity of different ways.
For this particular case,

    ptr = vptr;
    nleft = n;
    while (nleft > 0) {
        do
            nread = read(fd, ptr, nleft);
        while (nread == -1 && errno == EINTR);
        if (nread <= 0) break;

        nleft -= nread;
        ptr += nread;
    }

/* handle EOF or error */

would be the one I would be using nowadays, because it avoids playing
ugly games with state variables and 'using goto' at the same time.
A conditional jump backwards within the same block can always be
replaced by a 'proper' looping construct. Actually, I would prefer
a slight variation:

    nleft = n;
    ptr = vptr;
    goto read;
    do {
        ptr += nread;

    read:
        do
            nread = read(fd, ptr, nleft);
        while (nread == -1 && errno == EINTR);
    } while (nread > 0 && (nleft -= nread));

And make it the responsibility of the caller to only call
the function if it actually has something to do. Which is a nice
example of a use of goto which can not be expressed
in C in a straightforward way.
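
The first loop above leaves the "handle EOF or error" step open. A minimal sketch of how it might be wrapped into a complete function follows. The name readn_dowhile and the choice of return values (-1 on error, byte count otherwise) are assumptions made for illustration, not something stated in the post.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper around the do/while retry loop. */
ssize_t readn_dowhile(int fd, void *vptr, size_t n)
{
    char *ptr = vptr;
    size_t nleft = n;
    ssize_t nread = 0;          /* stays 0 if n == 0 and the loop never runs */

    while (nleft > 0) {
        do
            nread = read(fd, ptr, nleft);
        while (nread == -1 && errno == EINTR);  /* retry only on EINTR */

        if (nread <= 0)
            break;              /* EOF (0) or hard error (-1) */

        nleft -= (size_t)nread;
        ptr += nread;
    }

    /* Assumed handling of "EOF or error": -1 on error, count otherwise. */
    if (nread < 0)
        return -1;
    return (ssize_t)(n - nleft);
}
```

The inner do/while keeps the EINTR retry separate from the bookkeeping, so nread never has to be reset to zero just to neutralise the two updates that follow.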
Giorgos Keramidas

2007-12-31, 7:34 am

On Sun, 30 Dec 2007 11:16:23 -0800 (PST), K-mart Cashier <cdalten@gmail.com> wrote:
> The following code snippet is taken from figure 3.15 on page 89 from
> the book "Advanced Unix Network Programming: The Sockets Networking
> API", Vol 1, third edition, by Stevens, Fenner, and Rudoff
>
> 1 #include "unp.h"
>
> 2 ssize_t /* Read "n" bytes from a
> descriptor. */
> 3 readn(int fd, void *vptr, size_t n)
> 4 {
> 5 size_t nleft;
> 6 ssize_t nread;
> 7 char *ptr;
>
> 8 ptr = vptr;
> 9 nleft = n;
> 10 while (nleft > 0) {
> 11 if ( (nread = read(fd, ptr, nleft)) < 0) {
> 12 if (errno == EINTR)
> 13 nread = 0; /* and call read() again */
> 14 else
> 15 return (-1);
> 16 } else if (nread == 0)
> 17 break; /* EOF */
>
> 18 nleft -= nread;
> 19 ptr += nread;
> 20 }
> 21 return (n - nleft); /* return >= 0 */
> 22 }
>
>
> On line 13, why is nread set to zero? Ie, why not just use a goto
> statement?


So that lines 18 and 19 will have no effect on `nleft' and `ptr', and
the rest of the loop will fall back to the toplevel read() call.

I usually prefer writing a restart of read() in such cases slightly
differently:

    while (nleft > 0) {
        if ((nread = read(fd, ptr, nleft)) == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        /* whatever */
    }

William Pursell

2007-12-31, 1:23 pm

On Dec 31, 8:50 am, Casper H.S. Dik <Casper....@Sun.COM> wrote:
> K-mart Cashier <cdal...@gmail.com> writes:
<snip>
> (The one bug in the code here is that if initial reads succeed but a later
> one fails, -1 is returned and not the number of bytes read)


The code works okay in the case of a successful initial
read followed by a subsequent failure. I think
Casper's mis-reading shows that it is slightly convoluted
and could be cleaned up.

Rainer Weikusat

2007-12-31, 1:23 pm

William Pursell <bill.pursell@gmail.com> writes:
> On Dec 31, 8:50 am, Casper H.S. Dik <Casper....@Sun.COM> wrote:
> <snip>
>
> The code works okay in the case of a successful initial
> read followed by a subsequent failure. I think
> Casper's mis-reading shows that it is slightly convoluted
> and could be cleaned up.


If read returns -1, the condition on line 11 will be true, hence
line 12 is next; if errno is now != EINTR, the next one will be
line 15, causing -1 to be returned.

William Pursell

2007-12-31, 1:23 pm

On Dec 31, 6:13 pm, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
> William Pursell <bill.purs...@gmail.com> writes:
>
>
> if read returns -1, the condition on line 11 will be true, hence
> line 12 is next, if errno now != EINTR, the next one will be
> line 15, causing -1 to be returned.


Which is correct behavior.
Rainer Weikusat

2008-01-01, 7:36 am

William Pursell <bill.pursell@gmail.com> writes:
> On Dec 31, 6:13 pm, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
>
> Which is correct behavior.


It is a behaviour. Its correctness depends either on the context the
subroutine is used in (ie what does the caller expect) and/or how the
nominal behaviour is defined (in the given context of the book, the
answer is: It is correct). But the statement:

| if initial reads succeed but a later one fails, -1 is returned and
| not the number of bytes read)

is certainly correct, ie it wasn't the result of a 'misreading' of
something.
Casper H.S. Dik

2008-01-01, 7:36 am

William Pursell <bill.pursell@gmail.com> writes:

>On Dec 31, 8:50 am, Casper H.S. Dik <Casper....@Sun.COM> wrote:
><snip>
>The code works okay in the case of a successful initial
>read followed by a subsequent failure. I think
>Casper's mis-reading shows that it is slightly convoluted
>and could be cleaned up.


No, I don't think it does. If a subsequent read fails then it returns
-1 even though data has been read into the buffer.

The system call read(2) is supposed to return the number of bytes in that
case and I would assume that is the behaviour this call wants to mimic.

(We're talking about read returning "nread < n" and then returning -1
with errno != EINTR); in that case it bails out using "return (-1)".

Casper
Mark Holland

2008-01-01, 7:36 am


"Casper H.S. Dik" <Casper.Dik@Sun.COM> wrote in message
news:477a34a2$0$85779$e4fe514c@news.xs4all.nl...
> William Pursell <bill.pursell@gmail.com> writes:
>
>
>
> No, I don't think it does. If a subsequent read fails then it
> returns
> -1 even though data has been read into the buffer.
>
> The system call read(2) is supposed to return the number of bytes in
> that
> case and I would assume that is the behaviour this call wants to
> mimic.


I would quote from the book but unfortunately I don't have it with me
right now. However, the intent of the function is given in the name -
it reads N bytes from a descriptor into the given buffer. If we fail
to read all N bytes, then this is an error as indicated by the -1
return value.

Personally I don't think I would use this function in production code,
however I believe it is simply a helper function that simplifies a lot
of the code examples in the book.

Mark


Rainer Weikusat

2008-01-01, 1:28 pm

"Mark Holland" <kenshin_40@htomail.com> writes:
> "Casper H.S. Dik" <Casper.Dik@Sun.COM> wrote in message
> news:477a34a2$0$85779$e4fe514c@news.xs4all.nl...

[...]
>
> I would quote from the book but unfortunately I don't have it with me
> right now. However, the intent of the function is given in the name -
> it reads N bytes from a descriptor into the given buffer. If we fail
> to read all N bytes, then this is an error as indicated by the -1
> return value.


This description is not consistent with the code, which will return
'less than n' in case of an EOF without signalling an error. But the
only 'error handling' in APUE (IIRC) consists of printing a diagnostic
and exiting and consequently, returning -1 instead of a partial
bytecount is completely 'correct' for the environment the subroutine
is supposed to be used in. The EINTR handling is a fairly useless
'modern addition': As long as an application does not define at least
one signal handler which does not terminate the process, it will never
see an EINTR, although there is the commonly held superstition that it
can appear 'out of the blue'[*]. The most sensible way to handle it
in a 'library' routine would be to return either a partial count or
-1, giving the caller a chance to recognize that the read was
interrupted and to restart it only if so desired.
William Pursell

2008-01-04, 1:38 am

On Jan 1, 12:40 pm, Casper H.S. Dik <Casper....@Sun.COM> wrote:
> William Pursell <bill.purs...@gmail.com> writes:
>
> No, I don't think it does. If a subsequent read fails then it returns
> -1 even though data has been read into the buffer.
>
> The system call read(2) is supposed to return the number of bytes in that
> case and I would assume that is the behaviour this call wants to mimic.
>
> (We're talking about read returning "nread < n" and then returning -1
> with errno != EINTR); in that case it bails out using "return (-1)".
>


If the function is intended to always "return the number of
bytes read", then it should return 0 if an error
occurs on the first read. However, that is not the
intended behavior, nor is it the behavior of the read(2)
system call. From the read man page:

On success, the number of bytes read is returned (zero indicates end
of file)...
On error, -1 is returned

Stevens' function returns the number of bytes read if there
is no error. It returns -1 if there is an error, regardless
of the number of bytes that were read. That is correct
behavior.
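
Under that reading, a caller sees three distinct outcomes. A small sketch of how a caller might tell them apart is shown here; the enum and the function readn_classify are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical labels for the three results a readn() caller can see. */
enum readn_outcome {
    READN_ERROR,        /* -1: an error occurred, buffer contents unspecified */
    READN_EOF_SHORT,    /* 0 <= ret < n: EOF arrived before n bytes */
    READN_FULL          /* ret == n: the request was satisfied */
};

/* Classify the value returned by readn(fd, buf, n). */
enum readn_outcome readn_classify(ssize_t ret, size_t n)
{
    if (ret < 0)
        return READN_ERROR;
    if ((size_t)ret < n)
        return READN_EOF_SHORT;
    return READN_FULL;
}
```

Whether READN_ERROR should discard bytes that were already read is the question the thread is arguing about; the sketch only describes the behaviour of the function as printed.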
Rainer Weikusat

2008-01-04, 7:38 am

William Pursell <bill.pursell@gmail.com> writes:

[...]

> Stevens' function returns the number of bytes read if there
> is no error. It returns -1 if there is an error, regardless
> of the number of bytes that were read. That is correct
> behavior.


'Correct behaviour' is defined as (IIRC), 'if the preconditions were
satisfied before the call and the invariants hold during execution,
the postconditions will be satisfied after it'. There is no way to
determine 'correctness' or 'incorrectness' of a particular
implementation of something, looking at this something alone.