Unix Programming - Partial writes with writev() on TCP sockets


Author Partial writes with writev() on TCP sockets
Olivier Langlois

2007-08-24, 1:24 pm

In the book Unix Network Programming, vol. 1, second edition, it is
written that writev() is an atomic operation. My understanding of an
atomic operation is that it either writes all the data or writes
nothing. But if that is the case, why does the function return the
number of bytes written?

By doing a small search on this newsgroup, I have seen that some
people were suggesting that writev() could perform a partial write, so
I am seeking help to clarify this point and perhaps find answers
to the following questions:

1- Is writev()'s behavior regarding partial writes OS dependent
(some OSes guarantee an atomic operation, some don't)?
2- If writev() does partial writes, under which conditions can this
happen? (TCP or UDP, blocking or non-blocking, the free space
available in the TCP transmission buffer, etc.)
3- If writev() does partial writes, then what does "atomic operation" mean?

Thank you very much!
Olivier Langlois
http://www.olivierlanglois.net

Olivier Langlois

2007-08-24, 1:24 pm

OK, I have found the answer from the ACE framework group. For
completeness, in case someone else searches this group for the same
question, here is what Douglas C. Schmidt replied to my question:

> By looking at the ACE::sendv_n_i() code, I have found out that
> writev() can perform partial writes. I am surprised of that behavior
> because in the book Unix network programming vol. 1, second edition,
> it is written that writev() is an atomic operation.


I don't think that's correct. Please see the official documentation
for writev() at

http://www.opengroup.org/onlinepubs...ons/writev.html

and look at this statement:

"Upon successful completion, writev() shall return the number of bytes
actually written. Otherwise, it shall return a value of -1, the file-
pointer shall remain unchanged, and errno shall be set to indicate an
error."

....

> what are the conditions
> with a non blocking stream socket that could result in a writev()
> partial write?


I suspect it's the usual suspects, e.g., a flow-controlled connection
that leads to "short writes". Please see

http://www.linuxjournal.com/comment/reply/2333

for a summary of why short writes can occur.

Thanks,

Doug

BTW, if you are looking for how to handle partial writes with writev,
look at the ACE::sendv_n_i() function definition in the ACE framework
source code.
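
For readers without the ACE sources at hand, here is a minimal sketch of the general idea in C. It is not the ACE::sendv_n_i() code; writev_all() is a hypothetical name of my own. It assumes a blocking descriptor and simply returns -1 on any error other than EINTR, so on a non-blocking socket the caller would still have to handle EAGAIN itself.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not the ACE implementation): keep calling writev()
 * until every byte described by iov[0..iovcnt-1] has been written.
 * The iov array is modified in place as data is consumed.
 * Returns the total number of bytes written, or -1 with errno set. */
ssize_t writev_all(int fd, struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before anything was written: retry */
            return -1;      /* includes EAGAIN on a non-blocking descriptor */
        }
        total += n;

        /* Skip every buffer that was written completely. */
        size_t left = (size_t)n;
        while (iovcnt > 0 && left >= iov->iov_len) {
            left -= iov->iov_len;
            iov++;
            iovcnt--;
        }

        /* Advance into the first buffer that was only partly written. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + left;
            iov->iov_len -= left;
        }
    }
    return total;
}
```

Because the helper edits the iovec array while it works, pass it a copy if the original array is needed afterwards.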

Greetings,
Olivier Langlois
http://www.olivierlanglois.net

Frank Cusack

2007-08-24, 1:24 pm

On Fri, 24 Aug 2007 13:46:15 -0000 Olivier Langlois <olanglois@sympatico.ca> wrote:
> In the book Unix network programming vol. 1, second edition, it is
> written that writev() is an atomic operation. My understanding of what
> an atomic operation is, is that it either writes all the data or it
> does not write anything but then if it is the case, why is the
> function returning the # of bytes written?


My guess would be that it means it will not write each iov as a separate
write, which might be interleaved with other writes. For the number
of bytes that are written, they are guaranteed to be written atomically.

-frank

Ivan Gotovchits

2007-08-27, 7:23 am

Olivier Langlois wrote:

> In the book Unix network programming vol. 1, second edition, it is
> written that writev() is an atomic operation. My understanding of what
> an atomic operation is, is that it either writes all the data or it
> does not write anything but then if it is the case, why is the
> function returning the # of bytes written?
>
> By doing a small search on this newsgroup, I have seen that some
> people were suggesting that writev() could perform a partial write so
> I am seeking help to clarify this point and perhaps find the answers
> of the following questions:
>
> 1- Is the writev() behavior regarding partial writes is OS dependant
> (Some OSes guarante atomic operation, some don't)?
> 2- If writev() does partial writes, then in which conditions can this
> happen? (TCP or UDP, blocking or non-blocking, it depends on the free
> space available in the TCP transmission buffer, etc...)
> 3- If writev() does partial writes, then what atomic operation means?
>
> Thank you very much!
> Olivier Langlois
> http://www.olivierlanglois.net

Quote from SUSv3:
"The writev() function shall always write a complete area before proceeding
to the next."
End of quote.
1. This means that writev() will not start writing the next buffer until it
has finished with the previous one.
2. Nothing says that it _must_ write every buffer in full.

Also, "atomic" is not a very good word to describe writing in a single
system call, because on systems with a preemptive kernel you can write with
one syscall, but that syscall can be interrupted by some other process
(which is completely transparent to the user application).

David Schwartz

2007-08-27, 7:22 pm

On Aug 24, 6:46 am, Olivier Langlois <olangl...@sympatico.ca> wrote:

> In the book Unix network programming vol. 1, second edition, it is
> written that writev() is an atomic operation. My understanding of what
> an atomic operation is, is that it either writes all the data or it
> does not write anything but then if it is the case, why is the
> function returning the # of bytes written?


Describing 'writev' as atomic is, at best, misleading.

> By doing a small search on this newsgroup, I have seen that some
> people were suggesting that writev() could perform a partial write so
> I am seeking help to clarify this point and perhaps find the answers
> of the following questions:


It is absolutely 100% obvious that it must be possible to get a
partial write. What else should happen if, for example, half the data
is queued and then a signal causes the 'writev' to stop?

> 1- Is the writev() behavior regarding partial writes is OS dependant
> (Some OSes guarante atomic operation, some don't)?


As far as I know, no OSes guarantee that 'writev' will either fully
complete or send no data. I don't see how this could be possible in
the case of TCP.

> 2- If writev() does partial writes, then in which conditions can this
> happen? (TCP or UDP, blocking or non-blocking, it depends on the free
> space available in the TCP transmission buffer, etc...)


For UDP, there can't be a partial write. You either send a datagram or
you don't. For TCP, there can be a partial write for several reasons:

1) The operation is interrupted by a signal with the kernel's send
buffer full.

2) An error prevents further sending after some data is sent and the
OS wants to tell you how many bytes were sent.

3) The socket is non-blocking and some of the data could be put on the
send queue but not all of it.
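
Case 3 and the datagram case are easy to observe. The following is only a rough sketch under stated assumptions: it uses AF_UNIX socketpairs as stand-ins for TCP and UDP sockets (so no network setup is needed), and the function names short_write_demo() and dgram_gather_demo() are made up for this illustration. How many bytes the stream case accepts depends on the system's socket buffer sizes.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Try to push `total` bytes (split over two iovecs) into a non-blocking
 * stream socket whose peer never reads. Returns whatever writev() reported,
 * or -1 if the setup failed. Assumption: an AF_UNIX stream socketpair
 * stands in for a TCP connection whose send queue fills up. */
ssize_t short_write_demo(size_t total)
{
    int sv[2];
    ssize_t n = -1;
    char *buf = NULL;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags >= 0 && fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == 0)
        buf = calloc(1, total);

    if (buf != NULL) {
        struct iovec iov[2] = {
            { .iov_base = buf,             .iov_len = total / 2 },
            { .iov_base = buf + total / 2, .iov_len = total - total / 2 },
        };
        n = writev(sv[0], iov, 2);   /* may accept only part of the data */
        free(buf);
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}

/* Send two iovecs with one writev() on a datagram socket and return the
 * size of the single datagram the peer receives, or -1 on failure.
 * Assumption: an AF_UNIX datagram socketpair stands in for UDP. */
ssize_t dgram_gather_demo(void)
{
    int sv[2];
    char part1[] = "hello, ";
    char part2[] = "world";
    char in[64];
    ssize_t got = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    struct iovec iov[2] = {
        { .iov_base = part1, .iov_len = sizeof part1 - 1 },
        { .iov_base = part2, .iov_len = sizeof part2 - 1 },
    };
    /* Both buffers are gathered into one datagram. */
    if (writev(sv[0], iov, 2) == (ssize_t)(sizeof part1 + sizeof part2 - 2))
        got = recv(sv[1], in, sizeof in, 0);

    close(sv[0]);
    close(sv[1]);
    return got;
}
```

Calling short_write_demo() with a size well above the socket buffer (several megabytes, say) should return a positive count smaller than the request, while dgram_gather_demo() receives both pieces as one datagram.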

> 3- If writev() does partial writes, then what atomic operation means?


I think it means that the data from a call to 'writev' will not be
interleaved with data from another call to 'write' or 'writev' when
the target file descriptor is a regular file.

DS


Copyright 2003 - 2008 webservertalk.com