recv() sends extra garbage bytes
Web Server forum
Back To The Forum Home!Search!Private Messaging System

Web Server Talk Web Server Talk > Unix and Linux reviews > Free Unix support > Unix Programming > recv() sends extra garbage bytes




Pages (2): [1] 2 »   Last Thread   Next Thread Next
  Show Printable Version Email this Page Subscribe to this Thread      Post New Thread    Post A Reply      

    recv() sends extra garbage bytes  
chsalvia@gmail.com


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 12:16 PM

I have a C program running on Linux/UNIX that sends data over a
network using the SOCK_STREAM protocol.  One computer sends and
another receives.   The incoming data is prefixed with a 4 byte
integer containing the size of the message.

Problem: I've noticed that very rarely, the receiving computer will
receive additional garbage bytes in the receive buffer.  I'm not
exactly sure why this happens.  To fix this, I used a decreasing
receive buffer size on each call, to match the remaining bytes on the
message, rather than the usual fixed "buffersize" value I normally
pass to receive.   But I've never seen this technique done anywhere
else.  Usually, people just specify some fixed buffer size value for
the recv() function, and the recv function never sends any more data
than it receives from the socket anyway.  But if I do this, I receive
garbage data sometimes.

Here is what I'm doing:

int recv(int new_socket, void* buffer)
{
uint32_t recv_size;

/* Read 4 bytes to obtain size of recv'd data */
switch (recv(new_socket, buffer, sizeof(uint32_t), 0)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: memcpy(&recv_size, buffer, sizeof(uint32_t));
}

uint32_t rem_bytes = recv_size - sizeof(uint32_t);
uint32_t recvd_bytes;

/* Receive rest of the data */
do {
switch (recvd_bytes = .recv(new_socket, buffer, rem_bytes,
0)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
rem_bytes -= recvd_bytes;
} while (rem_bytes != 0);
}

The literature on recv indicates that you should pass a constant value
to the recv() function to indicate the maximum number of bytes to
receive, like:

recv(sock, buffer, MAX_BYTES, 0);

...rather than the decreasing rem_bytes value I used above, since recv
is supposed to only receive exactly what was sent and nothing more,
regardless of what value you pass as MAX_BYTES.  Plus, I assume that
even with the above method, the undesirable garbage bytes will stay in
the stream and simply be returned the next time I call recv().

So why am I receiving extra garbage data with recv()?  Is this just
something that happens from time to time over networks?






[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
Frank Cusack


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 12:16 PM

On 26 Mar 2007 23:56:41 -0700 chsalvia@gmail.com wrote:
> int recv(int new_socket, void* buffer)
> {
>       uint32_t recv_size;
>
>       /* Read 4 bytes to obtain size of recv'd data */
>       switch (recv(new_socket, buffer, sizeof(uint32_t), 0)) {

1. I didn't think you could name your function recv and then also call
the system recv.  But looking at code below, I guess this is C++.
2. I would have to see the rest of the code, but buffer may not be
aligned properly.

>             case -1: return SOCKET_ERR;
>             case 0: return CLOSE_SOCKET;
>             default: memcpy(&recv_size, buffer, sizeof(uint32_t));

You are not guaranteed to have received 4 bytes of data (since new_sock
is a SOCK_STREAM socket).

>       }
>
>       uint32_t rem_bytes = recv_size - sizeof(uint32_t);

If you did not receive 4 bytes, you just hosed yourself.

>       uint32_t recvd_bytes;
>
>       /* Receive rest of the data */
>       do {
>             switch (recvd_bytes = .recv(new_socket, buffer, rem_bytes,

Looks like a typo here (.recv).  It's always best to copy&paste the
exact code you are using (or an exact test case) for best answers.

I'd be worried that rem_bytes may be larger than buffer can hold, first
because you don't pass the size of buffer to compare it against, and
second because of the bug above.

> 0)) {
>                   case -1: return SOCKET_ERR;
>                   case 0: return CLOSE_SOCKET;
>                   default: break;
>             }
>             rem_bytes -= recvd_bytes;
>       } while (rem_bytes != 0);
> }
>
> The literature on recv indicates that you should pass a constant value
> to the recv() function to indicate the maximum number of bytes to
> receive, like:
>
> recv(sock, buffer, MAX_BYTES, 0);
>
> ...rather than the decreasing rem_bytes value I used above, since recv
> is supposed to only receive exactly what was sent and nothing more,
> regardless of what value you pass as MAX_BYTES.

Not for SOCK_STREAM.  For SOCK_STREAM sockets, what is sent is a byte
stream without record boundaries so depending on pacing, yes you will
never receive more than what was sent, but the definition of "what was
sent" is somewhat loose.

>  Plus, I assume that
> even with the above method, the undesirable garbage bytes will stay in
> the stream and simply be returned the next time I call recv().
>
> So why am I receiving extra garbage data with recv()?  Is this just
> something that happens from time to time over networks?

You have at least 2 bugs.  You are not checking that you received 4
bytes initially, and you are not checking that the length of the
message will fit in your buffer.  Otherwise there might be a bug on
the sender's side.

-frank





[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
chsalvia@gmail.com


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 12:16 PM

> You are not guaranteed to have received 4 bytes of data (since new_sock
> is a SOCK_STREAM socket).
> 
> 
>
> If you did not receive 4 bytes, you just hosed yourself.

Good point.  I overlooked the possibility that the initial recv might
get less than 4 bytes.  I changed it so that the initial receive loops
until it gets 4 bytes.

> I'd be worried that rem_bytes may be larger than buffer can hold, first
> because you don't pass the size of buffer to compare it against, and
> second because of the bug above.

As you noticed above, I didn't copy and paste the exact code.  The
reason I didn't was because my actual code uses a customized C++
buffer class that handles memory allocation.  I tried to simplify my
code for the purpose of posting it here - perhaps that was a bad
idea.  The loop is actually like this:

do {
switch (buffer.recv(new_socket, recv_size)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
} while (buffer.used_bytes < recv_size);

With buffer.recv() defined as:

inline int recv(int m_sock, uint32_t maxrecv, int flags = 0) {
if (used_bytes + maxrecv > size) realloc_buffer(used_bytes +
maxrecv);
int status = ::recv(m_sock, (byte*) data + used_bytes, maxrecv,
flags);
if (status > 0) used_bytes += status;
return status;
}

And of course, the member variable used_bytes keeps track of the
buffer length, and size keeps track of allocated space.

> You have at least 2 bugs.  You are not checking that you received 4
> bytes initially, and you are not checking that the length of the
> message will fit in your buffer.  Otherwise there might be a bug on
> the sender's side.

Thanks for pointing out that out.  Before you said that "what is
actually sent" is loosely defined.  Is there any possible way that a
server could recv() *more* data than a client sends?  In my case, I
notice that every now and then, I'll receive an additional 1000 bytes
or so of garbage data on the tail end of the message.  (Typical
message length is approx 300K.)

This problem doesn't seem like it could be related to the first bug
you pointed out, because even if I received less than the full 4 bytes
and then erroneously assigned the incomplete value to recv_size, this
would either result in me receiving *less* bytes than I expected,
(because a 4 byte integer containing a value like 300000 for example
would be if only 1 byte or 2 bytes were received), OR, if recv_size
contained some initial garbage value in it, it would be set to some
arbitrary high value, in which case I would recv() the full amount
anyway.  This doesn't explain why I'd receive an additional ~1000
bytes of garbage data.

Is there any reason why recv() might get more data than was actually
sent by send()?










[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
Frank Cusack


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 12:16 PM

On 27 Mar 2007 00:47:48 -0700 chsalvia@gmail.com wrote:
> Thanks for pointing out that out.  Before you said that "what is
> actually sent" is loosely defined.  Is there any possible way that a
> server could recv() *more* data than a client sends?

Not for SOCK_STREAM.

By "loosely defined" I just mean that while you may do a recv for 1
"message" the other side may have sent multiple messages.  If your
protocol operates in lock step this won't happen.

>  In my case, I notice that every now and then, I'll receive an
> additional 1000 bytes or so of garbage data on the tail end of the
> message.  (Typical message length is approx 300K.)
>
> This problem doesn't seem like it could be related to the first bug
> you pointed out, because even if I received less than the full 4 bytes
> and then erroneously assigned the incomplete value to recv_size, this
> would either result in me receiving *less* bytes than I expected,
> (because a 4 byte integer containing a value like 300000 for example
> would be if only 1 byte or 2 bytes were received),

if you are little-endian, and if buffer were zero'd out to begin with

> OR, if recv_size
> contained some initial garbage value in it, it would be set to some
> arbitrary high value, in which case I would recv() the full amount
> anyway.  This doesn't explain why I'd receive an additional ~1000
> bytes of garbage data.

Sounds correct.

> Is there any reason why recv() might get more data than was actually
> sent by send()?

nope.

Maybe others will have some ideas.

-frank





[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
David Schwartz


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 06:19 PM

On Mar 27, 12:47 am, chsal...@gmail.com wrote:

>                 do {
>                         switch (buffer.recv(new_socket, recv_size)) {
>                                 case -1: return SOCKET_ERR;
>                                 case 0: return CLOSE_SOCKET;
>                                 default: break;
>                         }
>                 } while (buffer.used_bytes < recv_size);
>
> With buffer.recv() defined as:
>
>         inline int recv(int m_sock, uint32_t maxrecv, int flags = 0) {
;
>                 if (used_bytes + maxrecv > size) realloc_buffer(used_bytes
 +
> maxrecv);
>                 int status = ::recv(m_sock, (byte*) data + used_bytes, max
recv,
> flags);
>                 if (status > 0) used_bytes += status;
>                 return status;
>         }

This is broken. Suppose you're trying to receive 10 bytes but you
receive 8. You now need to receive 2 more, but your code tries to
receive 10 again.

If I understand what you are trying to do, the correct fix is to
change:

switch (buffer.recv(new_socket, recv_size)) {

to

switch (buffer.recv(new_socket, buffer.used_bytes-recv_size)) {

However, IMO it is poor design to use the buffer level for this
purpose. If the protocol says I should be receiving 10 bytes, I want
to be dealing with the number 10, not 10 plus however many bytes were
already in the buffer.

So I think you should stop ignoring the return value from
'buffer.recv' and use it for its intended purpose -- to tell you how
many bytes you received so you know how many you still need to
receive.

DS






[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
Bin Chen


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 06:19 PM

On Mar 27, 2:56 pm, chsal...@gmail.com wrote:
> I have a C program running on Linux/UNIX that sends data over a
> network using the SOCK_STREAM protocol.  One computer sends and
> another receives.   The incoming data is prefixed with a 4 byte
> integer containing the size of the message.
>
> Problem: I've noticed that very rarely, the receiving computer will
> receive additional garbage bytes in the receive buffer.  I'm not
> exactly sure why this happens.  To fix this, I used a decreasing
> receive buffer size on each call, to match the remaining bytes on the
> message, rather than the usual fixed "buffersize" value I normally
> pass to receive.   But I've never seen this technique done anywhere
> else.  Usually, people just specify some fixed buffer size value for
> the recv() function, and the recv function never sends any more data
> than it receives from the socket anyway.  But if I do this, I receive
> garbage data sometimes.

Are you behind NAT? In my environment, the server will send 4 blank
chars to the client to keep the NAT window open.






[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
chsalvia@gmail.com


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-27-07 06:19 PM

> However, IMO it is poor design to use the buffer level for this
> purpose. If the protocol says I should be receiving 10 bytes, I want
> to be dealing with the number 10, not 10 plus however many bytes were
> already in the buffer.
>
> So I think you should stop ignoring the return value from
> 'buffer.recv' and use it for its intended purpose -- to tell you how
> many bytes you received so you know how many you still need to
> receive.
>
> DS

David, are you suggesting something more like this:

/* Read 4 bytes to obtain size of recv'd data */
do {
switch (buffer.recv(new_socket, sizeof(uint32_t))) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
} while (buffer.used_bytes < sizeof(uint32_t));

memcpy(&recv_size, buffer.data, sizeof(uint32_t));

uint32_t remaining = recv_size - sizeof(uint32_t);
uint32_t next;

do {
switch (next = buffer.recv(new_socket, remaining)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
remaining -= next;
} while (buffer.used_bytes < recv_size);

This is how I originally coded it.  This way, I only recv() the amount
of bytes that I expect should be remaining.  This seems to be what
you're saying to do.  But this brings me back to my original
question.  All the literature on recv() never indicates you should do
this.  And others here have indicated that recv() will never actually
recv() more than was sent.  So even if I do something like:

buffer.recv(new_sock, 10000000);

...I'll still only recv the amount that was sent.  So I shouldn't need
to keep subtracting the amount received in each iteration from the
value I pass to recv().






[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
David Schwartz


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-28-07 06:20 AM

On Mar 27, 10:41 am, chsal...@gmail.com wrote:

> David, are you suggesting something more like this:
>
>         /* Read 4 bytes to obtain size of recv'd data */
>         do {
>                 switch (buffer.recv(new_socket, sizeof(uint32_t))) {
>                         case -1: return SOCKET_ERR;
>                         case 0: return CLOSE_SOCKET;
>                         default: break;
>                 }
>         } while (buffer.used_bytes < sizeof(uint32_t));
>
>         memcpy(&recv_size, buffer.data, sizeof(uint32_t));
>
>         uint32_t remaining = recv_size - sizeof(uint32_t);
>         uint32_t next;
>
>         do {
>                 switch (next = buffer.recv(new_socket, remaining)) {
>                         case -1: return SOCKET_ERR;
>                         case 0: return CLOSE_SOCKET;
>                         default: break;
>                 }
>                 remaining -= next;
>         } while (buffer.used_bytes < recv_size);

> This is how I originally coded it.  This way, I only recv() the amount
> of bytes that I expect should be remaining.  This seems to be what
> you're saying to do.  But this brings me back to my original
> question.  All the literature on recv() never indicates you should do
> this.  And others here have indicated that recv() will never actually
> recv() more than was sent.  So even if I do something like:
>
> buffer.recv(new_sock, 10000000);
>
> ...I'll still only recv the amount that was sent.  So I shouldn't need
> to keep subtracting the amount received in each iteration from the
> value I pass to recv().

This is not a particularly good solution either. There are two
problems, one in complexity and one in performance. In complexity, it
requires two loops, one to get all the length bytes and one to get all
the data. In performance, it will require at least two call to recv
even if all the data is there.

The basic outline of a textbook solution is this:

1) Call 'recv' passing it however much buffer space you have left.

2) If you do not have enough data to know how many bytes you need, go
to step 1.

3) Compute how many bytes you need. If you don't have that many, go to
step 1.

4) Process the bytes you have received.

5) If you do not have any leftover bytes, empty the buffer and go to
step 1.

6) If you have leftover bytes, arrange things so the buffer only has
the leftover and go to step 2.

In pseudo code, assuming the packet length encoded in the data
includes the four bytes for the length.

int packet_len(unsigned char *buf_ptr)
{ /* unpack length of packet */
int ret=buf_ptr[0];
ret<=8; ret|=buf_ptr[1];
ret<=8; ret|=buf_ptr[2];
ret<=8; ret|=buf_ptr[3];
return ret;
}

int i, buf_ptr=0, packet_len;
unsigned char buf[BUF_SIZE];

/* This takes the packet without the length header */
extern void ProcessPacket(unsigned char *data, int length);

while(1)
{
i=recv(socket, buf+buf_ptr, BUF_SIZE-buf_ptr);
if(i<=0) return; /* error/eof */
buf_ptr+=i;
while( (buf_ptr>=4) &&
(buf_ptr>=(packet_len=get_len(buf))) )
{
ProcessPacket(buf+4, packet_len-4);
if(packet_len==buf_ptr) buf_ptr=0;
else
{ /* save leftover */
memmove(buf, buf+buf_ptr, buf_ptr-packet_len);
buf_ptr-=packet_len;
}
}
}

Hope that helps.

DS






[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
James Antill


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-28-07 06:18 PM

On Tue, 27 Mar 2007 10:41:13 -0700, chsalvia wrote:
 
>
> David, are you suggesting something more like this:
>
> 	/* Read 4 bytes to obtain size of recv'd data */ do {
> 		switch (buffer.recv(new_socket, sizeof(uint32_t))) {
> 			case -1: return SOCKET_ERR;
> 			case 0: return CLOSE_SOCKET;
> 			default: break;
> 		}
> 	} while (buffer.used_bytes < sizeof(uint32_t));


This is wrong. The common "simple" solution to this problem is something
like:

void full_read(int fd, void *buf, size_t len)
{
while (len > 0)
{ /* loop until all of the read request is done */
ssize_t ret = read(fd, buf, len);

if ((ret == -1) && (errno != EINTR))
err(EXIT_FAILURE, "write");
if (ret == -1) continue;

buf += (size_t)ret;
len -= (size_t)ret;
}
}


full_read(new_socket, &recv_size, sizeof(uint32_t));
recv_size = ntohl(recv_size);

if (recv_size < recv_size) /* err */ ;
recv_size -= sizeof(uint32_t);

if (recv_size > buf_size) /* err */ ;
if (!recv_size) /* nothing */ ;

full_read(new_socket, buf, recv_size);

/* deal with data in buf */

> This is how I originally coded it.  This way, I only recv() the amount
> of bytes that I expect should be remaining.  This seems to be what
> you're saying to do.  But this brings me back to my original question.
> All the literature on recv() never indicates you should do this.  And
> others here have indicated that recv() will never actually recv() more
> than was sent.  So even if I do something like:
>
> buffer.recv(new_sock, 10000000);
>
> ...I'll still only recv the amount that was sent.  So I shouldn't need
> to keep subtracting the amount received in each iteration from the value
> I pass to recv().

You don't have to worry about receiving more than was sent, but you have
to worry about two things:

1. receiving less than was sent, data can be split into packets at any
boundary ... so at any point you might only get a single byte from recv().

2. receiving more than was sent in a single write, data can be combined
into packets at any boundary ... so at any point you might get more data
than you "want".

...if the literature you are reading doesn't talk about those two points,
you need better literature.

--
James Antill -- james@and.org
http://www.and.org/and-httpd/ -- $2,000 security guarantee
http://www.and.org/vstr/





[ Post a follow-up to this message ]



    Re: recv() sends extra garbage bytes  
chsalvia@gmail.com


View Ip Address Report This Message To A Moderator Edit/Delete Message


 
03-29-07 06:21 AM

Thanks for the example code.  Should the send() side do a similar loop?






[ Post a follow-up to this message ]



    Sponsored Links  




 





   All times are GMT. The time now is 07:57 AM.      Post New Thread    Post A Reply      
Pages (2): [1] 2 »   Last Thread   Next Thread Next


Most Popular forums 

Forum Jump:
Rate This Thread:

Forum Rules:
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is OFF
vB code is ON
Smilies are ON
[IMG] code is OFF
 
Medical and Health forum | Computer Games Reviews | Graphics design forum

Back To The Top
Home | Usercp | Faq | Register