potential pipe-lining corruption fix
| Brad Fitzpatrick 2006-09-20, 1:11 am |
Today at the MogileFS summit, Alan (from http://www.gaiaonline.com/)
mentioned that if he enables persistent backend connections, users get
responses that don't match their requests.... but not when he accesses the
site himself. And it takes a while to happen during low load.
All this pointed to errors handling malformed client requests, and sure
enough, I think this is it....
We don't disconnect users who do pipelining. I was able to sneak
multiple requests through to the backend, and then two responses would come
back, but they'd be assigned to different users.
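
The mix-up described above can be pictured with a toy model. This is only a sketch, not Perlbal code: the queue and the helpers `forward_to_backend` and `deliver_one` are hypothetical names made up for illustration, and the model assumes the proxy reads exactly one response per turn on a reused backend connection.

```perl
use strict;
use warnings;

# Toy model (assumption): a persistent backend connection behaves like a
# FIFO of responses that the backend has written but nobody has read yet.
my @backend_queue;

# Hypothetical helper: pass a client's bytes to the backend, which answers
# every request it finds in them, in order.
sub forward_to_backend {
    my ($user, @requests) = @_;
    push @backend_queue, map { "response to $user: $_" } @requests;
}

# Hypothetical helper: read exactly one response off the connection and hand
# it to whichever user currently owns that connection.
sub deliver_one {
    my ($user) = @_;
    my $resp = shift @backend_queue;
    return "$user receives [$resp]";
}

my @log;

# User A pipelines two requests; both slip through to the backend.
forward_to_backend('A', 'GET /a1', 'GET /a2');
push @log, deliver_one('A');

# The connection is kept alive and reused for user B, with A's second
# response still sitting at the front of the queue.
forward_to_backend('B', 'GET /b1');
push @log, deliver_one('B');

print "$_\n" for @log;
```

In this model user B is handed the response to A's second request, which is the kind of cross-user swap reported here. Without persistent backend connections the leftover response would be discarded along with the connection.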
This fix doesn't break the test suite, but before I commit, Alan --- can
you verify it fixes things for you?
Index: lib/Perlbal/ClientProxy.pm
===================================================================
--- lib/Perlbal/ClientProxy.pm (revision 565)
+++ lib/Perlbal/ClientProxy.pm (working copy)
@@ -584,13 +584,11 @@
# (see: Danga::Socket::read)
return $self->client_disconnected unless defined $bref;
- # if we got data that we weren't expecting, something's bogus with
- # our state machine (internal error)
- if (defined $remain && ! $remain) {
- my $blen = length($$bref);
- my $content = substr($$bref, 0, 80 < $blen ? 80 : $blen);
- Carp::cluck("INTERNAL ERROR: event_read called on when we're expecting no more bytes. len=$blen, content=[$content]\n");
- $self->close;
+ # if they didn't declare a content body length and we just got a
+ # readable event that's not a disconnect, something's messed up.
+ # they're overflowing us. disconnect!
+ if (! $remain) {
+ $self->close("over_wrote");
return;
}
- Brad
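
The difference between the removed and the added condition can be checked in isolation. A minimal sketch follows, assuming (from the patch comments) that `$remain` is undefined when the client declared no content body length and is 0 once a declared body has been fully read. The wrapper subs `old_guard` and `new_guard` are hypothetical names; only the two conditions come from the diff.

```perl
use strict;
use warnings;

# Condition from the removed lines: fires only when $remain is defined and zero.
sub old_guard {
    my ($remain) = @_;
    return (defined $remain && !$remain) ? 1 : 0;
}

# Condition from the added lines: fires when $remain is undef or zero.
sub new_guard {
    my ($remain) = @_;
    return (!$remain) ? 1 : 0;
}

# The three cases of interest (the labels are this sketch's reading of $remain).
my @cases = (
    [ undef, 'no declared body length'   ],
    [ 0,     'body fully read'           ],
    [ 512,   'body bytes still expected' ],
);

for my $case (@cases) {
    my ($remain, $label) = @$case;
    printf "%-28s old=%d new=%d\n", $label, old_guard($remain), new_guard($remain);
}
```

Under this reading the two conditions differ only in the undefined case: the old check let extra readable data through when no body length was declared, while the new one closes the connection.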
| dormando 2006-09-20, 1:11 am |
I can't screw with the site past 4:30pm, so first thing tomorrow morning
I'll patch (hopefully just one of?) the LBs and try it out.
Thanks!
-Alan
| Brad Fitzpatrick 2006-09-20, 1:11 am |
Thanks!
| dormando 2006-09-21, 1:11 pm |
Looks like that fixed it! We've had zero reports of page swapping /
image swapping since applying the patch. Site's running much snappier
during peak load with backend keepalives working.
I'll probably follow up with some information on exactly what kind of
client is hitting the disconnect call in the patch... Not going to have
time until next week.
-Alan / Dormando / whatever.
| Brad Fitzpatrick 2006-09-26, 7:11 pm |
On Fri, 22 Sep 2006, Jacques Marneweck wrote:
> On Thu Sep 21 16:43:16 UTC 2006, dormando <dormando@rydia.net> wrote:
>
> Alan, that's great news.
>
> Brad, what would the ETA for the next minor release of Perlbal be?
It's in svn, along with some new tests, but I'm not entirely happy with
its level of paranoia, safety-in-face-of-stupid-clients, and logging.
This week, though, if not later today.
>
> Regards
> --jm