preventing DOS when serving up large files
    preventing DOS when serving up large files  
Ben Crowell


01-23-07 12:22 AM

I have a server that has some large PDF files on it (up to 15 MB). I
make the files available in smaller, 50-page chunks, which seems to be
more convenient for most users, but some people really do want an entire
book as one huge PDF file. This generally hasn't been a problem over the
last few years. However, last night I found my server dead in the water,
not responding to my http requests, and just barely responding to me
when I ssh'd in. The log file looked like this:

59.78.2.1 - - [21/Jan/2007:19:35:37 +0000] "GET /bk1.pdf HTTP/1.1" 200 40960
59.78.2.1 - - [21/Jan/2007:19:35:38 +0000] "GET /bk1.pdf HTTP/1.1" 200 32768
59.78.2.1 - - [21/Jan/2007:19:35:40 +0000] "GET /bk4.pdf HTTP/1.1" 206 139264
59.78.2.1 - - [21/Jan/2007:19:35:41 +0000] "GET /bk3.pdf HTTP/1.1" 206 40960
59.78.2.1 - - [21/Jan/2007:19:35:42 +0000] "GET /bk2.pdf HTTP/1.1" 200 40960
59.78.2.1 - - [21/Jan/2007:19:35:44 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768
59.78.2.1 - - [21/Jan/2007:19:35:45 +0000] "GET /bk2.pdf HTTP/1.1" 200 40960
59.78.2.1 - - [21/Jan/2007:19:35:46 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768
59.78.2.1 - - [21/Jan/2007:19:35:47 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768

I had about 200 apache child processes running. (MaxClients is set to
150, but I guess apache doesn't feel too constrained by that?) I'm
running Apache 1.3.

I'm not sure if this was actually a DOS attack, or just someone's poorly
written bot. I have mod_evasive installed, and normally it seems to work
well, but in this case it didn't seem to kick in; /var/log/messages
shows the IP being blacklisted, but only after I had actually worked
around the attack by denying access to the IP in my httpd.conf. Maybe
there is something in mod_evasive's algorithm that makes it not trigger
on this particular situation? Here is the relevant part of my config:
<IfModule mod_evasive.c>
DOSHashTableSize 3097
DOSPageCount 2
DOSSiteCount 50
DOSPageInterval 1
DOSSiteInterval 1
DOSBlockingPeriod 10
</IfModule>
(After I started sending back 403 responses to this IP, their
script kept pounding away with the same request, until I finally
got a chance today to ask my web host to block it at the router.)

Is there anything I can do that will make my apache configuration
deal more gracefully, in a fully automated way, with this situation?
AFAICT, the problem was that apache had as many child processes going
as it was willing to run, and since all of those were occupied with
responding to this script kiddie, it wasn't able to respond to other
requests. I imagine that raising MaxClients won't help, since one user
could still start enough processes to max me out. I could use
mod_bandwidth, but that doesn't seem like it would help either, since
their script doesn't actually seem to have been sucking down any more
packets after receiving the first one.

TIA!








    Re: preventing DOS when serving up large files  
Purl Gurl


01-23-07 06:32 AM

Ben Crowell wrote:

> 59.78.2.1 - - [21/Jan/2007:19:35:37 +0000] "GET /bk1.pdf HTTP/1.1" 200 40960
> 59.78.2.1 - - [21/Jan/2007:19:35:38 +0000] "GET /bk1.pdf HTTP/1.1" 200 32768
> 59.78.2.1 - - [21/Jan/2007:19:35:40 +0000] "GET /bk4.pdf HTTP/1.1" 206 139264
> 59.78.2.1 - - [21/Jan/2007:19:35:41 +0000] "GET /bk3.pdf HTTP/1.1" 206 40960
> 59.78.2.1 - - [21/Jan/2007:19:35:42 +0000] "GET /bk2.pdf HTTP/1.1" 200 40960
> 59.78.2.1 - - [21/Jan/2007:19:35:44 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768
> 59.78.2.1 - - [21/Jan/2007:19:35:45 +0000] "GET /bk2.pdf HTTP/1.1" 200 40960
> 59.78.2.1 - - [21/Jan/2007:19:35:46 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768
> 59.78.2.1 - - [21/Jan/2007:19:35:47 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768

Average one hit per second, same page request average, one hit per second.

> I had about 200 apache child processes running. (MaxClients is set to
> 150, but I guess apache doesn't feel too constrained by that?) I'm
> running Apache 1.3.

> I'm not sure if this was actually a DOS attack, or just someone's poorly
> written bot. [...]
>
> <IfModule mod_evasive.c>
> DOSHashTableSize 3097
> DOSPageCount 2
> DOSSiteCount 50
> DOSPageInterval 1
> DOSSiteInterval 1
> DOSBlockingPeriod 10
> </IfModule>

Child processes are your true problem. These forked processes consume
CPU time and RAM. There are some bug reports on child processes
exceeding the maximum limit, but this problem appears to have been patched
out; there are not a lot of reports of it for more recent 1.3.x versions.

You need to verify whether your Apache is really ignoring MaxClients. This
is usually not a problem.

Reduce your KeepAliveTimeout to five (5) seconds.
Reduce your MaxKeepAliveRequests according to actual need, maybe 100.
Reduce your MaxRequestsPerChild to one-half of your current setting.
Reduce your Timeout to one-half of your current setting.

Set your MaxClients experimentally, 120 to 360, discover what happens.

Challenge here is those settings need to be fine tuned to your average
load demand. An example is if many of your static html pages contain
a lot of graphics, your keep alive might need to be set higher to allow
ample time for loading all graphics.

There are no average settings, default settings, which can be applied
to all servers. You must consider your average load, then fine tune
your settings to handle your load without dropped connections or
other problems. Determine the minimum settings you can use just
before problems begin, then increase all settings ten percent
of current numerical value.

On mod_evasive, looking at your sample log record, this client from
China did not trigger your evasive module; he was within limits.
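
To make "within limits" concrete, here is a rough Python sketch that replays the request times from the log excerpt against a simplified counter. The counting rule (hits less than the interval apart) is an assumption made for illustration and is not mod_evasive's actual code; the function names are hypothetical, and the access log only has one-second resolution.

```python
from datetime import datetime

# Request times and paths copied from the access log excerpt in the first post.
LOG = [
    ("19:35:37", "/bk1.pdf"),
    ("19:35:38", "/bk1.pdf"),
    ("19:35:40", "/bk4.pdf"),
    ("19:35:41", "/bk3.pdf"),
    ("19:35:42", "/bk2.pdf"),
    ("19:35:44", "/bk2.pdf"),
    ("19:35:45", "/bk2.pdf"),
    ("19:35:46", "/bk2.pdf"),
    ("19:35:47", "/bk2.pdf"),
]


def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def max_hits_in_window(times, interval):
    """Largest number of hits inside any window shorter than `interval` seconds."""
    times = sorted(times)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Move the window start forward until the window spans less than `interval`.
        while t - times[start] >= interval:
            start += 1
        best = max(best, end - start + 1)
    return best


def peak_page_hits(log, interval):
    """Peak per-URI hit count (simplified stand-in for the DOSPageCount check)."""
    by_uri = {}
    for hms, uri in log:
        by_uri.setdefault(uri, []).append(to_seconds(hms))
    return max(max_hits_in_window(ts, interval) for ts in by_uri.values())


def peak_site_hits(log, interval):
    """Peak hit count over all URIs (simplified stand-in for the DOSSiteCount check)."""
    return max_hits_in_window([to_seconds(hms) for hms, _ in log], interval)


if __name__ == "__main__":
    print("peak page hits:", peak_page_hits(LOG, 1))
    print("peak site hits:", peak_site_hits(LOG, 1))
```

Under this simplified model, both peaks come out as 1 for the excerpt, which is below DOSPageCount 2 and far below DOSSiteCount 50. The excerpt is only a sample of the log, so the real traffic may have looked different.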

These would be _very_ tight settings,

DOSPageCount 1
DOSSiteCount 5
DOSPageInterval 1
DOSSiteInterval 1

One page per second, five connections per second.

You can experiment, discover results. Some spider bots might
be knocked out, possibly some clients with high speed broadband
will be knocked out.

You can set,

DOSBlockingPeriod 1

This would keep innocent clients from being blocked for a long time.

Same challenge here. Your evasive settings need to be fine tuned
according to your average server load. Low load, tight settings.
High load, generous settings. Too tight of settings can cause
more problems than a DOS attack; be careful.

There are a number of free "stress tester" programs out there.
Software like this will hit your server hard with requests so you
can observe how much stress your server can handle. Stress testers
will allow you to test your settings in a short period of time,
maybe late at night, or on a Sunday. You can test at a time your
server load is minimal; less disruptive to clients.

A neat trick on this is to write multiple httpd.conf files, each
with different settings, ranging from tight to generous. Rather
than editing the same conf file, you simply rename your conf file
in use, then rename a new conf file to test. Perform a hard restart
or a soft restart to load the new conf file. You are simply plugging
in a series of conf files for testing; quick and easy.
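
A small Python sketch of that conf-swapping routine, under a few assumptions: the candidate file names are made up, and the script copies the candidate over the active file (keeping a backup) instead of renaming, so each candidate stays available for the next test run.

```python
import shutil
import tempfile
from pathlib import Path


def activate_conf(conf_dir, candidate, active="httpd.conf"):
    """Make `candidate` the active config, keeping the previous one as a backup.

    Returns the path of the backup copy, or None if there was no active file.
    """
    conf_dir = Path(conf_dir)
    source = conf_dir / candidate
    if not source.is_file():
        raise FileNotFoundError(source)

    target = conf_dir / active
    backup = None
    if target.exists():
        backup = conf_dir / (active + ".prev")
        shutil.copy2(target, backup)

    # Copy rather than rename so the candidate file survives for later runs.
    shutil.copy2(source, target)
    return backup


if __name__ == "__main__":
    # Demonstration in a throwaway directory; file names and values are made up.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "httpd.conf").write_text("MaxClients 150\n")
        (tmp / "httpd.conf.tight").write_text("MaxClients 120\n")
        activate_conf(tmp, "httpd.conf.tight")
        print((tmp / "httpd.conf").read_text(), end="")
```

The sketch deliberately stops short of restarting the server; that step is still done by hand (for example with apachectl graceful, assuming apachectl is installed and on the PATH).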

As for this client, personally, I would block him in .htaccess or
in your conf file, for good; never allow that address access.
This solves your problem instantly. Odds are not many over in China
are visiting your site; no harm done. Servers out of China represent
a large percentage of problem servers. My habit is simply to block
access and not worry about this.

Your absolute best cure for DOS attacks, which are infrequent these days,
is a firmware firewall. A hardware firewall between your server and the
internet is the best method and so easy to use.

Block the offending address and stop worrying! Kick him out and leave your
server alone.

Purl Gurl












    Re: preventing DOS when serving up large files  
Ben Crowell


01-23-07 06:32 AM

Wow, you really put a lot of time and work into that reply --
thanks!








    Re: preventing DOS when serving up large files  
Purl Gurl


01-23-07 06:26 PM

Ben Crowell wrote:

> Wow, you really put a lot of time and work into that reply --
> thanks!

De nada. Readers benefit more when ample discussion is afforded.

My experience is a large majority of Apache problems are attributed
to user configuration. Apache is relatively bug free, well, sorta.

Best and most stable versions of Apache are 1.3.2x through 1.3.26 versions.

Version 1.3.27 and up, have a handful of very serious bugs.

The 2.x versions are simply too buggy to be trusted. Those versions
contain too many whistles and bells which create bugs. These 2.x
versions are large, cumbersome, slow and would contribute to this
problem you note; excess CPU and RAM usage.

I have not tested the most recent version of the 2.x series. Might be
most bugs have been patched out. Many people are happy with those versions.

Readers should note I am highly biased to sleek, trim and efficient
programming. I really do not like whistles and bells; all business,
no nonsense.

I must stress a point; almost all Apache problems are attributable
to user configuration.

Returning to your DOS Evasive module, I am not so sure you really need
this module, these days. DOS attacks still happen but are becoming rare.
Years back, we read about DOS attacks almost daily, back when these
attacks were popular amongst pimple faced teenage idiot boys. Law
enforcement actions and improvements in software, have fairly much
eliminated this popularity of DOS attacks.

My personal opinion is that the best defense is to run Apache as efficiently
as possible. This means turning off all modules save for the bare bones.
Another defense is to have a modern machine, one with a gigahertz
or better CPU speed and a couple of gigabytes of memory.  A good
machine will handle circumstances like those you experienced, which were
not a DOS attack, just "something" really stupid.

Block the server and return a 403 forbidden message. This is
a very efficient method. In a month or two, remove the block
and discover if the problem server has given up on your server.

The most noted problem I observe these days is email spammers
looking for proxy servers, and Chinese looking for a proxy server.
Even so, this is not a major problem.

Email spammers, I block. Searches for proxy servers from China,
I allow because of social concerns. Chinese looking for proxy
servers are simply trying to escape government restrictions
on internet access. I cannot fault the Chinese for this.

There is a number one major problem; spam email. Estimates are
ninety to ninety-five percent of all email is spam. Our email
server is clobbered by spammers hundreds of times per day.
Keeping ahead of email spammers is a very serious problem.
However, this is not an Apache problem.

This returns to my comment about a firmware firewall. This is
the best method; plug in a firewall between your servers and the
internet. You can buy a good used firewall through eBay for
a decent price.

Apache is not written to be a firewall. Adding modules is ok,
but there are limitations, as you discovered. A hardware firewall
is dedicated to this task of just such; being a firewall. Years
back I used our old Netscreen to prevent a variety of attacks,
including DOS attacks. This is not a problem today so I removed
our firewall. Now our firewall is back into the system to block
email spamming servers. This is a big problem today.

Advantage here is a firmware firewall will block attacks and
servers before entering your local system. Apache never "sees"
those problems, your email and dns servers never "see" those
problems. All "bad" traffic is halted at the firewall allowing
your servers to run smoothly and efficiently.

A quick and very good cure for almost all common problems
related to viruses, trojans and such, is simply to install
an inexpensive router, such as an older Linksys. Does not
matter if you have one computer or four computers. A router
works wonders for preventing a lot of problems, especially
by being able to block sensitive incoming "port" connections.
You can buy a brand new discontinued Linksys BEFSR41 router
for under twenty-five bucks through eBay. Works great.

For your circumstances, I would trim down Apache to bare bones,
use tight settings, block problem servers, and let it go at that.
Then look at adding a firewall or, at least, adding a router.

Purl Gurl










    Re: preventing DOS when serving up large files  
Jim Hayter


01-23-07 06:26 PM

Purl Gurl wrote:

<snip>

>
> Block the server and return a 403 forbidden message. This is
> a very efficient method. In a month or two, remove the block
> and discover if the problem server has given up on your server.
>

For business reasons, we went to a simple 403 error page that explains
that the requester's IP is blocked and provides a link to send an email
if they would like it removed.  It amazes me how few IPs that I block
ever result in a request to remove the block.  I track the blocks I
remove so if I block it again I can note it is the second time.  Getting
out of that block is *much* harder.

<snip>

Jim








    Re: preventing DOS when serving up large files  
Ben Crowell


01-24-07 06:55 AM

Jim Hayter wrote:
> For business reasons, we went to a simple 403 error page that explains
> that the requester's IP is blocked and provides a link to send an email
> if they would like it removed.  It amazes me how few IPs that I block
> ever result in a request to remove the block.

I'm guessing that a lot of these are spammers who write scripts to
search for e-mail addresses, blogs and wikis to spam, etc. The spammer
doesn't intend it to be a DOS, but the script is written in a clueless
way that has that effect. He's probably running it from a machine with
a temporary DHCP-assigned address, and the machine may be a zombie.
The next person to be assigned the IP is unlikely to hit your site,
so he won't notice the block, and even if he did, he wouldn't think
to complain to his ISP.








    Re: preventing DOS when serving up large files  
shimmyshack


01-24-07 06:56 AM



On 23 Jan, 00:48, Ben Crowell <crowel...@lightSPAMandISmatterEVIL.com>
wrote:
<snip>


A few simple things you can do for this kind of bot:
a) Use a 2nd computer somewhere to keep an eye on your Apache server
for you, tailing the last few lines of the access and error logs and
emailing you if the initial connection time goes beyond a threshold.
b) In that email, provide an HTML link to a script that is able to
modify your .htaccess file (if you have that turned on) to block
specific problem IPs if you don't like what the requests look like. Set
the script to only allow connections from trusted IPs or to have a
simple login. Consider, as was previously said, a more personal message
allowing the user to get unblocked (if they are human); you can then use
the same email->link->.htaccess modifier to unblock them.
c) Have a .htaccess rewrite set up for all files over a certain size
that you can switch on and off with a script, which rewrites those files
to a content distribution network like Coral. This prevents
slashdotting and other DOS-type problems.
d) If you would prefer to keep tighter control over simultaneous
connection limits, consider using a server-side script as a gateway for
those files; if the connection limit is exceeded, the script sends 206
headers back instead of data. There are also Apache 1.3.x modules that
can help with this, the names of which I unhelpfully temporarily forget
(mod_bw or mod_bandwidth, perhaps:
http://www.cohprog.com/v3/bandwidth/doc-en.html).
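
As a rough illustration of the log-watching idea in (a), the Python sketch below counts hits per client in the last lines of an access log and reports the busy ones. It is a variation on the suggestion, not the same thing: it looks at request counts instead of connection time, it leaves out the emailing step, and the function names and threshold are assumptions for illustration.

```python
import re
from collections import Counter

# Start of a Common Log Format line: client IP, ident, user, [time], "METHOD path".
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)')


def noisy_clients(lines, threshold):
    """Return {ip: hits} for clients with at least `threshold` hits in `lines`."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            hits[match.group(1)] += 1
    return {ip: n for ip, n in hits.items() if n >= threshold}


def tail(path, count):
    """Return the last `count` lines of a text file.

    Reads the whole file, which is fine for a sketch but slow on huge logs.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = handle.readlines()
    return lines[-count:] if count > 0 else []


if __name__ == "__main__":
    sample = [
        '59.78.2.1 - - [21/Jan/2007:19:35:45 +0000] "GET /bk2.pdf HTTP/1.1" 200 40960',
        '59.78.2.1 - - [21/Jan/2007:19:35:46 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768',
        '59.78.2.1 - - [21/Jan/2007:19:35:47 +0000] "GET /bk2.pdf HTTP/1.1" 200 32768',
    ]
    print(noisy_clients(sample, threshold=3))
```

On a real server this would be run periodically against tail("/path/to/access_log", 200) or similar, where both the path and the line count are placeholders to adjust.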

For your specific problem with large PDFs, have you considered
linearizing them, so that they can load page by page? You can then
allow users to view any page they wish simply by making links of the
form:
http://server.com/path_to/file.pdf#page=386
(from what I remember it work when opening a local filesystem pdf in
the url of a browser though)
This requires that you know what page they want, or that you allow your
users to search. You would burst and dump [tools available from
http://www.pdfhacks.com] the uncompressed contents of the PDF into a
database, page by page (or into flat files, of course). You can then allow
searching for keywords and generate a list of best-match pages with
links to those PDFs. Then, I guess, an auto-overflow DIV at the top of
the page with the results, followed by an embedded PDF, so the user can
jump from result to result using the top scrolling DIV to control the
PDF. This works because Apache can serve 206 partial content, and
because from (approximately!!) around PDF1.3 2003/4 the leading PDF
viewer supports this functionality, even from within the PDF itself.
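
For readers unfamiliar with 206 responses: the viewer asks for a slice of the file with a Range header, and the server answers with just those bytes plus a Content-Range header. The Python sketch below shows how a single byte range resolves against a file size. It only illustrates the HTTP mechanics; it is not how Apache implements them, and the function names are hypothetical.

```python
def parse_byte_range(header, size):
    """Resolve a single-range Range header value against a file of `size` bytes.

    Returns (first, last) as inclusive byte offsets, or None if the range is
    unsatisfiable or not understood by this sketch.
    """
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # only single byte ranges are handled here
    first_text, dash, last_text = spec.strip().partition("-")
    if not dash:
        return None
    try:
        if first_text == "":
            # Suffix form "bytes=-N": the final N bytes of the file.
            length = int(last_text)
            if length <= 0 or size == 0:
                return None
            return max(size - length, 0), size - 1
        first = int(first_text)
        last = int(last_text) if last_text else size - 1
    except ValueError:
        return None
    if first >= size or first > last:
        return None
    return first, min(last, size - 1)


def content_range(first, last, size):
    """Build the Content-Range value sent along with a 206 response."""
    return "bytes %d-%d/%d" % (first, last, size)


if __name__ == "__main__":
    size = 15000000  # made-up file size, roughly the 15 MB mentioned in the thread
    first, last = parse_byte_range("bytes=0-40959", size)
    print(content_range(first, last, size))
```

A linearized PDF lets the viewer fetch the pages it needs with a handful of such requests instead of downloading the whole file.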









    Re: preventing DOS when serving up large files  
shimmyshack


01-24-07 06:56 AM



On 24 Jan, 05:26, "shimmyshack" <matt.fa...@gmail.com> wrote:
<snip>
> http://server.com/path_to/file.pdf#page=386
> (from what I remember it work when opening a local filesystem pdf in
> the url of a browser though)
<snip>

sorry: should have been

http://server.com/path_to/file.pdf#page=386
(from what I remember it DOES NOT work when opening a local filesystem
pdf in the url of a browser though)
don't you just hate that









    Re: preventing DOS when serving up large files  
Ben Crowell


01-25-07 12:21 AM

shimmyshack wrote:
> b) (In that email provide an html link) to a script that is able to
> modify your .htaccess file (if you have that turned on) to block
> specific problem IPs if you don't like what the requests look like, set
> the script to only allow connections from trusted IPs or to have a
> simple login. Consider as was previously said a more personal message
> allowing the user to get unblocked (if they are human) you can then use
> the same email->link->.htaccess modifier to unblock them

Aha! This sounds like pretty much what I want to do. I had been thinking
of doing something like this, but hadn't realized that .htaccess was
a way to do it. Thanks!
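
A minimal Python sketch of the .htaccess-editing part of that idea. It assumes the Apache 1.3 access directives (Order, Allow, Deny) are permitted in .htaccess on the server, which normally requires AllowOverride Limit. The marker comments and function names are made up for illustration, and the web front end with its login or trusted-IP check is left out.

```python
import ipaddress

# Marker comments delimiting the part of .htaccess that this script owns.
BEGIN = "# BEGIN auto-block (managed by script)"
END = "# END auto-block"


def blocked_ips(htaccess_text):
    """Return the set of IPs listed inside the managed section."""
    ips, inside = set(), False
    for line in htaccess_text.splitlines():
        stripped = line.strip()
        if stripped == BEGIN:
            inside = True
        elif stripped == END:
            inside = False
        elif inside and stripped.startswith("Deny from "):
            ips.add(stripped[len("Deny from "):])
    return ips


def with_blocked_ips(htaccess_text, ips):
    """Return new .htaccess text whose managed section denies exactly `ips`."""
    kept, inside = [], False
    for line in htaccess_text.splitlines():
        stripped = line.strip()
        if stripped == BEGIN:
            inside = True
        elif stripped == END:
            inside = False
        elif not inside:
            kept.append(line)  # everything outside the markers is left untouched
    section = [BEGIN, "Order Allow,Deny", "Allow from all"]
    section += ["Deny from %s" % ip for ip in sorted(ips)]
    section.append(END)
    return "\n".join(kept + section) + "\n"


def block(htaccess_text, ip):
    """Add `ip` to the managed section; raises ValueError if it is not an IP."""
    ipaddress.ip_address(ip)
    return with_blocked_ips(htaccess_text, blocked_ips(htaccess_text) | {ip})


def unblock(htaccess_text, ip):
    """Remove `ip` from the managed section if it is present."""
    return with_blocked_ips(htaccess_text, blocked_ips(htaccess_text) - {ip})


if __name__ == "__main__":
    print(block("", "59.78.2.1"), end="")
```

The functions work on text only, so the calling script decides how to read and write the real file. Validating the address before writing it matters here, because the value would come from a web request.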








All times are GMT.