Apache Server configuration support - rewrite or .htaccess? for filtering...

orbii

2007-12-09, 1:29 pm

hi, i've spent the last 2 days trying to make this work, but my background
with apache isn't all that strong. i'm just a simple vba guy at the moment,
learning php. any help is much appreciated.

i've installed apache for two purposes...

1 - to host all my php files while i test and learn to code

2 - to act as the web server that my hosts file redirects to (127.0.0.1)

you see, over the years i've been collecting a very big list of bad sites, e.g.
http://www.mvps.org/winhelp2002/hosts.htm

problem is, i'm not sure how to set up the httpd.conf file. i'm not even sure
if i should do it the rewrite way or the .htaccess way. i've read so much, and
the more i read the more confused i get. i tried the rewrite way, but it kept
looping over and over again, not to mention that it also rewrote requests for
my own files.

i guess anything that's redirected by the hosts file will most likely be
missing on my server and give a 404 error or something like that. is there a
way (which way, i should say) to redirect any and all missing files to their
respective blank files in a folder like /blankfiles/*.* ? the folder would
already have all the blank files i pre-created... is this even possible?
which would be the best way?

aloha and happy holiday,
orbii


shimmyshack

2007-12-10, 7:24 pm


you will love this, what you are trying to do is really achievable and
very cool.
first get yourself a copy of bind, or another DNS server for your
platform, so that you can import all your hosts files into a zone
file (so they reside in memory, not in a hosts file).
next set up your apache to have at least two virtual hosts.
the first can be called anything you like; I call mine
devnull
the second virtualhost should be called something nice like
webfilter.example.org
then you need to be running an http proxy - you can use the proxy
module in apache 2.2 or run a standalone one. (I would prefer apache)
the computer that the apache server is on should be getting its DNS
from the DNS server set up earlier, preloaded with your hosts file (you
can just use the hosts file until you get the DNS server up, but every
single request will mean reading through that hosts file, which puts
load on the machine and slows down your browsing A LOT - try a hosts
file of the order of 16MB rather than your 400KB)
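
A minimal sketch of the two virtual hosts described above, assuming Apache 2.2 name-based virtual hosting on port 80. The DocumentRoot paths are hypothetical; the point is only that the virtual host listed first acts as the default for any Host header that matches nothing else.

```apache
# Apache 2.2 syntax; Apache 2.4 no longer needs the NameVirtualHost line.
NameVirtualHost *:80

# Listed first, so it is the default: any request whose Host header matches
# no other ServerName/ServerAlias (i.e. every blocked host) lands here.
<VirtualHost *:80>
    ServerName devnull
    # Hypothetical path holding the "blocked" page or blank files
    DocumentRoot "/var/www/devnull"
</VirtualHost>

# The named host that serves proxy.pac and anything else you host yourself.
<VirtualHost *:80>
    ServerName webfilter.example.org
    # Hypothetical path
    DocumentRoot "/var/www/webfilter"
</VirtualHost>
```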

so let's just say for the sake of argument that you set apache up to
serve the correct mime type for a .pac file and create a proxy.pac
file with a couple of lines in it.
basically its FindProxyForURL function returns
PROXY webfilter.example.org:8080

or whatever port your proxy is listening on.
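
A rough sketch of the Apache side of this, assuming Apache 2.2 with mod_proxy and mod_proxy_http loaded and a LAN of 10.10.10.0/24 (an assumption, not something stated in the thread). The AddType line covers the .pac mime type; the rest is one possible way to run the forward proxy on port 8080.

```apache
# Serve proxy.pac with the MIME type browsers expect for auto-config files.
AddType application/x-ns-proxy-autoconfig .pac

# Forward proxy on port 8080 (the port is whatever your proxy.pac names).
Listen 8080
<VirtualHost *:8080>
    ProxyRequests On

    # Only let machines on the (assumed) LAN range use the proxy.
    <Proxy *>
        Order deny,allow
        Deny from all
        Allow from 10.10.10.0/24
    </Proxy>
</VirtualHost>
```

The Allow line matters: with ProxyRequests On and no access restriction, the server is an open proxy for anyone who can reach it.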

the proxy.pac file is set in firefox / IE etc. as the automatic proxy
configuration URL, like

http://webfilter.example.org/proxy.pac

that's port 80 of course. your browser then doesn't do any DNS for the
sites it visits; it forwards each request on to webfilter.example.org:8080.
the DNS is done there, and of course you have a load of hosts file
entries pointing to _IP_ADDRESS_OF_SERVER
so not 127.0.0.1 but 10.10.10.3 or whatever the actual ip address is
(so that remote computers on your lan can connect)

then when the proxy tries for a banned host, the name resolves to the
local server, and because that host matches no virtualhost, apache
defaults to the first virtualhost section!
phew. anyway, all you therefore need is a single error document
(written in php) which returns something like

// escape the host name before echoing it back into the page
$host = $_SERVER['HTTP_HOST'];
echo '<h1>'.htmlspecialchars($host).'</h1>';
echo '<p>has been blocked</p>';
echo '<p>if you think it\'s an error then contact us to <a href="mailto:admin@example.org?subject=unblock%20'.
rawurlencode($host).'">unblock</a></p>';


you get the point: for all hosts in your DNS server, you will get a
nice styled "blocked" page; for all non-html requests your php can detect
the extension of the requested file and serve the right mimetype.

I mean it really works. you can add authentication with basic auth,
and only allow requests if authenticated.
You can turn certain types of banned site on and off using time-based
apache directives.
use urlblocklist for a really decent set of hosts, regular expressions
and urls.

using netfiltering you can add bandwidth shaping and mac address
filtering, so that you create a basic captive portal where people can
register mac addresses with a mysql database or else they are refused.
You can even pre-trust the server's public key in your browser, so you
can proxy https stuff as well without warnings.
You can log all requests in mysql so you can track your users.

OR.... you could just use squid and be done with it.
shimmyshack

2007-12-10, 7:24 pm



i got all enthusiastic with my reply, sorry!
in fact you can do it your way, which is really basic, and yeah you can
simply pre-create some files, or do what i did using php

//find the extension by parsing off the stuff after the last dot
$path = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
//map the extension onto a mimetype using a switch statement
switch($ext)
{
case 'pdf' : $mime = 'application/pdf'; break;
case 'htm' : $mime = 'text/html'; break;
default : $mime = 'text/plain'; break;
}
//send the Content-Type header, plus an empty body
header( 'Content-Type: ' . $mime );
header( 'Content-Length: 0' );

this way you don't have to create the files. Though you will have to
send a Content-Length of 0, and test to see whether it makes pages hang
waiting for a response.
once again though, my suggestion would be to use a default (first)
virtualhost. your banned hosts won't match any other virtualhost,
so the first will be used to serve your stuff. an
ErrorDocument pointing to 404.php should be used to match all requests,
no matter what the URL is for that blocked host.
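
As a sketch of that suggestion, assuming the default virtual host is the devnull one and that 404.php sits in its (hypothetical) document root:

```apache
<VirtualHost *:80>
    ServerName devnull
    # Hypothetical path; 404.php lives directly inside it
    DocumentRoot "/var/www/devnull"

    # A local URL-path is handled by an internal redirect, so the browser
    # is not sent to a new URL. Apache also sets REDIRECT_URL for the
    # handler, holding the path that was originally requested.
    ErrorDocument 404 /404.php
</VirtualHost>
```

Worth testing: the response normally keeps its 404 status unless the script sets a different one, which may matter for how browsers treat blocked images and scripts.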
orbii

2007-12-12, 1:28 pm


ok i've got to say this is GOLD!!! man, i have to give this a try
definitely!!! it will take a while to read up on all of this and set it up
right, but man, this is awesome. and yeah, the hosts way is slowing my
machine down like mad. you have no idea how excited i am right now.
who would have thought a proxy vhost would do the trick... this goes to show
how little i know. thank you so much. hope you don't mind me adding you to
my address book; i might bug you in the future for directions if possible

aloha, orbii


orbii

2007-12-12, 1:28 pm


here's what i figured out over the last day or so when there wasn't any
response.... it took me half a day of reading to figure out how to begin using
regular expressions, and i didn't know there's a billion languages out there.
but i got it worked out... this successful venture has given me enough "WOW
it works" to try a lot of other things now. i might pick up a $50 box,
see if i can load linux on it and give all this FUN stuff a serious try, doing
it the best way possible. heh, i think you've found yourself some new blood!

-------
<Directory />
Options +FollowSymLinks
AllowOverride None
Order deny,allow
Deny from all
Satisfy all

RewriteEngine On

RewriteBase /

# only rewrite requests for files and directories that don't exist
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

## Shorter
RewriteRule . /blank/_.htm [L]

## Longer
#RewriteRule \.(css|flv|gif|jpe?g|js|php[3-5]?|p?html?|png|asp|chm|swf)$ /blank/_.$1 [L]

</Directory>
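
A sketch of how the shorter and longer rules could be combined, assuming a hypothetical document root of /var/www/devnull and that a blank file exists in /blank/ for every extension listed (including both _.jpg and _.jpeg, since $1 is whatever text matched). Access control is assumed to be configured elsewhere.

```apache
<Directory "/var/www/devnull">
    Options +FollowSymLinks
    RewriteEngine On
    RewriteBase /

    # RewriteCond lines apply only to the RewriteRule that follows them,
    # so the "does not exist" checks are repeated for each rule.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule \.(css|gif|jpe?g|js|png|swf)$ /blank/_.$1 [L]

    # Anything else that is missing, including URLs with no extension,
    # falls back to the blank HTML page.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /blank/_.htm [L]
</Directory>
```

Because the blank files themselves exist, the !-f condition fails once a request has been rewritten to one of them, which is what keeps the rules from looping.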


shimmyshack

2007-12-13, 1:53 am



yes, i bet you did exactly what i did and wrote that regular
expression to capture all the possible files that an advertiser might
send you, and then you are going to just rewrite them to a blank
document, which might be a php script or just a small empty
document you made.
but remember the ErrorDocument directive.
that says: if it ain't there, then send 404.php
I can see you want it to be invisible, with no redirects, which is why
you're using an internal rewrite ([L] with no [R]),
but ErrorDocument should be "invisible" as well. No redirect, so no
extra hit. (I think; try it and see)
no more regular expressions needed, which might not catch something of
the form

/blah/de/blah/blahingtonblahblahs

there's all kinds of crap they use (no extension). some use this to serve
flash, using something like your rewrite with an [L] flag to rewrite to
a script that generates the flash and takes care of the tracking.

anyway, try the ErrorDocument. It should not refresh the url if the
right response header is sent. and no, i don't mind you adding me. Check out
DansGuardian, the URLBlock website, and Squid (set to not cache,
because the web is dynamic these days and you can do without that
headache!)


orbii

2007-12-13, 1:53 am


good god, if you're a woman i'd marry you! LOL this is great stuff... omg so
much to learn, heh.

as for the errordoc, it seems to add a bit of extra overhead, not sure how to
explain it. i'm still messing around with it, trying to find the best and
cleanest way to do this.



Copyright 2003 - 2008 webservertalk.com