Apache Server configuration support - Apache "Referer" not logged


thecoolone

2006-12-28, 7:35 am

I have Apache running and use Webalizer and AWStats to check the stats.
But when I manually checked my log files, I noticed that many entries
had "-" in place of a referer. And for most of the requests, the
HTTP status code was 200 followed by 206, or 200 followed by 304.
Here are some entries:
193.95.85.99 - - [26/Dec/2006:04:40:43 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 200 153446 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
193.95.85.99 - - [26/Dec/2006:04:40:50 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 206 134453 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
193.95.85.99 - - [26/Dec/2006:04:40:57 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 200 153446 "-" "contype"
193.95.85.99 - - [26/Dec/2006:04:40:59 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 206 153054 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

213.42.21.75 - - [26/Dec/2006:06:00:13 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 200 153446 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
213.42.21.75 - - [26/Dec/2006:06:00:18 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 200 153446 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
213.42.21.75 - - [26/Dec/2006:06:00:29 -0500] "GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 206 68629 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

Is this referer spam, or is a bot running? I cannot understand why the
same file is requested twice in succession (i.e. two GETs back to
back).

Any help is appreciated.

HansH

2007-01-06, 8:11 pm

"thecoolone" <jahan9@gmail.com> wrote in message
news:1167312340.750543.231650@79g2000cws.googlegroups.com...
> I have Apache running and use Webalizer and AWStats to check the stats.
> But when I manually checked my log files, I noticed that many entries
> had "-" in place of a referer.

The referer contains the URL of the page that linked to the requested
resource. Hence it is not available for links typed into the address bar
or selected from bookmarks. Further, most browsers have an option to
disable sending the referer.
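
To check how many entries really lack a referer, the log lines can be split into fields. The following is a minimal Python sketch, assuming the log uses Apache's "combined" layout as the entries quoted above appear to; the names `COMBINED` and `parse_line` are made up for this illustration and are not part of Apache, AWStats or Webalizer.

```python
import re

# Pattern for one line in the "combined" layout (assumed from the quoted entries):
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    # Apache logs "-" when the client sent no Referer header at all.
    entry["referer"] = None if entry["referer"] == "-" else entry["referer"]
    return entry

# First entry from the original post, joined onto one line.
sample = ('193.95.85.99 - - [26/Dec/2006:04:40:43 -0500] '
          '"GET /pdf/21-12-06_myfile.pdf HTTP/1.1" 200 153446 "-" '
          '"Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"')

entry = parse_line(sample)
print(entry["status"], entry["referer"])
```

A `None` referer here only says that the client did not send the header; it does not by itself point to spam or a bot.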

> And for most of the requests, the HTTP status code was 200 followed by
> 206, or 200 followed by 304.
> Here are some entries:
> 193.95.85.99 - - [26/Dec/2006:04:40:43 -0500] "GET
> /pdf/21-12-06_myfile.pdf HTTP/1.1" 200 153446 "-" "Mozilla/4.0
> (compatible; MSIE 5.5; Windows 98)"
> 193.95.85.99 - - [26/Dec/2006:04:40:50 -0500] "GET
> /pdf/21-12-06_myfile.pdf HTTP/1.1" 206 134453 "-" "Mozilla/4.0
> (compatible; MSIE 5.5; Windows 98)"
> Is this referer spam, or is a bot running?

> I cannot understand why the same file is requested twice in
> succession (i.e. two GETs back to back).

Typically, the browser and Acrobat Reader make successive requests.
Acrobat Reader requests parts ('byte ranges') of the file, so Apache
responds with 206 rather than 200. Repeated partial requests may occur
while the user reads the PDF document non-sequentially.

A 304 response is common for conditional requests that ask whether the
file has changed since the date of the cached copy; it indicates that no
newer version is available.
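
How a server ends up at 200, 206 or 304 can be sketched roughly as below. This is a simplified Python model, not Apache's actual logic: it assumes timestamps are plain epoch seconds, handles only a single `bytes=start-end` range, and ignores ETags, suffix ranges and multi-part ranges. The function name `choose_status` and the example offset are hypothetical.

```python
import re

def choose_status(file_size, last_modified, range_header=None, if_modified_since=None):
    """Return (status, body_bytes) roughly as a static-file server would."""
    # Conditional request: the cached copy is still current, so no body is sent.
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, 0
    # Range request: send only the requested slice of the file.
    if range_header is not None:
        m = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header)
        if m:
            start = int(m.group(1))
            if start >= file_size:
                return 416, 0  # range lies entirely past the end of the file
            end = int(m.group(2)) if m.group(2) else file_size - 1
            end = min(end, file_size - 1)
            if end >= start:
                return 206, end - start + 1
    # Plain request, or a range header this sketch does not understand.
    return 200, file_size

print(choose_status(153446, 1000))                                # full download
print(choose_status(153446, 1000, range_header="bytes=18993-"))   # made-up offset
print(choose_status(153446, 1000, if_modified_since=2000))        # cache is current
```

The byte count of a 206 depends on the range the client asked for, which is why the 206 lines in the log show different sizes for the same file.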

HansH



thecoolone

2007-01-06, 8:11 pm


HansH wrote:
> "thecoolone" <jahan9@gmail.com> wrote in message
> news:1167312340.750543.231650@79g2000cws.googlegroups.com...
> The referer contains the URL of the page that linked to the requested
> resource. Hence it is not available for links typed into the address bar
> or selected from bookmarks. Further, most browsers have an option to
> disable sending the referer.


Can you elaborate on what you mean by parent page? Suppose I have a blog
and post a link to the PDF file: does my blog become the parent page, or
is the parent page the "parent page" in the domain that contains the
PDF file?


>
> Typically, the browser and Acrobat Reader make successive requests.
> Acrobat Reader requests parts ('byte ranges') of the file, so Apache
> responds with 206 rather than 200. Repeated partial requests may occur
> while the user reads the PDF document non-sequentially.
>
> A 304 response is common for conditional requests that ask whether the
> file has changed since the date of the cached copy; it indicates that no
> newer version is available.


Thanks for the explanations.

thecoolone

2007-01-06, 8:11 pm

In addition to the question in my previous post, I wanted to ask: since
you mentioned that Acrobat and the browser request the PDF file
separately, do log analyzers like AWStats and Webalizer count the 206
and 304 HTTP codes as requests and so inflate the hit count of people
actually downloading/viewing the file?

Suppose the PDF file has 3 pages. A user first requests it through
the browser [HTTP CODE] (200, count=1), then Acrobat sends a request
again (200, count=2). After reading the first page, the user clicks
next page, so Acrobat requests again (206, count=3). Then the user
wants to read the last page and clicks next, and Acrobat requests
again (206, count=4). So do AWStats and Webalizer count downloads of
the single PDF file as 4 downloads, or as just 2 for the two 200
requests?

And does the same principle apply to the 304 code?

Thank you.

HansH

2007-01-06, 8:11 pm

"thecoolone" <jahan9@gmail.com> wrote in message
news:1167314677.067335.41970@79g2000cws.googlegroups.com...
> Can you elaborate on what you mean by parent page?

OK, the term 'parent page', which I just made up, is too close to 'parent folder'.

> Suppose I have a blog and post a link to the PDF file: does my blog
> become the parent page, or

Yes, the page viewed in the browser becomes the referrer for any page
(or image, or stylesheet, or whatever) it links to or embeds ...

> is the parent page the "parent page" in the domain that contains the PDF
> file?

... anything else is beyond the knowledge of the browser.

HansH



HansH

2007-01-06, 8:11 pm

"thecoolone" <jahan9@gmail.com> wrote in message
news:1167318702.904914.132120@n51g2000cwc.googlegroups.com...
> In addition to the question in my previous post, I wanted to ask: since
> you mentioned that Acrobat and the browser request the PDF file
> separately, do log analyzers like AWStats and Webalizer count the 206
> and 304 HTTP codes as requests and so inflate the hit count of people
> actually downloading/viewing the file?
>
> Suppose the PDF file has 3 pages. A user first requests it through
> the browser [HTTP CODE] (200, count=1), then Acrobat sends a request
> again (200, count=2). After reading the first page, the user clicks
> next page, so Acrobat requests again (206, count=3). Then the user
> wants to read the last page and clicks next, and Acrobat requests
> again (206, count=4).

The 206 will only occur if the user wants to read parts of the document
before the initial request has loaded them. Once the document is fully
loaded, no further requests are made.

> So do AWStats and Webalizer count downloads of the single PDF file
> as 4 downloads, or as just 2 for the two 200 requests?
>
> And does the same principle apply to the 304 code?
>
>

I took a quick look at the AWStats config and docs:
By default only the codes 200 and 304 are taken into account.
A unique visit ends after no new document has been requested for about
60 minutes. I assume multiple requests for the same document are
considered a single view until there is a pause of about 60 minutes too.
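
The counting rule described above can be tried out on the entries from the first post. The Python sketch below only models the assumption stated here (count 200 and 304, and treat repeated requests for the same file by the same address as one view until a pause of about 60 minutes); it is not taken from the AWStats source, and `COUNTED`, `WINDOW`, `ENTRIES` and `count_views` are names invented for the example.

```python
from datetime import datetime, timedelta

COUNTED = {200, 304}             # status codes taken into account (per the post above)
WINDOW = timedelta(minutes=60)   # assumed pause that separates two views

PDF = "/pdf/21-12-06_myfile.pdf"
# (ip, timestamp, path, status) taken from the log excerpt in the first post.
ENTRIES = [
    ("193.95.85.99", "26/Dec/2006:04:40:43 -0500", PDF, 200),
    ("193.95.85.99", "26/Dec/2006:04:40:50 -0500", PDF, 206),
    ("193.95.85.99", "26/Dec/2006:04:40:57 -0500", PDF, 200),
    ("193.95.85.99", "26/Dec/2006:04:40:59 -0500", PDF, 206),
    ("213.42.21.75", "26/Dec/2006:06:00:13 -0500", PDF, 200),
    ("213.42.21.75", "26/Dec/2006:06:00:18 -0500", PDF, 200),
    ("213.42.21.75", "26/Dec/2006:06:00:29 -0500", PDF, 206),
]

def count_views(entries):
    """Count views of each (ip, path) pair; entries must be in log order."""
    last_seen = {}
    views = 0
    for ip, stamp, path, status in entries:
        if status not in COUNTED:
            continue  # 206 partial responses are skipped entirely
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        previous = last_seen.get((ip, path))
        # A new view starts on the first request or after a long enough pause.
        if previous is None or when - previous > WINDOW:
            views += 1
        last_seen[(ip, path)] = when
    return views

raw_hits = sum(1 for entry in ENTRIES if entry[3] in COUNTED)
print(raw_hits, count_views(ENTRIES))
```

Under these assumptions the seven log lines come to four counted hits but only two views, one per client address.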


HansH



thecoolone

2007-01-06, 8:11 pm

Thanks a lot. You cleared up most of my doubts.


Copyright 2003 - 2009 webservertalk.com