[ic] Googlebot Getting 500 Errors ... but he's the only one

Jonathon Sim jonathonsim at zeald.com
Sun Jun 5 02:04:01 EDT 2005


On 8:31:19 am 06/05/05 Bryan Gmyrek <bryangmyrek at yahoo.com> wrote:
>
> Right after I upgraded to 1.32 with the patch from Jonathan the 500
> errors seem to have turned to 304 errors (see the middle of the log
> below): 
304 isn't strictly an error (it's "Not Modified"). What it means is that
Apache compared the If-Modified-Since header from Google with the
Last-Modified header from Interchange and decided the page hasn't been
modified since that time, so Google will not respider the page this time
around. That is probably not a good thing (unless some magic code somewhere
is determining that the page really *isn't* modified).

NB: I don't think Interchange is returning a Last-Modified header at all;
I'm guessing Apache treats the absence of one as "Not Modified".
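
For reference, here is a minimal sketch of the comparison I would expect a
server to make for a conditional GET. It is my assumption about correct
behaviour, not what Apache or mod_interchange actually does.
parse_http_date and conditional_status are hypothetical helper names, and
only the fixed "Sun, 05 Jun 2005 05:30:00 GMT" date format is handled. The
point is that a missing Last-Modified should result in a 200, not a 304.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Parse a date like "Sun, 05 Jun 2005 05:30:00 GMT" into epoch seconds.
# Returns undef if the string is missing or not in that format.
sub parse_http_date {
    my ($str) = @_;
    return unless defined $str;
    my ($d, $mon, $y, $h, $m, $s) =
        $str =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/
        or return;
    return unless exists $MONTH{$mon};
    return timegm($s, $m, $h, $d, $MONTH{$mon}, $y);
}

# Decide the status for a conditional GET (assumed logic, see above):
# 304 only when both dates are known and the page is not newer than
# the date the client sent; otherwise send the full page with a 200.
sub conditional_status {
    my ($if_modified_since, $last_modified) = @_;
    my $ims = parse_http_date($if_modified_since);
    my $lm  = parse_http_date($last_modified);
    return 200 unless defined $ims && defined $lm;
    return $lm <= $ims ? 304 : 200;
}

# Page changed at 05:30, client last saw it at 05:25
print conditional_status('Sun, 05 Jun 2005 05:25:00 GMT',
                         'Sun, 05 Jun 2005 05:30:00 GMT'), "\n";   # 200
# Page changed at 05:30, client last saw it at 05:44:31
print conditional_status('Sun, 05 Jun 2005 05:44:31 GMT',
                         'Sun, 05 Jun 2005 05:30:00 GMT'), "\n";   # 304
# No Last-Modified header at all
print conditional_status('Sun, 05 Jun 2005 05:44:31 GMT', undef), "\n";  # 200
```

If that is right, a 304 for a page that sends no Last-Modified header
points at a bug somewhere between Apache and Interchange.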

I have modified my test script to be a bit smarter: you can now give it an
If-Modified-Since date on the command line, otherwise it uses the current
date (rather than a hard-coded date from when I was testing this last
year!). I can't get a 304 off my servers, but this script reproduces the
problem on yours:

 use WWW::Mechanize;
 use Test::More tests => 2;
 use HTTP::Date;
 use Data::Dumper;
 my $url = $ARGV[0];
 my $date = $ARGV[1];
 my $ua = WWW::Mechanize->new(autocheck => 0);  # don't die on non-2xx; we check the status ourselves

 $ua->get($url);
 ok($ua->status == 200, 'check http status is 200 without if-modified-since header');

 $date  = time2str() if ! $date;
 $ua->add_header('IF_MODIFIED_SINCE' => $date);

# $ua->add_header('IF_MODIFIED_SINCE' => 'Wed, 08 Sep 2004 11:09:13 GMT');
 $ua->get($url);
 ok($ua->status != 500 && $ua->status != 304, "check http status is not 500 or 304 with if-modified-since header set to $date  (status is ".$ua->status.")");

print "Response headers:";
 my $response = $ua->response();
 for my $key ( $response->header_field_names() ) {
   print $key, " : ", $response->header( $key ), "\n";
 }

Run against your server:
/home/jonathonsim $ perl test_headers.pl http://www.neartexpress.com/liBENAB40035.html
1..2
ok 1 - check http status is 200 without if-modified-since header
not ok 2 - check http status is not 500 or 304 with if-modified-since
header set to Sun, 05 Jun 2005 05:44:31 GMT  (status is 304)
#     Failed test (test_headers.pl at line 17)
Response headers:Connection : close
Date : Sun, 05 Jun 2005 05:50:32 GMT
Server : Apache/1.3.33 (Unix) mod_interchange/1.32 PHP/4.3.9 mod_ssl/2.8.22
OpenSSL/0.9.7d
Client-Date : Sun, 05 Jun 2005 05:44:34 GMT
Client-Peer : 64.119.36.91:80
Client-Response-Num : 1
# Looks like you failed 1 tests of 2.

So at least now we have a bug to fix! (I wonder if this would have given
you a 500 error before).

Just out of interest, what happens when you choose a particular
last-modified date, e.g. Sun, 05 Jun 2005 05:30:00 GMT, put this in a test
page:

[tag op=header]Last-Modified: Sun, 05 Jun 2005 05:30:00 GMT[/tag]

and run the above script with a date 5 minutes earlier, i.e.:
perl test_headers.pl http://www.neartexpress.com/<test page> "Sun, 05 Jun 2005 05:25:00 GMT"

--
Jonathon Sim <jonathonsim at zeald.com>
Senior Developer @ Zeald.com : http://www.zeald.com
Jabber:sim at jabber.org.nz ICQ:62562604 MSN:sim at zeald.com
Ph: +64 9 415 7575, Fax: +64 9 443 9794


