[ic] Googlebot Getting 500 Errors ... but he's the only one

Bryan Gmyrek bryangmyrek at yahoo.com
Sat Jun 4 16:31:19 EDT 2005


--- Peter <peter at pajamian.dhs.org> wrote:
> 
> Just on a hunch, try setting the if-modified-since header to a date in 
> the future for this test and see what happens...
> 
Thanks for the suggestion, Peter.  I tried that, as it seemed like a good hunch, but no joy.  It
didn't seem to matter what the date was; I still got OK in my testing.
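
For anyone who wants to repeat that test, here is a minimal sketch of building such a request in
Python.  The host name www.example.com and the helper names are assumptions for illustration only;
the user-agent string and the page name are taken from the access log further down.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
import urllib.error
import urllib.request

# User agent exactly as it appears in the access log.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"

# Hypothetical host; substitute the real site when testing.
URL = "http://www.example.com/liTELLE223.html"


def build_conditional_request(url, since):
    """Build a GET that carries If-Modified-Since and Googlebot's user agent."""
    req = urllib.request.Request(url)
    req.add_header("User-Agent", GOOGLEBOT_UA)
    # HTTP dates are expressed in GMT, so convert before formatting.
    stamp = format_datetime(since.astimezone(timezone.utc), usegmt=True)
    req.add_header("If-Modified-Since", stamp)
    return req


def fetch_status(req, timeout=10):
    """Send the request and return the numeric status code."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx answers, which includes 304 and 500.
        return err.code


# Build (but do not send) a request dated one year in the future.
future = datetime.now(timezone.utc) + timedelta(days=365)
req = build_conditional_request(URL, future)
print(req.get_header("If-modified-since"))
```

Calling fetch_status(req) would actually send the request; comparing the result for a past date
and a future date shows whether the date in the header changes the response at all.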

However, Jonathan's code seems to have fixed the Googlebot 500 error problem [but see p.s. ..
there are still some 500 errors around]:

I downloaded mod_interchange 1.32 and applied Jonathan's patch to mod_interchange.c.

Right after I upgraded to 1.32 with Jonathan's patch, the 500 errors seem to have turned into
304 responses (see the middle of the log below):
66.249.71.72*[01/Jun/2005:16:37:53 -0700]*GET /liAPP201.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*14516*-
66.249.71.73*[01/Jun/2005:16:37:59 -0700]*GET /q/Photog.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*26871*-
66.249.71.18*[01/Jun/2005:16:38:05 -0700]*GET /artist/Carlo_Carra.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*10463*-
66.249.64.58*[01/Jun/2005:16:38:05 -0700]*GET /liAWINR655.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.29*[01/Jun/2005:16:38:07 -0700]*GET /liSHDS1403.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.68*[01/Jun/2005:16:38:10 -0700]*GET /liMCGPFD1059.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.38*[01/Jun/2005:16:38:16 -0700]*GET /liBENAB5160.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*14575*-
66.249.64.55*[01/Jun/2005:16:38:21 -0700]*GET /liBENAB4339.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*15469*-
66.249.64.79*[01/Jun/2005:16:38:24 -0700]*GET /q/Daphne_Duck.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*10152*-
66.249.71.29*[01/Jun/2005:16:38:31 -0700]*GET /liISIAF1010.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*18594*-
66.249.64.33*[01/Jun/2005:16:38:39 -0700]*GET /liCAWKMF-12.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*17532*-
66.249.64.79*[01/Jun/2005:16:38:49 -0700]*GET /liTELLE024.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.39*[01/Jun/2005:16:38:50 -0700]*GET /q/Jug.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.18*[01/Jun/2005:16:38:59 -0700]*GET /artist/Peter_Ellenshaw.html
HTTP/1.0*-*Googlebot/2.1 (+http://www.google.com/bot.html)*200*12583*-
66.249.71.29*[01/Jun/2005:16:38:59 -0700]*GET /liSAIC228.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.68*[01/Jun/2005:16:39:03 -0700]*GET /liISIANF1026.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*14385*-
66.249.64.68*[01/Jun/2005:17:01:57 -0700]*GET /liTELLE223.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.32*[01/Jun/2005:17:04:20 -0700]*GET /liHHC1300139.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.73*[01/Jun/2005:17:09:39 -0700]*GET /liCAWCG-14C.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.64.38*[01/Jun/2005:17:10:18 -0700]*GET /liBENAA50336.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.32*[01/Jun/2005:17:10:56 -0700]*GET /liBENAA40207.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.32*[01/Jun/2005:17:11:08 -0700]*GET /liMCGJ387.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*17776*-
66.249.71.72*[01/Jun/2005:17:11:22 -0700]*GET /liARCRH3.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.72*[01/Jun/2005:17:11:38 -0700]*GET /liMCGPF64.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*16210*-
66.249.64.38*[01/Jun/2005:17:11:45 -0700]*GET /artist/Herring-Harris.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.18*[01/Jun/2005:17:17:46 -0700]*GET /liBENAA5784.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.18*[01/Jun/2005:17:18:08 -0700]*GET /liEDL6202.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*14544*-
66.249.64.58*[01/Jun/2005:17:18:08 -0700]*GET /liIMAA181.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*200*18434*-
66.249.64.68*[01/Jun/2005:17:18:22 -0700]*GET /q/Duncan/1.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*403*213*-
66.249.64.79*[01/Jun/2005:17:20:50 -0700]*GET /artist/Calkins.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-

Note that Googlebot is served _no_ 304 codes before the last 500 error at
[01/Jun/2005:16:38:59 -0700] and no 500 errors after that time.
So it seems that a patch to mod_interchange using the couple of lines from Jonathan really is in
order...
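
The before/after comparison can also be done by field rather than by eye.  The following is only a
sketch: it assumes the '*'-delimited layout seen in the log above (IP, timestamp, request, referer,
user agent, status, bytes, trailing field) with the mail wrapping undone, and CUTOFF is an assumed
value, since the exact moment the patched module went live is not recorded here.

```python
from collections import Counter
from datetime import datetime

TIME_FORMAT = "[%d/%b/%Y:%H:%M:%S %z]"

# A few lines copied from the log above, one request per line.
SAMPLE = """\
66.249.71.29*[01/Jun/2005:16:38:59 -0700]*GET /liSAIC228.html HTTP/1.0*-*Googlebot/2.1 (+http://www.google.com/bot.html)*500*532*-
66.249.64.68*[01/Jun/2005:16:39:03 -0700]*GET /liISIANF1026.html HTTP/1.0*-*Googlebot/2.1 (+http://www.google.com/bot.html)*200*14385*-
66.249.64.68*[01/Jun/2005:17:01:57 -0700]*GET /liTELLE223.html HTTP/1.0*-*Googlebot/2.1 (+http://www.google.com/bot.html)*304*-*-
66.249.71.32*[01/Jun/2005:17:11:08 -0700]*GET /liMCGJ387.html HTTP/1.0*-*Googlebot/2.1 (+http://www.google.com/bot.html)*200*17776*-
"""

# Assumed switch-over time, somewhere between the last 500 and the first 304.
CUTOFF = datetime.strptime("[01/Jun/2005:17:00:00 -0700]", TIME_FORMAT)


def parse_line(line):
    """Return (timestamp, user_agent, status) for one log line."""
    fields = line.rstrip("\n").split("*")
    # Count from the right: the last three fields are status, bytes and a
    # trailing '-'; the user agent sits just before them.
    status, agent = fields[-3], fields[-4]
    when = datetime.strptime(fields[1], TIME_FORMAT)
    return when, agent, status


def status_by_period(lines, cutoff):
    """Tally Googlebot status codes before and after the cutoff."""
    before, after = Counter(), Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        when, agent, status = parse_line(line)
        if "Googlebot" not in agent:
            continue
        target = before if when < cutoff else after
        target[status] += 1
    return before, after


before, after = status_by_period(SAMPLE.splitlines(), CUTOFF)
print("before:", dict(before))
print("after: ", dict(after))
```

Run against the full access_log instead of SAMPLE, this would give the same picture as the two
grep listings below, as counts.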

$ grep Googlebot access_log | grep -v image | grep "\*500\*" | less
....
66.249.71.67*[01/Jun/2005:16:34:11 -0700]*GET /liISIAF1016.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.28*[01/Jun/2005:16:34:24 -0700]*GET /liOWP1623B.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.58*[01/Jun/2005:16:36:10 -0700]*GET /liGLXTCH6188.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.38*[01/Jun/2005:16:36:15 -0700]*GET /liTOHAHA39.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.68*[01/Jun/2005:16:37:19 -0700]*GET /artist/Alfred_Henderson.html
HTTP/1.0*-*Googlebot/2.1 (+http://www.google.com/bot.html)*500*532*-
66.249.64.58*[01/Jun/2005:16:38:05 -0700]*GET /liAWINR655.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.29*[01/Jun/2005:16:38:07 -0700]*GET /liSHDS1403.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.68*[01/Jun/2005:16:38:10 -0700]*GET /liMCGPFD1059.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.64.79*[01/Jun/2005:16:38:49 -0700]*GET /liTELLE024.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.39*[01/Jun/2005:16:38:50 -0700]*GET /q/Jug.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
66.249.71.29*[01/Jun/2005:16:38:59 -0700]*GET /liSAIC228.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*500*532*-
END


$ grep Googlebot access_log | grep -v image | grep "\*304\*" | less
66.249.64.68*[01/Jun/2005:17:01:57 -0700]*GET /liTELLE223.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.32*[01/Jun/2005:17:04:20 -0700]*GET /liHHC1300139.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.73*[01/Jun/2005:17:09:39 -0700]*GET /liCAWCG-14C.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.64.38*[01/Jun/2005:17:10:18 -0700]*GET /liBENAA50336.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.32*[01/Jun/2005:17:10:56 -0700]*GET /liBENAA40207.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.72*[01/Jun/2005:17:11:22 -0700]*GET /liARCRH3.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.64.38*[01/Jun/2005:17:11:45 -0700]*GET /artist/Herring-Harris.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.18*[01/Jun/2005:17:17:46 -0700]*GET /liBENAA5784.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.64.79*[01/Jun/2005:17:20:50 -0700]*GET /artist/Calkins.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.71.72*[01/Jun/2005:17:22:29 -0700]*GET /liADLAA-LA073.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.64.55*[01/Jun/2005:17:23:07 -0700]*GET /liCAWPAC-3.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
66.249.64.38*[01/Jun/2005:17:23:23 -0700]*GET /liHHC1300132.html HTTP/1.0*-*Googlebot/2.1
(+http://www.google.com/bot.html)*304*-*-
...

Today Googlebot has grabbed over 2000 pages, gotten about 1000 304 codes and NO 500 codes.
So it looks like it's fixed (but I'll still keep watching ;)  Thanks so much for the help from
peeps on this list.

Best,
Bryan

p.s.
There are still _some_ 500 codes returned ... here's my '500 error code report' for today so far:
500 Error Code Pages report
     17 POST /search.html HTTP/1.1*500
     10 GET /q/ARC.html HTTP/1.1*500
      8 GET /q/Southwest.html HTTP/1.1*500
      8 GET /q/Joan_Cawley_Gallery.html HTTP/1.1*500
      5 GET
/scan/MM=0c66d5e77931c8ef99c3742c776ce84c:28:55:28.html?mv_more_ip=1&mv_nextpage=artists%2dalphabetical%2ehtml&pf=cat&mv_arg=
HTTP/1.1*500
      5 GET /pre-framed.html HTTP/1.1*500
      5 GET /art/images/thumb/limages/EUR/thumbs/1500-19.jpg HTTP/1.1*304
      5 GET /art/images/thumb/limages/BEN/thumbs/AA50086.jpg HTTP/1.1*304
      4 GET /q/Carney.html HTTP/1.1*500
      4 GET /liMCGM366.html?mv_pc=froogle HTTP/1.1*500
...
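
The two *304 image entries in that report presumably slipped in because "500" occurs in their file
names (1500-19.jpg, AA50086.jpg).  Below is a sketch of a report that tests the status field itself
instead of matching text anywhere in the line.  The request paths come from the report above; the
IP addresses, timestamps and user agent in SAMPLE are made up for illustration, and the field
layout is assumed to match the access_log lines shown earlier.

```python
from collections import Counter

# Illustrative lines only: real request paths, invented IPs and times.
SAMPLE = """\
192.0.2.10*[02/Jun/2005:12:00:00 -0700]*POST /search.html HTTP/1.1*-*Mozilla/4.0*500*0*-
192.0.2.11*[02/Jun/2005:12:01:00 -0700]*GET /q/ARC.html HTTP/1.1*-*Mozilla/4.0*500*0*-
192.0.2.12*[02/Jun/2005:12:02:00 -0700]*GET /art/images/thumb/limages/EUR/thumbs/1500-19.jpg HTTP/1.1*-*Mozilla/4.0*304*-*-
192.0.2.10*[02/Jun/2005:12:03:00 -0700]*POST /search.html HTTP/1.1*-*Mozilla/4.0*500*0*-
192.0.2.13*[02/Jun/2005:12:04:00 -0700]*GET /pre-framed.html HTTP/1.1*-*Mozilla/4.0*200*15000*-
"""


def report_500(lines):
    """Return [(request, count), ...] for status 500, most frequent first."""
    counts = Counter()
    for line in lines:
        fields = line.split("*")
        if len(fields) < 8:
            continue  # skip blank or incomplete lines
        # Request is the third field; status is third from the end.
        request, status = fields[2], fields[-3]
        if status == "500":
            counts[request] += 1
    return counts.most_common()


for request, count in report_500(SAMPLE.splitlines()):
    print(f"{count:7d} {request}*500")
```

Because only the status field is compared, the 1500-19.jpg request with status 304 is left out.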

Apache error_log (some of it ... 77 total errors related to this)
$ grep header error_log
[Thu Jun  2 12:43:23 2005] [error] (104)Connection reset by peer: access to /q/Halsman.html failed
for 131.249.6.206, reason: error sending headers to client
[Thu Jun  2 12:45:53 2005] [error] (104)Connection reset by peer: access to /search.html failed
for 192.35.79.70, reason: error sending headers to client
[Thu Jun  2 13:45:55 2005] [error] (104)Connection reset by peer: access to /liAPG136-18406.html
failed for 70.17.136.124, reason: error sending headers to client
[Thu Jun  2 13:46:35 2005] [error] (104)Connection reset by peer: access to /q/Moise Jacobber.html
failed for 81.240.229.184, reason: error sending headers to client
[Thu Jun  2 13:51:57 2005] [error] (104)Connection reset by peer: access to /liOWP12052C.html
failed for 66.240.119.128, reason: error sending headers to client
[Thu Jun  2 13:53:33 2005] [error] (104)Connection reset by peer: access to /liCLACC2029.html
failed for 66.240.119.128, reason: error sending headers to client
[Thu Jun  2 13:53:38 2005] [error] (104)Connection reset by peer: access to /liCLACC2029.html
failed for 66.240.119.128, reason: error sending headers to client
[Thu Jun  2 13:55:41 2005] [error] (104)Connection reset by peer: access to /RW32041.html failed
for 70.17.47.196, reason: error sending headers to client

Apache access_log entry relating to last one above:
70.17.47.196*[02/Jun/2005:13:55:41 -0700]*GET /RW32041.html
HTTP/1.1*http://search.msn.com/results.aspx?FORM=MSNH&srch_type=0&q=SOUND+ACTIVATED+DESK+FOUNTAIN+*Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts-MyWay; SV1)*500*0*-
70.17.47.196*[02/Jun/2005:13:55:41 -0700]*GET /RW32041.html
HTTP/1.1*http://search.msn.com/results.aspx?q=sound-activated+fountain&FORM=QBRE*Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts-MyWay; SV1)*500*0*-

This is strange ... there are two attempts at the same time referred from MSN search, from the same
IP, with the same browser, but once with the search in caps and a + on the end and once with no
caps and no plus on the end.  Looking at another one I see a similar problem:
$ grep liCAMAL993.html access_log | grep -v image
68.185.252.156*[02/Jun/2005:16:27:25 -0700]*GET /liCAMAL993.html?mv_pc=froogle
HTTP/1.1*http://www.google.com/search?hl=en&q=university+of+alabama+prints&btnG=Google+Search*Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1)*500*0*-
68.185.252.156*[02/Jun/2005:16:27:27 -0700]*GET /liCAMAL993.html?mv_pc=froogle
HTTP/1.1*http://www.google.com/search?hl=en&q=university+of+alabama+prints&btnG=Google+Search*Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1)*500*0*-

People double clicking on the search results?
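
One way to see how common this pattern is would be to list requests that the same IP repeats for
the same URL within a few seconds.  This is a sketch only: the 5-second window is an arbitrary
assumption, the lines are assumed to be in time order with the mail wrapping undone, and a repeat
does not by itself prove a double click.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "[%d/%b/%Y:%H:%M:%S %z]"

# The four access_log entries quoted above, one request per line.
SAMPLE = """\
70.17.47.196*[02/Jun/2005:13:55:41 -0700]*GET /RW32041.html HTTP/1.1*http://search.msn.com/results.aspx?FORM=MSNH&srch_type=0&q=SOUND+ACTIVATED+DESK+FOUNTAIN+*Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts-MyWay; SV1)*500*0*-
70.17.47.196*[02/Jun/2005:13:55:41 -0700]*GET /RW32041.html HTTP/1.1*http://search.msn.com/results.aspx?q=sound-activated+fountain&FORM=QBRE*Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts-MyWay; SV1)*500*0*-
68.185.252.156*[02/Jun/2005:16:27:25 -0700]*GET /liCAMAL993.html?mv_pc=froogle HTTP/1.1*http://www.google.com/search?hl=en&q=university+of+alabama+prints&btnG=Google+Search*Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)*500*0*-
68.185.252.156*[02/Jun/2005:16:27:27 -0700]*GET /liCAMAL993.html?mv_pc=froogle HTTP/1.1*http://www.google.com/search?hl=en&q=university+of+alabama+prints&btnG=Google+Search*Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)*500*0*-
"""


def find_repeats(lines, window=timedelta(seconds=5)):
    """Return (ip, request, gap_seconds) for each quick repeat request."""
    last_seen = {}
    repeats = []
    for line in lines:
        fields = line.split("*")
        if len(fields) < 8:
            continue  # skip blank or incomplete lines
        ip, request = fields[0], fields[2]
        when = datetime.strptime(fields[1], TIME_FORMAT)
        key = (ip, request)
        if key in last_seen and when - last_seen[key] <= window:
            gap = (when - last_seen[key]).total_seconds()
            repeats.append((ip, request, gap))
        last_seen[key] = when
    return repeats


for ip, request, gap in find_repeats(SAMPLE.splitlines()):
    print(f"{ip} repeated '{request}' after {gap:.0f}s")
```

The referer is deliberately not part of the key, so the two MSN entries with different search
strings still count as a repeat of the same page request.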

Interchange error log for today:

/opt/interchange$ cat error.log
- - - [01/June/2005:23:48:50 -0700] - - STOP server (29274) on signal TERM
- - - [01/June/2005:23:48:53 -0700] - - Vend::Payment::AuthorizeNet payment module initialized,
using Net::SSLeay
- - - [01/June/2005:23:48:53 -0700] - - Low traffic settings.
- - - [01/June/2005:23:48:54 -0700] - - ...UI is loaded...
- - - [01/June/2005:23:48:54 -0700] - - Interchange V5.2.0
- - - [01/June/2005:23:48:54 -0700] - - Config 'art' at server startup
- - - [01/June/2005:23:48:54 -0700] - - Config 'foundation520' at server startup
- - - [01/June/2005:23:48:55 -0700] - - START server (23755) (UNIX)
- - - [01/June/2005:23:48:55 -0700] - - ALERT: /opt/interchange/etc/socket socket permissions are
insecure; are you sure you want permissions 666?
- - - [01/June/2005:23:48:55 -0700] - - START server (23755) (UNIX)



