[ic] Foundation catalog.cfg RobotLimit comment change

John Young john_young at sonic.net
Fri Oct 8 16:44:44 EDT 2004


Jon Jensen wrote:
> On Thu, 7 Oct 2004, John Young wrote:
> 
>>The following comment change (or similar) in Foundation's catalog.cfg
>>might help people better understand what they are setting with RobotLimit:
>>
>>
>>--- catalog.cfg.orig	Mon Mar 29 18:40:57 2004
>>+++ catalog.cfg	Thu Oct  7 12:41:02 2004
>>@@ -162,7 +162,7 @@
>>  WritePermission group
>>
>>  # If a specific user session accesses our catalog more than this many times
>>-# in a 30-second time period. If the limit is exceeded, the LockoutCommand
>>+# without a 30-second pause. If the limit is exceeded, the LockoutCommand
>>  # (if set) is executed. Set this to 0 if you're getting links to 127.0.0.1
>>  # during your testing.
>>  RobotLimit  100
> 
> 
> Hmm. That seems to obscure the situation rather than clarify it, to me.


Okay, how about:
"with no pauses of 30 seconds or more.  If the limit..."

Now, for further obscurity:

The problem is that the current description in catalog.cfg is wrong (not
simply confusing).  Take another look at the code in Dispatch.pm.  It doesn't
average anything.  It only checks whether the current time minus the last
session write time is greater than 30 seconds:

             if ($now - $Vend::Session->{'time'} > 30) {
                 $Vend::Session->{accesses} = 0;
             }
             else {
                 $Vend::Session->{accesses}++;

We don't know the number of accesses in the last 30 seconds; we only know
that there have been at least $Vend::Session->{accesses} accesses SINCE the
last 30-second pause.  A particular session could make 10,000 visits in 29
seconds, then wait for 30 seconds, then continue on... regardless of the
RobotLimit setting.
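
To make the counter's behavior concrete, here is a minimal stand-alone
sketch.  It is only a model of the check quoted above, not the Dispatch.pm
code itself: counts_since_pause() is a hypothetical helper, and $last stands
in for the session write time.

```perl
use strict;
use warnings;

# Hypothetical model of the counter (not the actual Dispatch.pm code).
# Takes access times in ascending epoch seconds and returns the value
# the counter holds after each access.
sub counts_since_pause {
    my @times = @_;
    my ($last, $accesses) = (undef, 0);
    my @history;
    for my $now (@times) {
        if (defined $last and $now - $last > 30) {
            $accesses = 0;    # a pause of more than 30 seconds resets it
        }
        else {
            $accesses++;      # otherwise the count just keeps growing
        }
        $last = $now;         # stands in for the session write time
        push @history, $accesses;
    }
    return @history;
}

# A 40-second gap between the third and fourth access resets the count.
print join(' ', counts_since_pause(0, 10, 20, 60, 70)), "\n";   # 1 2 3 0 1

# One access every 29 seconds never pauses long enough to reset.
my @slow = counts_since_pause(map { $_ * 29 } 0 .. 149);
print "$slow[-1]\n";                                             # 150
```

The second case is a client that never makes more than two requests in any
30-second period, yet its count climbs past 100 all the same.  The number
being compared is "accesses since the last pause", not a rate.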

The error message in Dispatch.pm is correct.  It says:

"WARNING: POSSIBLE BAD ROBOT. %s accesses with no 30 second pause."

If you wanted an actual accesses-per-30-second-period figure, you'd need
something like (untested):

         elsif($Vend::Cfg->{RobotLimitAvg}) {
             # Start the window now unless it was set in init_session().
             $Vend::Session->{accesses_timestamp} ||= $now;
             # Count this access (the elsif means no earlier branch has done so).
             $Vend::Session->{accesses}++;
             if ($now - $Vend::Session->{accesses_timestamp} > 30) {
                 if($Vend::Session->{accesses} > $Vend::Cfg->{RobotLimitAvg}
                     and ! $Vend::admin
                     )
                 {
                     my $msg = errmsg(
                         "WARNING: POSSIBLE BAD ROBOT. %s accesses in a 30 second period.",
                         $Vend::Session->{accesses},
                     );
                     do_lockout($msg);
                 } else {
                     # Window expired without tripping the limit: start a new one.
                     $Vend::Session->{accesses_timestamp} = $now;
                     $Vend::Session->{accesses} = 0;
                 }
             }
         }
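
For anyone who wants to play with the idea outside of Interchange, here is a
rough stand-alone sketch of a fixed-window counter.  It is a variant, not the
snippet above: it checks the limit on every access rather than only when the
window expires, and first_lockout_time() and its arguments are made up for
illustration.

```perl
use strict;
use warnings;

# Hypothetical fixed-window counter: returns the time of the access that
# first exceeds $limit accesses within one $window-second window, or
# undef if the limit is never exceeded.
sub first_lockout_time {
    my ($limit, $window, @times) = @_;
    my ($window_start, $count);
    for my $now (@times) {
        if (!defined $window_start or $now - $window_start >= $window) {
            $window_start = $now;    # start a fresh window
            $count        = 0;
        }
        $count++;
        return $now if $count > $limit;
    }
    return;    # never tripped
}

# 150 accesses squeezed into 29 seconds: trips on the 101st access.
my @burst = map { int($_ * 29 / 150) } 0 .. 149;
my $burst_hit = first_lockout_time(100, 30, @burst);
print defined $burst_hit ? "lockout at t=$burst_hit\n" : "no lockout\n";

# 150 accesses spaced 29 seconds apart: at most two per window.
my @steady = map { $_ * 29 } 0 .. 149;
my $steady_hit = first_lockout_time(100, 30, @steady);
print defined $steady_hit ? "lockout at t=$steady_hit\n" : "no lockout\n";
```

A fixed window can still miss a burst that straddles two windows, so treat
this as a starting point only.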

More exotic would be using a running average.  Since initial page
requests with less than a one second delay are common, it would be
too easy to trip this without high-res time.  Perhaps this would
work if no analysis were made for the first 10 accesses (partially
tested):

         elsif($Vend::Cfg->{RobotLimitFreq}) {
             if (++$Vend::Session->{freq_accesses} > 10) {
                 # Without high-res time, treat a zero delta as one second:
                 my $deltatime = $now - $Vend::Session->{time};
                 $deltatime ||= 1;

                 # Fold this access's frequency (1 / delta) into the running mean.
                 $Vend::Session->{access_freq} =
                     ($Vend::Session->{access_freq}
                         * ($Vend::Session->{freq_accesses} - 1)
                         + (1 / $deltatime))
                     / $Vend::Session->{freq_accesses};

                 # A robot shows a HIGH frequency, so trip when above the limit.
                 if ($Vend::Session->{access_freq} > $Vend::Cfg->{RobotLimitFreq}
                         and ! $Vend::admin
                     )
                 {
                     my $msg = errmsg(
                         "WARNING: POSSIBLE BAD ROBOT. Access frequency of %s.",
                         $Vend::Session->{access_freq},
                     );
                     do_lockout($msg);
                 }
             }
         }
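
The running-mean update can be tried on its own with a small sketch like the
one below.  mean_access_freq() is a hypothetical helper, it works from a
plain list of whole-second timestamps, and it leaves out the 10-access
warm-up for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper: running mean of per-access frequency (1 / delta)
# over a list of ascending whole-second timestamps.
sub mean_access_freq {
    my @times = @_;
    my ($last, $n, $freq) = (undef, 0, 0);
    for my $now (@times) {
        if (defined $last) {
            # Without high-res time two accesses can share a second,
            # so treat a zero delta as one second.
            my $delta = ($now - $last) || 1;
            $n++;
            $freq = ($freq * ($n - 1) + 1 / $delta) / $n;
        }
        $last = $now;
    }
    return $freq;    # accesses per second, averaged per access
}

printf "%.2f\n", mean_access_freq(0, 1, 2, 3, 4);    # one access per second
printf "%.2f\n", mean_access_freq(0, 10, 20, 30);    # one access per 10 seconds
```

Note that this averages the per-access frequencies, which is not the same as
total accesses divided by total elapsed time; a few very short gaps pull the
figure up noticeably.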


Either of those would, of course, require a new Cfg variable since
plenty of people already have RobotLimit set for the existing behavior.
The last segment of code above takes care to avoid conflict with
$Vend::Session->{accesses}, too.


John Young


