[ic] Too many new ID assignments for this IP address ["bug"/reason now identified]

John1 list_subscriber at yahoo.co.uk
Fri Oct 29 13:57:29 EDT 2004


> On Friday, October 29, 2004 8:22 AM, j.vandijk at attema.nl wrote:

>> list_subscriber at yahoo.co.uk 28-10-04 23:01:23
>> We are getting a lot of errors in our Interchange error log like:
>>
>> "Too many new ID assignments for this IP address. Please wait at least
>> 24 hours before trying again. Only waiting that period will allow access.
>> Terminating."
>>
>> The IP addresses getting blocked are all ISP proxy servers.  For
>> example,
>>
>> 195.93.34.12       cache-loh-ac06.proxy.aol.com
>> 212.100.251.149    lb1.onspeed.com
>> 62.254.64.12       midd-cache-1.server.ntli.net
>> 62.252.224.13      leed-cache-2.server.ntli.net
>> 62.252.192.5       manc-cache-2.server.ntli.net
>> 62.252.0.5         glfd-cache-2.server.ntli.net
>> 62.254.0.14        nott-cache-2.server.ntli.net
>> 80.5.160.4         bagu-cache-1.server.ntli.net
>>
>> We have "RobotLimit 500" in catalog.cfg, and I am certain that our
>> site is not getting 500 page requests within any 30 second period, even
>> from one of these cache servers which I appreciate is a proxy for many users.
>>
>> BTW, we don't have a SessionExpire entry in our catalog config (i.e.
>> going with the 1 day default), nor are we running with WideOpen.
>> ...

>> I also noticed the line in the count_ip (Session.pm) routine:
>> ::logDebug("ip $ip allowed back in due to '$mtime' > '$grace' days");
>>
>> However, it is interesting that I don't see any of these particular
>> logDebug entries in my error.log, suggesting that for whatever reason the
>> grace period is never considered expired.
>> ...
>> Any help would be greatly appreciated as this problem is currently
>> rejecting a fair few potential customers with 403 errors.  Thanks.
>>
>>
> This is not working correctly; in the meantime, try a workaround:
> set RobotLimit temporarily up to 1000,
> or find the files for the blocked IP addresses and delete them:
> <catalogdir>\tmp\addr_ctr\*
>
Thanks for your tip, Jan - actually, you were the one who first put me on to
upping RobotLimit to 500.  Anyway, I have tried to follow the code and think
I have uncovered the reason for the problem.

The RobotLimit directive appears to be used for 2 distinct checks:

1) To block accesses from IP addresses that make more than RobotLimit page
requests in a 30 second period.
2) To block accesses from IP addresses that have been allocated more than
RobotLimit SessionIDs.

The "Too many new ID assignments" message is generated by check 2.  The code
for this check causes a particular problem for proxy servers.  Even where a
proxy server is not involved, check 2 could also conceivably deny access to
customers of any ISP that dynamically allocates IP addresses from a very
small pool of addresses.

The reason for this is that the check to see whether the addr_ctr file
should be expired is only made if RobotLimit has been exceeded.  And then,
the addr_ctr file is only deleted if the modify date is older than a day
(the default for MV_ROBOT_EXPIRE).
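
To make that concrete, here is a rough sketch of the flow as I read it.
This is *not* the actual count_ip() code from Session.pm - the helper
routines (read_count, write_count) and the overall structure are invented
purely to illustrate the ordering described here:

    use strict;
    use warnings;

    my $grace_days = 1;    # the MV_ROBOT_EXPIRE default mentioned above

    # Illustration only: one counter file per client IP address.
    sub count_ip_sketch {
        my ($addr_file, $robot_limit) = @_;

        my $count   = read_count($addr_file);
        my $blocked = 0;

        if ($robot_limit > 0 and $count > $robot_limit) {
            # Expiry is only considered once the limit has been exceeded...
            my $age_days = -M $addr_file;      # days since last modification
            if ($age_days > $grace_days) {
                unlink $addr_file;             # IP "allowed back in"
                $count = 0;
            }
            else {
                $blocked = 1;                  # still over the limit -> 403
            }
        }

        # ...but the counter file is rewritten for every *attempted* access,
        # so its modify date keeps moving forward and, for a busy proxy,
        # the grace period never elapses.
        write_count($addr_file, $count + 1);

        return $blocked ? 0 : 1;
    }

    sub read_count {
        my ($file) = @_;
        open my $fh, '<', $file or return 0;
        my $count = <$fh>;
        close $fh;
        chomp $count if defined $count;
        return $count || 0;
    }

    sub write_count {
        my ($file, $count) = @_;
        open my $fh, '>', $file or return;
        print $fh "$count\n";
        close $fh;
    }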

So, here is the problem: any IP address that is typically allocated more
than one session ID in a 24-hour period will *never* get its addr_ctr file
expired.  That is, there needs to be a full 24-hour period without access
from the same IP address before the addr_ctr file will be deleted, thus
re-allowing access from that IP address.  For large ISPs using a relatively
small number of proxy servers this may *never* happen, and so access from
their proxy servers is *permanently* blocked.

In other words, *only* once the RobotLimit is exceeded is the
MV_ROBOT_EXPIRE check performed on every attempted access.  The trouble is
that after performing this check the addr_ctr file is incremented for the
attempted access (and hence the addr_ctr file's modify date is updated).
So, unless there are no *attempted* accesses from that IP address for a full
24-hour period, the addr_ctr file will never be expired.  The result:
access from that IP address will be permanently blocked.

Until this problem can be fixed I think I am going to run with "RobotLimit
0", unless anyone advises me that this is a particularly bad idea?  I
suppose another solution would be to run with "RobotLimit 1000" and *also*
delete <catalogdir>\tmp\addr_ctr\* in a daily cron job?  If my understanding
of the problem is correct, even setting "RobotLimit 10000" is not going to
solve the problem (without also regularly deleting the addr_ctr files); it
will just delay its onset, as each proxy server will over time reach even a
10000 session ID limit (and will then be denied access permanently).
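
For what it is worth, the daily clean-up could be as simple as the
following sketch run from cron.  The catalog path is only a placeholder
and would need to match your own installation:

    #!/usr/bin/perl
    # Daily clean-up sketch: clear all per-IP counter files so that any
    # blocked address gets a fresh start each day.
    use strict;
    use warnings;

    my $dir = '/path/to/catalog/tmp/addr_ctr';    # example path only

    opendir my $dh, $dir or die "cannot open $dir: $!";
    for my $entry (readdir $dh) {
        my $file = "$dir/$entry";
        next unless -f $file;                     # skip . and .. entries
        unlink $file or warn "could not delete $file: $!";
    }
    closedir $dh;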

I know very little Perl so I am not the right person to fix the code, but I
will be happy to contribute to any discussion on possible solutions.
Suggestion: would it be worth introducing a separate catalog config
directive called SessionLimit, to be used in place of RobotLimit for check
2, allowing the two checks to be configured independently?  Thanks.
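
Something along these lines is what I have in mind - SessionLimit is purely
a proposed name (it does not exist in Interchange today) and the value
shown is just an example:

    # catalog.cfg (proposed, not current Interchange syntax)
    RobotLimit    500     # check 1: page requests per 30-second window
    SessionLimit  5000    # check 2: new session IDs allowed per IP address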


