[ic] Possible bug: Too many new ID assignments for this IP address
list_subscriber at yahoo.co.uk
Wed Aug 24 11:41:24 EDT 2005
On Wednesday, August 24, 2005 2:45 PM, mike at perusion.com wrote:
> Quoting John1 (list_subscriber at yahoo.co.uk):
>> On Wednesday, August 24, 2005 2:29 AM, mike at perusion.com wrote:
>> Consequently the addr_ctr/IP file will keep counting up unless there
>> is a *gap* of greater than "limit robot_expire" before a new session
>> id is requested by the same IP address.
> Yes, this is correct.
>> i.e. So if you use "Limit robot_expire 0.05", provided there are at
>> least 2 requests per hour for a new session id from the same IP
>> address the addr_ctr/IP file will keep counting up forever.
> Well, until it locks someone out for an hour.
Except it is highly likely to be a lot longer than an hour (possibly
indefinitely) if the IP in question is a large ISP's proxy server (using NAT
as do NTL and AOL in the UK - 2 of the biggest ISPs in the UK). Has anybody
any idea why AOL operate these NAT proxies?
>> Then after a few days or weeks RobotLimit will eventually be
>> exceeded and the IP address will then be *permanently* locked out.
>> By permanent I mean until there is a gap of at least 1 hour between
>> requests for new session ids from the IP address in question.
> Aha, there is my misunderstanding. I didn't see an hour as
> permanent.... 8-)
But do you understand why I use the word permanent? The IP *will* be locked
out essentially permanently if it belongs to an ISP operating NAT proxies.
If RobotLimit is set to 500, then whilst it may take a little while for the
500 to be reached, once it has been reached the shutter comes down and the
count_ip code operates like a latch as only *one* new session id per hour is
required to *keep* the latch closed, not 500!
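To illustrate what I mean, here is a rough Python sketch of the semantics as I understand them (my own reconstruction for illustration, not the actual count_ip Perl; names like robot_expire and addr_ctr just mirror the directives):

```python
# Reconstruction (illustration only) of the addr_ctr/IP behaviour
# being discussed: the counter resets only when the *gap* since the
# previous request exceeds robot_expire; otherwise it keeps climbing.

EXPIRE_SECONDS = 0.05 * 86400   # Limit robot_expire 0.05 days = 4320 s

count = 0
last_seen = None
counts = []
# Two new-session requests per hour from the same IP (gap = 1800 s,
# well under the 4320 s expiry), for four hours:
for now in range(0, 4 * 3600, 1800):
    if last_seen is not None and now - last_seen > EXPIRE_SECONDS:
        count = 0               # only a long enough gap resets it
    count += 1
    last_seen = now
    counts.append(count)

print(counts)   # -> [1, 2, 3, 4, 5, 6, 7, 8]: the count never resets
```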
And also note that RobotLimit 500 doesn't actually require traffic of 500
per hour for addr_ctr/IP to eventually reach 500. All that is needed is at
least *one* new session id per hour, sustained for the number of hours it
takes the count to reach 500.
If a proxy server's IP address is active enough to trip the RobotLimit 500
then what we are saying is that it is likely to be requesting well in excess
of 1 new session id per hour. If not, it would have been unlikely to have
made it all the way to 500 without the addr_ctr/IP being reset. The trouble
is that once the 500 limit has been crossed *only* 1 new session_id per hour
is required to keep the latch closed, and so the lockout will probably be
permanent for this IP address.
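Sketching the latch itself in Python (again a reconstruction, not the real count_ip code, with the robot_expire 0.05 and RobotLimit 500 values from this thread): once the count has passed RobotLimit, a single new session per hour is enough to keep the counter from ever resetting.

```python
# Hypothetical latch demonstration: one request per hour (gap 3600 s)
# never exceeds the 4320 s expiry, so the counter never resets and the
# IP stays locked out indefinitely.

EXPIRE_SECONDS = 0.05 * 86400   # Limit robot_expire 0.05 days = 4320 s
ROBOT_LIMIT = 500

count = ROBOT_LIMIT             # the limit has just been reached
last_seen = 0.0
still_locked = []
for hour in range(1, 25):       # one request per hour for a day
    now = hour * 3600.0
    if now - last_seen > EXPIRE_SECONDS:
        count = 0               # never taken: 3600 < 4320
    count += 1
    last_seen = now
    still_locked.append(count > ROBOT_LIMIT)

print(all(still_locked))        # -> True: still locked after 24 hours
```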
> Looking at it, it may indeed be less than ideal. Perhaps someone can
> suggest an algorithm -- nothing clean and correct comes to my mind
> (new file every day, counting down instead of up if time >
> Limit->robot_expire * .1, etc.).
> In the interim, I would think
> Limit robot_expire 0.002
> would work in all but the most extreme cases, where again I suggest
> you need more than RobotLimit to defend you from the onslaught.
That's a fair point. I hadn't given any thought to the use of Limit
robot_expire with very small values. A value of 0.002 would mean that
addr_ctr/IP would be deleted if there were no accesses from the same IP for
3 minutes. I guess that would work most of the time as I suppose in the
middle of the night (if not during the day) requests for new session ids are
likely to drop below this level at least once and therefore the addr_ctr/IP
file will at least be deleted once every 24 hours.
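For anyone checking the arithmetic (robot_expire is expressed in days):

```python
# 0.002 days works out at roughly three minutes:
expire_seconds = 0.002 * 86400
print(expire_seconds)           # -> 172.8 seconds
print(expire_seconds / 60)      # -> 2.88 minutes
```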
At the same time I suppose a 3 minute expiry limit is long enough to provide
protection against unrecognised and unruly robots causing lots of new
sessions to be spawned in quick succession - I guess this would tend to
happen over a timeframe of seconds rather than minutes, so the 3 minutes
should be sufficient to mitigate this. Is this assumption correct?
Do I understand the issue of runaway robots correctly?
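If I've understood correctly, the runaway-robot case would look something like this (same reconstructed semantics as before, not the real count_ip code; the 2-second request interval is just an assumed example):

```python
# Hypothetical runaway robot under robot_expire 0.002 (172.8 s): a
# robot requesting a new session every 2 seconds never leaves a gap
# long enough to reset the counter, so it trips RobotLimit 500 in
# well under 20 minutes.

EXPIRE_SECONDS = 0.002 * 86400  # 172.8 s
ROBOT_LIMIT = 500

count = 0
last_seen = None
tripped_at = None
for t in range(0, 1200, 2):     # one new session every 2 s
    if last_seen is not None and t - last_seen > EXPIRE_SECONDS:
        count = 0
    count += 1
    last_seen = t
    if count > ROBOT_LIMIT:
        tripped_at = t
        break

print(tripped_at)               # -> 1000: limit crossed after ~17 min
```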
Thanks for your time and help on this Mike.