[ic] Possible bug: Too many new ID assignments for this IP address
John1
list_subscriber at yahoo.co.uk
Wed Aug 24 08:28:45 EDT 2005
On Wednesday, August 24, 2005 2:29 AM, mike at perusion.com wrote:
> Quoting John1 (list_subscriber at yahoo.co.uk):
>> In October 2004 I posted:
>> http://www.icdevgroup.org/pipermail/interchange-users/2004-October/041215.html
>> explaining what I thought was a bug which can result in *permanently*
>> blocked access to Interchange sites from ISPs who use proxy servers.
>>
>> To avoid this problem we are currently running with "RobotLimit 0",
>> so it's not really causing us a problem any more (although it would
>> be nice not to have to use RobotLimit 0).
>>
>> Here is the sub count_ip code (which is still the same as it was in
>> October 2004):
>>
>> sub count_ip {
>>     my $inc = shift;
>>     my $ip = $CGI::remote_addr;
>>     $ip =~ s/\W/_/g;
>>     my $dir = "$Vend::Cfg->{ScratchDir}/addr_ctr";
>>     mkdir $dir, 0777 unless -d $dir;
>>     my $fn = Vend::Util::get_filename($ip, 2, 1, $dir);
>>     if(-f $fn) {
>>         my $grace = $Vend::Cfg->{Limit}{robot_expire} || 1;
>>         my @st = stat(_);
>>         my $mtime = (time() - $st[9]) / 86400;
>>         if($mtime > $grace) {
>>             ::logDebug("ip $ip allowed back in due to '$mtime' > '$grace' days");
>>             unlink $fn;
>>         }
>>     }
>>     return Vend::CounterFile->new($fn)->inc() if $inc;
>>     return Vend::CounterFile->new($fn)->value();
>> }
>>
>> I believe the crux of the problem is that this code checks the last
>> *modified* time, which has the effect of *permanently* blocking large
>> ISPs that use a relatively small number of proxy servers.
>>
>> ########## snippet from my post in October 2004:
>> So, here is the problem: any IP address that is typically allocated
>> more than 1 session id in a 24-hour period will never get its addr_ctr
>> file expired. i.e. There needs to be a full 24-hour period without
>> access from the same IP address before the addr_ctr file will be
>> deleted, thus re-allowing access from that IP address. For large ISPs
>> using a relatively small number of proxy servers this may *never*
>> happen, and so access from their proxy servers is permanently blocked.
>> ##########
>
> I am perfectly willing to believe I have screwed up, but I had thought
> this had been addressed with
>
> Limit robot_expire 0.05
>
> This changes the 24-hour period to one hour. And since the first call
> is always to count_ip() without incrementing the counter (and
> therefore the mtime) the maximum lockout should be that one hour.
>
Do you mean "Since only the first call to count_ip() increments the counter
(and therefore the mtime), the maximum lockout should be that one hour"?
If I am reading the code in count_ip correctly, the addr_ctr/IP file will
only be deleted if its modified time is greater than "Limit robot_expire".
If I understand correctly, the code in sub new_session calls count_ip(1)
(and therefore updates the mtime if the addr_ctr/IP file already exists)
each time a new session is created.
Consequently, the addr_ctr/IP file will keep counting up unless there is a
*gap* of greater than "Limit robot_expire" before a new session id is
requested by the same IP address.
So if you use "Limit robot_expire 0.05", then provided there are at least 2
requests per hour for a new session id from the same IP address, the
addr_ctr/IP file will keep counting up forever.
After a few days or weeks RobotLimit will eventually be exceeded, and the
IP address will then be *permanently* locked out. By permanently I mean
until there is a gap of at least 1 hour between requests for new session
ids from the IP address in question.
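To illustrate the behaviour I am describing, here is a small self-contained
sketch (not Interchange code; the inc_counter sub is a crude stand-in for
Vend::CounterFile->inc(), and the file name is made up). Because expiry is
judged by the file's mtime, every increment resets the clock, so the "age"
never grows while requests keep arriving:

```perl
#!/usr/bin/perl
# Sketch only: shows that each counter write resets the file's mtime,
# so an mtime-based grace period never elapses under steady traffic.
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $fn  = "$dir/addr_ctr_example";    # hypothetical counter file

sub inc_counter {                     # crude stand-in for CounterFile->inc()
    my $n = 0;
    if (open my $in, '<', $fn) { $n = <$in> || 0; close $in }
    open my $out, '>', $fn or die $!;
    print {$out} $n + 1;              # every write updates the file's mtime
    close $out;
    return $n + 1;
}

my $grace = 0.05 * 86400;             # "Limit robot_expire 0.05" in seconds
for my $req (1 .. 3) {
    my $count = inc_counter();
    my $age   = time() - (stat $fn)[9];
    print "request $req: count=$count, age since last write=${age}s\n";
    # age stays near 0 here, so "$age > $grace" never becomes true
    # as long as new session requests keep coming in.
}
```

The counter climbs on every request while the measured age stays near zero,
which is exactly why RobotLimit is eventually exceeded.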
> If you have such traffic that you assign 100 legitimate IP addresses in
> an hour, it means you would have to have a much better robot defense
> than RobotLimit can supply....
>
So what I am saying above is that you don't need 100 accesses from the IP
address to maintain a lockout; you only need at least 2 each hour to
maintain the lockout situation.
> Also, a normal ISP proxy server should not see this; just if it is
> running behind a NAT. The IP address used is not the IP of the proxy
> server but the IP address of the user as sent by the proxy server.
>
I agree, but for some reason, in the UK at any rate, AOL appear to operate a
NAT proxy setup. I am not sure why they do this, but they seem to. Here are
some of the proxy servers that I found Interchange was blocking until I used
RobotLimit 0:
195.93.34.12 cache-loh-ac06.proxy.aol.com
212.100.251.149 lb1.onspeed.com
62.254.64.12 midd-cache-1.server.ntli.net
62.252.224.13 leed-cache-2.server.ntli.net
62.252.192.5 manc-cache-2.server.ntli.net
62.252.0.5 glfd-cache-2.server.ntli.net
62.254.0.14 nott-cache-2.server.ntli.net
80.5.160.4 bagu-cache-1.server.ntli.net
cache-los-ad06.proxy.aol.com
cache-los-ad02.proxy.aol.com
cache-los-ad03.proxy.aol.com
cache-los-ab04.proxy.aol.com
cache-los-ab01.proxy.aol.com
cache-los-aa02.proxy.aol.com
The ntli.net proxy servers belong to NTL who are a major UK cable TV
provider (and therefore also a large broadband provider).
> I run some pretty busy Interchange servers, and I never see trouble
> with this with the exception of NATs for fair-sized companies
> accessing their own IC server. Even then, the above "Limit" fixes the
> problem.
>
Ummm, it does seem strange that more people have not noticed this problem
(although a few postings to the list suggest that I am not alone). I can
accept that I may be jumping to the wrong conclusion about the cause (and
perhaps I have missed something about how the code works), but hopefully I
have not misunderstood.
If I have understood correctly, then perhaps one solution would be to purge
the addr_ctr directory at regular intervals, say every 24 hours. That way,
with a high enough RobotLimit and a low enough Limit robot_expire, the
addr_ctr would be purged before RobotLimit was ever exceeded. Perhaps an
AddrCtrExpire configuration directive could be added to do this?
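The purge idea could be as simple as the following sketch (my own code, not
anything in Interchange; the directory it demonstrates on is a throwaway
stand-in for ScratchDir/addr_ctr, and the hypothetical purge_addr_ctr sub
would be run from cron or a hypothetical AddrCtrExpire hook):

```perl
#!/usr/bin/perl
# Sketch: unconditionally empty an addr_ctr-style directory so that no
# counter file can accumulate past RobotLimit between purges.
use strict;
use warnings;
use File::Temp qw(tempdir);

sub purge_addr_ctr {
    my ($dir) = @_;
    my $removed = 0;
    opendir my $dh, $dir or die "cannot open $dir: $!";
    for my $entry (readdir $dh) {
        my $path = "$dir/$entry";
        $removed++ if -f $path && unlink $path;   # skips . and ..
    }
    closedir $dh;
    return $removed;
}

# Demonstrate on a throwaway directory standing in for ScratchDir/addr_ctr:
my $dir = tempdir(CLEANUP => 1);
for my $ip (qw(195_93_34_12 62_254_64_12)) {
    open my $fh, '>', "$dir/$ip" or die $!;
    print {$fh} "100\n";    # a count already past most RobotLimit settings
    close $fh;
}
my $n = purge_addr_ctr($dir);
print "purged $n counter file(s)\n";   # prints "purged 2 counter file(s)"
```

Note this flat purge ignores the 2-level subdirectory layout that
Vend::Util::get_filename creates; a real version would need to recurse.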