[ic] bad robots - undefined session ids?
db at m-and-d.com
Wed Dec 18 20:48:22 UTC 2013
> I've noticed that when I use top, I sometimes see IC processes with what
> looks like an undefined session id.
> Ones like this I believe should be robots:
> interchange: Store 18.104.22.168 nsession - /page.html
> Ones like this I believe should be regular users:
> interchange: Store 22.214.171.124 WydgPRgP - /page.html
> But what about ones like this?
> interchange: Store 126.96.36.199 - /scan/se=....
> Those with an undefined session id often seem to be misbehaving:
> bad/evil robots making many frequent /scan/... requests. The IPs often seem to
> be in Russia, China etc. Does anyone have thoughts on this, or on how to prevent it?
More info... Looking at my web server access log, these might be spiders
of some sort following expired 'more' scan links:
188.8.131.52 www.domain.com - [18/Dec/2013:13:50:34 -0500] "GET
HTTP/1.1" 200 30213 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT
6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET
CLR 3.0.30729; Media Center PC 6.0)"
I know this isn't a new issue, and for some time I've had 'Disallow:
/scan/' in my robots.txt. But not all robots behave. Any new ideas about
how to handle this? I know Racke will suggest moving to IC6 :)
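
To get a clearer picture of who is hammering /scan/, one option is to tally
requests per client IP straight from the access log. The following is only a
rough sketch in Python, not anything shipped with Interchange. It assumes the
log layout of the sample line above (client IP, virtual host, a dash, the
bracketed timestamp, then the quoted request). The function names, the
threshold of 100 and the log path in the comment are made up for illustration.

```python
import re
from collections import Counter

# Assumed log layout, modelled on the sample line in this post:
#   client_ip vhost - [timestamp] "METHOD path PROTOCOL" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def count_scan_hits(lines, prefix="/scan/"):
    """Return a Counter mapping client IP to the number of requests under prefix."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed layout are skipped silently.
        if match and match.group("path").startswith(prefix):
            hits[match.group("ip")] += 1
    return hits


def heavy_scanners(lines, threshold=100, prefix="/scan/"):
    """Return (ip, count) pairs at or above threshold, busiest first."""
    hits = count_scan_hits(lines, prefix)
    return [(ip, n) for ip, n in hits.most_common() if n >= threshold]


# Example use (the log path is hypothetical):
# with open("/var/log/httpd/access_log") as log:
#     for ip, n in heavy_scanners(log):
#         print(ip, n)
```

The resulting list only identifies candidates. Whether to block them at the
firewall, in the web server, or somewhere else is a separate decision, and a
fixed threshold will also catch any legitimate crawler that ignores robots.txt.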