NAME

RobotIP — specify IP addresses or ranges that will be classified as crawler robots (search engines)

SYNOPSIS

IP_address_glob...

DESCRIPTION

The RobotIP directive defines a list of IP addresses that will be classified as crawler robots (search engines), causing Interchange to alter its behavior to improve the chance of Interchange-served content being crawled and indexed.

Note that this directive (and all other work done to identify robots) serves only to improve the way Interchange pages are indexed, and to reduce server overhead for clients that don't require the full attention a human visitor does (for example, no session information is kept for spider bots). Using it to serve different page content when a crawler visits ("cloaking") earns you no extra points, and may in fact be detected by the robot and penalized.

The directive accepts a wildcard list with shell-style globbing: * matches any number of characters, while ? matches exactly one character.

DIRECTIVE TYPE AND DEFAULT VALUE

Global directive

EXAMPLES

Example: Defining RobotIP

RobotIP <<EOR
  202.9.155.123,      204.152.191.41,         208.146.26.19,
  208.146.26.233,     209.185.141.209,        209.185.141.211,
  209.202.148.36,     209.202.148.41,         216.200.130.207,
  216.35.103.6?,      216.35.103.*,
EOR

NOTES

For more details regarding web spiders/bots and Interchange, see the robot glossary entry.

AVAILABILITY

RobotIP is available in Interchange versions:

4.6.0-5.9.0 (git-head)

SOURCE

Interchange 5.9.0:

Source: lib/Vend/Config.pm
Line 487

['RobotIP',       'list_wildcard_full', ''],

Source: lib/Vend/Config.pm
Line 3842 (context shows lines 3842-3846)

sub parse_list_wildcard_full {
my $value = get_wildcard_list(@_,1);
return '' unless length($value);
return qr/^($value)$/i;
}

AUTHORS

Interchange Development Group

SEE ALSO

RobotHost(7ic), RobotUA(7ic), NotRobotUA(7ic)
