[ic] RFC: $Vend::Robot and BounceRobotSessionURL
David Christensen
david at endpoint.com
Wed Oct 7 21:01:05 UTC 2009
Folks,
I'm seeking comments and code review on the following two patches
(available in my GitHub repo):
"Add new $Vend::Robot variable to track when we're dealing with an
actual RobotUA":
http://github.com/machack666/interchange/commit/44c7b91e2596ae5e4d76b29446c94346b04693d8
"Add BounceRobotSessionURL directive":
http://github.com/machack666/interchange/commit/3da6fb97b4dc9b7b871864247342d5ae88929a2b
The full diffs are included below.
Regards,
David
----- 8< -----
commit 44c7b91e2596ae5e4d76b29446c94346b04693d8
Author: David Christensen <david at endpoint.com>
Date: Wed Oct 7 14:45:52 2009 -0500
Add new $Vend::Robot variable to track when we're dealing with an
actual RobotUA
This allows distinguishing between CGI-provided mv_tmp_session and
actual robot usage, which just happens to set mv_tmp_session as a
consequence.
diff --git a/lib/Vend/Server.pm b/lib/Vend/Server.pm
index ebbb7f3..a61d317 100644
--- a/lib/Vend/Server.pm
+++ b/lib/Vend/Server.pm
@@ -288,7 +288,7 @@ EOF
#::logDebug("Check robot UA=$Global::RobotUA IP=$Global::RobotIP");
if ($Global::RobotIP and $CGI::remote_addr =~ $Global::RobotIP) {
#::logDebug("It is a robot by IP!");
- $CGI::values{mv_tmp_session} = 1;
+ $Vend::Robot = 1;
}
elsif ($Global::HostnameLookups && $Global::RobotHost) {
if (!$CGI::remote_host && $CGI::remote_addr) {
@@ -297,18 +297,20 @@ EOF
}
if ($CGI::remote_host && $CGI::remote_host =~ $Global::RobotHost) {
#::logDebug("It is a robot by host!");
- $CGI::values{mv_tmp_session} = 1;
+ $Vend::Robot = 1;
}
}
- unless ($CGI::values{mv_tmp_session}) {
+ unless ($Vend::Robot) {
if ($Global::NotRobotUA and $CGI::useragent =~ $Global::NotRobotUA) {
# do nothing
}
elsif ($Global::RobotUA and $CGI::useragent =~ $Global::RobotUA) {
#::logDebug("It is a robot by UA!");
- $CGI::values{mv_tmp_session} = 1;
+ $Vend::Robot = 1;
}
}
+
+ $CGI::values{mv_tmp_session} = 1 if $Vend::Robot;
}
# This is called by parse_multipart
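
For reviewers, the detection order in the diff above can be sketched as a standalone script. This is only an illustration under stated assumptions: `is_robot`, `classify`, and the `%cfg` regexes are hypothetical names and values, not Interchange API, and the sketch leaves out the `HostnameLookups` test and the reverse DNS lookup that the real code performs.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the RobotIP, RobotHost, NotRobotUA and RobotUA
# settings; in Interchange these come from the global configuration.
my %cfg = (
    RobotIP    => qr/^192\.0\.2\./,
    RobotHost  => qr/\.crawler\.example$/,
    NotRobotUA => qr/FriendlyBrowser/,
    RobotUA    => qr/bot|spider/i,
);

# Return 1 if the request looks like a robot, 0 otherwise. The checks run in
# the same order as the patched code: IP, then hostname, then user agent,
# with NotRobotUA taking precedence over RobotUA.
sub is_robot {
    my ($cfg, $req) = @_;
    my $addr = $req->{remote_addr} || '';
    my $host = $req->{remote_host} || '';
    my $ua   = $req->{useragent}   || '';

    return 1 if $cfg->{RobotIP}    and $addr =~ $cfg->{RobotIP};
    return 1 if $cfg->{RobotHost}  and $host and $host =~ $cfg->{RobotHost};
    return 0 if $cfg->{NotRobotUA} and $ua =~ $cfg->{NotRobotUA};
    return 1 if $cfg->{RobotUA}    and $ua =~ $cfg->{RobotUA};
    return 0;
}

# Mirror the patch's final step: a robot always gets mv_tmp_session, but
# mv_tmp_session on its own no longer implies a robot.
sub classify {
    my ($cfg, $req, $values) = @_;
    my $robot = is_robot($cfg, $req);
    $values->{mv_tmp_session} = 1 if $robot;
    return $robot;
}

# A browser that passes mv_tmp_session=1 itself is not flagged as a robot.
my %browser_values = (mv_tmp_session => 1);
my $browser = classify(\%cfg,
    { remote_addr => '198.51.100.7', useragent => 'FriendlyBrowser/1.0' },
    \%browser_values);

# A crawler matched by user agent is flagged and gets mv_tmp_session set.
my %crawler_values;
my $crawler = classify(\%cfg,
    { remote_addr => '198.51.100.8', useragent => 'ExampleBot/2.0' },
    \%crawler_values);

print "browser: robot=$browser tmp=$browser_values{mv_tmp_session}\n";
print "crawler: robot=$crawler tmp=$crawler_values{mv_tmp_session}\n";
```

Both requests end up with mv_tmp_session set, but only the crawler carries the robot flag, which is the distinction the commit message describes.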
----- 8< -----
commit 3da6fb97b4dc9b7b871864247342d5ae88929a2b
Author: David Christensen <david at endpoint.com>
Date: Wed Oct 7 12:24:52 2009 -0500
Add BounceRobotSessionURL directive
Add BounceRobotSessionURL directive to 301 redirect robots which
provide an explicit mv_session_id to the canonical page URL without
the explicit mv_session_id. This prevents search engine urls from
being indexed with an explicit session_id.
diff --git a/WHATSNEW-5.7 b/WHATSNEW-5.7
index 37286fd..cc07588 100644
--- a/WHATSNEW-5.7
+++ b/WHATSNEW-5.7
@@ -14,6 +14,11 @@ Interchange 5.7.2 released 2009-09-17.
Core
----
+* Add BounceRobotSessionURL directive to 301 redirect robots which
+ provide an explicit mv_session_id to the canonical page URL without
+ the explicit mv_session_id. This prevents search engine urls from
+ being indexed with an explicit session_id.
+
* Close remote disclosure security vulnerability, and added new configuration
option AllowRemoteSearch to selectively re-enable remote searches on "safe"
tables. Defaults to products, variants and options.
diff --git a/lib/Vend/Config.pm b/lib/Vend/Config.pm
index 1468211..2ba2175 100644
--- a/lib/Vend/Config.pm
+++ b/lib/Vend/Config.pm
@@ -713,6 +713,7 @@ sub catalog_directives {
['UserTrack', 'yesno', 'no'],
['DebugHost', 'ip_address_regexp', ''],
['BounceReferrals', 'yesno', 'no'],
+ ['BounceRobotSessionURL', 'yesno', 'no'],
['OrderCleanup', 'routine_array', ''],
['SessionCookieSecure', 'yesno', 'no'],
['SessionHashLength', 'integer', 1],
diff --git a/lib/Vend/Dispatch.pm b/lib/Vend/Dispatch.pm
index caf3415..9acb588 100644
--- a/lib/Vend/Dispatch.pm
+++ b/lib/Vend/Dispatch.pm
@@ -1244,6 +1244,8 @@ sub dispatch {
$sessionid = $CGI::values{mv_session_id} || undef
and $sessionid =~ s/\0.*//s;
+ my $orig_sessionid = $sessionid; # save for robot check with explicit session id
+
$::Instance->{CookieName} = $Vend::Cfg->{CookieName};
if($CGI::values{mv_tmp_session}) {
@@ -1552,7 +1554,8 @@ EOF
);
}
- if ($new_source and $CGI::request_method eq 'GET' and $Vend::Cfg->{BounceReferrals}) {
+ if (($new_source and $CGI::request_method eq 'GET' and $Vend::Cfg->{BounceReferrals})
+ or ($Vend::Robot and $orig_sessionid and $Vend::Cfg->{BounceRobotSessionURL})) {
my $path = $CGI::path_info;
$path =~ s:^/::;
my $form =
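
To make the new condition easier to review, here is a minimal sketch of the redirect decision as a pure function. `should_bounce` and its argument names are hypothetical and exist only for illustration. The sketch covers only the test shown in the diff, not the construction of the redirect URL, which the hunk above does not show in full.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the patched condition in dispatch; the
# argument names are illustrative and are not Interchange API.
sub should_bounce {
    my (%arg) = @_;
    my $cfg = $arg{cfg} || {};

    # Existing behaviour: bounce new referrals that arrive via GET.
    return 1 if $arg{new_source}
        and ($arg{request_method} || '') eq 'GET'
        and $cfg->{BounceReferrals};

    # New behaviour: bounce robots that supplied an explicit session id.
    return 1 if $arg{robot}
        and $arg{orig_sessionid}
        and $cfg->{BounceRobotSessionURL};

    return 0;
}

my %cfg = (BounceReferrals => 0, BounceRobotSessionURL => 1);

# Robot with mv_session_id in the URL: redirected.
my $robot_with_id = should_bounce(cfg => \%cfg, robot => 1,
    orig_sessionid => 'abc123', request_method => 'GET');

# Robot without an explicit session id: served normally.
my $robot_without_id = should_bounce(cfg => \%cfg, robot => 1,
    orig_sessionid => undef, request_method => 'GET');

# Ordinary visitor with a session id: served normally.
my $visitor_with_id = should_bounce(cfg => \%cfg, robot => 0,
    orig_sessionid => 'abc123', request_method => 'GET');

print "robot+id=$robot_with_id robot=$robot_without_id visitor=$visitor_with_id\n";
```

The directive is registered in catalog_directives as a yesno value defaulting to no, so a catalog would opt in with a line such as `BounceRobotSessionURL Yes`. Note that, as written in the diff, the robot clause does not test the request method, unlike the BounceReferrals clause.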
--
David Christensen
End Point Corporation
david at endpoint.com