[ic] MOD_REWRITE and query strings

interchange-users@icdevgroup.org
Sat Sep 7 16:15:02 2002


Quoting mike@perusion.com (mike@perusion.com):
> Quoting Jeff Dafoe (jeff@badtz-maru.com):
> >     The way to approach this would probably be modifications to
> > interchange's session management and it sounds like Mike already has some
> > sort of solution for interchange 4.9.
> 
> Yes. And this discussion has been good because it has allowed me to look
> at possibly defining a "MV_ROBOT_UA" variable or even a directive like
> RobotUA *global* to automatically enable mv_tmp_session on access by a
> spider matching that spec.
> 
> If someone really wants this for the stable tree and is willing to pay
> for it, any of the consultants who are a part of the ICDEVGROUP core
> team would undoubtedly be able to merge in the 4.9 changes into the
> stable tree. The amount of time would probably be two hours or less.

It is now in devel; the comment for the CVS change reads:

> * Add robot tolerance facility, where mv_tmp_session is set when either
>   a RobotUA or RobotIP wildcard matches.
> 
>   In interchange.cfg:
> 
>           RobotUA   Inktomi, Scooter, Site*Sucker
>           RobotIP   209.135.65, 64.172.5
> 
>   After that, it is all automatic. mv_tmp_session gets set to one, the
>   Scratch values mv_no_session_id and mv_no_count are set to one, and
>   normal pages don't get IDs put out by area.
> 
>   What this will do for the user:
> 
>                 1. Allow those UAs to follow a URL.
>                 2. Prevent useless session files from cluttering disk.
>                 3. Prevent session writes from inhibiting disk performance.
> 
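The matching behavior described above can be sketched in plain Python. This is only an illustration of the assumed semantics (glob-style `*` wildcards in RobotUA, dotted-quad prefixes in RobotIP), not the actual Interchange implementation, and the helper names are invented here:

```python
import re

def compile_robot_ua(spec):
    """Compile a comma-separated RobotUA spec (with * wildcards) into
    one case-insensitive regex.  Assumed semantics, not Interchange code."""
    patterns = []
    for name in spec.split(","):
        name = name.strip()
        # Escape regex metacharacters, then turn * back into .*
        patterns.append(re.escape(name).replace(r"\*", ".*"))
    return re.compile("|".join(patterns), re.IGNORECASE)

def is_robot(user_agent, remote_ip, ua_re, ip_prefixes):
    """True if either the User-Agent matches or the client IP falls
    under one of the RobotIP dotted-quad prefixes."""
    if ua_re.search(user_agent):
        return True
    return any(remote_ip == p or remote_ip.startswith(p + ".")
               for p in ip_prefixes)

# The example directives from the commit comment:
ua_re = compile_robot_ua("Inktomi, Scooter, Site*Sucker")
ips = ["209.135.65", "64.172.5"]
```

On a match, the dispatch code would then set mv_tmp_session and the mv_no_session_id / mv_no_count scratch values as described.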

And:

>   We should probably allow a Profile to be run based on a Robot match
>   too. I will think about that.
> 

All interested parties should think about what actions should happen on
a RobotUA (or RobotIP) match. I think the best candidates are a Profile
run, which would allow config variables to be set differently when the
visitor is a robot, and an Autoload macro, which would allow setting
things up to prevent logging and other intrusive operations.

Another possibility is to map each robot to specific actions, or simply
to reject certain robots outright:

RobotUA  <<EOF
 Inktomi=profile:robot,
 Scooter=profile:robot;autoload=scooterize,
 Site*Sucker=bounce:/you_turd_you.html
EOF
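Since that syntax is only a proposal in this message, here is a rough Python sketch of how such a spec might parse into a per-robot action table. The separators (`,` between robots, `;` between actions, `:` or `=` between an action and its argument) are assumptions read off the example above:

```python
import re

def parse_robot_actions(spec):
    """Parse the proposed per-robot action syntax into
    {robot_name: {action: argument}}.  Hypothetical; the directive
    and its separators are taken from the example, not from any
    shipped Interchange feature."""
    table = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # First '=' separates the robot name from its action list.
        name, _, actions = entry.partition("=")
        acts = {}
        for act in filter(None, actions.split(";")):
            # Accept either ':' or '=' between action and argument,
            # as both appear in the example.
            m = re.match(r"([^:=]+)[:=](.*)", act)
            if m:
                acts[m.group(1)] = m.group(2)
        table[name] = acts
    return table

spec = """Inktomi=profile:robot,
 Scooter=profile:robot;autoload=scooterize,
 Site*Sucker=bounce:/you_turd_you.html"""
```

A dispatcher could then run the named profile, call the Autoload macro, or bounce the request to the given page depending on which action keys are present.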

-- 
Mike Heins
Perusion -- Expert Interchange Consulting    http://www.perusion.com/
phone +1.513.523.7621      <mike@perusion.com>

Research is what I'm doing when I don't know what I'm doing.
-- Wernher von Braun