[ic] Rolling big tables (mysql)

Grant emailgrant at gmail.com
Wed Apr 11 11:49:53 EDT 2007


> > I do keep a separate table of robot UAs and match traffic rows to them
> > with op=eq to populate another table with robot IPs and non-robot IPs
> > for the day to speed up the report.  Don't you think it would be
> > slower to match/no-match each IC request to a known robot UA and write
> > to the traffic table based on that, instead of unconditionally writing
> > all requests to the traffic table?  If not, excluding the robot
> > requests from the traffic table would mean a lot less processing for
> > the report and a lot fewer records for the traffic table.
> >
> Perhaps you should create a column called "spider" in the traffic table
> and save a true or false value depending upon the [data session spider]
> value.  You can then generate reports "WHERE spider = 0", for ordinary
> users, or "WHERE spider = 1" for robots etc.  An index on the spider column
> would be nice, of course.

Have you gotten comfortable using a partial match to determine a robot
UA?  I used to use RobotUA but I ended up wanting to make exact
matches.

Indexing sounds like something I need to make use of.  Is that a
MySQL convention handled completely outside of IC?
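
For illustration, here is a minimal sketch of the "spider" column idea
quoted above.  An index is an object in the database itself and is
created with an ordinary SQL statement.  The sketch uses Python's
built-in sqlite3 module as a stand-in for MySQL so it can run anywhere.
The table name, column names and index name (traffic, ip, domain,
spider, idx_traffic_spider) are assumptions made up for the example,
not anything from an actual Interchange setup.

```python
import sqlite3


def build_traffic_db(rows):
    """Create an in-memory traffic table with an index on the spider flag.

    rows is an iterable of (ip, domain, spider) tuples, where spider is
    1 for robot requests and 0 for ordinary visitors.
    All names here are hypothetical, chosen only for this example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE traffic ("
        " ip TEXT NOT NULL,"
        " domain TEXT NOT NULL,"
        " spider INTEGER NOT NULL DEFAULT 0)"
    )
    # The index is defined once in the database; queries that filter on
    # the spider column can then use it without any change to the query.
    conn.execute("CREATE INDEX idx_traffic_spider ON traffic (spider)")
    conn.executemany(
        "INSERT INTO traffic (ip, domain, spider) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def count_by_spider(conn, spider):
    """Count traffic rows for robots (spider=1) or visitors (spider=0)."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM traffic WHERE spider = ?", (spider,)
    )
    return cur.fetchone()[0]
```

The same CREATE INDEX statement form is accepted by MySQL, so the report
queries would stay as "WHERE spider = 0" or "WHERE spider = 1".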

> Then again, I wouldn't save traffic data to a table anyway.  I'd use
> usertrack and/or the apache access_log for that.  There are lots of tools
> that will allow you to analyse Apache log files.  You can even save some
> Interchange usertrack info into a custom Apache access_log file.

All of my domains run from the same catalog and I don't use
directories or query strings in the URL so I need something that will
allow me to track the domain involved with each request.  Can I save
the domain into a custom access_log?  If so, do you know of an
analyzer that would allow me to report on that domain info?

Alternatively, the usertrack file might work if I can save and report
on the domain there.  How can I parse a flat log file like that?
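
As a rough sketch of parsing a flat log by domain: the code below
assumes each line starts with the host name, followed by the usual
Common Log Format fields.  Apache can write such a line with a custom
LogFormat that begins with %v (the virtual host's server name) or
%{Host}i (the Host request header).  That layout is an assumption for
the example; the usertrack file would need its own pattern, and its
layout is not assumed here.  The function and field names are
hypothetical.

```python
import re
from collections import Counter

# Assumed layout: host, client IP, identd, user, [time], "request",
# status, size.  Lines that do not fit this layout are skipped.
LINE_RE = re.compile(
    r'^(?P<domain>\S+) (?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_requests_by_domain(lines):
    """Return a Counter mapping each lower-cased domain to its request count."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not in the assumed format
        counts[match.group("domain").lower()] += 1
    return counts
```

It could be fed an open file object, for example
count_requests_by_domain(open("access_log")), since iterating a file
yields one line at a time.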

- Grant


More information about the interchange-users mailing list