[ic] Rolling big tables (mysql)

Tue Apr 10 12:24:16 EDT 2007

Grant wrote:
>> > How do you guys go about periodically rolling tables that get too big?
>> >  I could export the table, mv the .txt file, create a new .txt file
>> > with the appropriate header, rm the .sql file, and restart IC.  Does
>> > anyone have a better method they like?  I'm using mysql.
>> >
>> Well, it would depend upon the use to which you put your "big" tables,
>> although I have to say that your proposed deletion mechanism is far too
>> Heath Robinson for my taste. :-)
>>
>> My suggestion would be to include a date column with each row, and use
>> that as a condition of your periodic delete.  I.e. delete all rows that
>> are over x months/years old.  That'll work if the table is just used as
>> a log, or something similar, but may be inconvenient if the table holds
>> orders, or other information you'd prefer to keep on hand.  If you want
>> to dump the (to be deleted) rows to a file beforehand then you can
>> easily do so.  Again, using the date column as your selection criteria.
>>
>> You might want to save a summary of the (to be deleted) rows in
>> another table before deletion.  For instance, if you were deleting
>> orders that are over five (or whatever) years old then you could save
>> summary a row for every month.  The monthly row could summarise values
>> such as the number of orders, the total net/gross amounts, the average
>> net value of the orders and the tax revenue collected etc.  It would
>> probably take a long time for that table to be considered "big", and
>> you'd still have the requisite number of years worth of "real data" on
>> hand to keep the tax man happy.
>>
>> >From your proposed solution, it seems as if you just want to clear out
>> the table and give yourself a blank space to start over with.  In that
>> case, I wonder why you're saving rows into a table in the first place -
>> especially as you'll lose all benefit of being able to select rows from
>> that table once the periodic clearout kicks in.  If row selection is not
>> an issue, then wouldn't it be better to just save the rows into a text
>> file in the first place, and then use something like logrotate to keep
>> the file size under control?
>>
>> My view is usually that you either want the data or you don't.  If you
>> don't then either simply delete it when it gets old, or don't save it
>> in the first place.  If you do want to keep the data then either leave
>> it in the database or (if appropriate) delete it and keep a summary row
>> instead.
> 
> Hi Kevin,
> 
> Thank you for the insight.  The table in question actually logs
> traffic data from my site's visitors.  The bigger the table gets, the
> longer it takes to run a report on it.  I have a date field in there,
> and I'd like to backup and delete each row older than 30 days.  I'd
> prefer those rows to be backed up to another table as opposed to a log
> file so I can run a report on that old data if I need to.
> 
> Is there an ictag or two I can use to copy the old rows to a different
> table and delete them from the primary table?
> 
> - Grant

Hi Grant,

For something like this I would probably build a perl script to do the heavy lifting so that you are not within the constraints of the IC server and needing to complete in under PIDcheck seconds.

How many rows are in your table when it starts getting slow?  Are your fields properly indexed?

First create a table where the archived data will reside, such as "stats_archived".

Your script should:

1. Use DBD::MySQL to connect to your database and run queries.

2. Use MySQL's "SELECT INTO" with a where to restrict to only records over 30 days old, and select those records from the live table into the "stats_archived" table.

3. Then use MySQL's "DELETE FROM" with the same where to restrict the same record set and delete those from the live stats table.

4. The script then can be setup to run nightly through cron, or perhaps at the beginning of the month so that the previous month's data is available all month long.

Take it easy,

-- 
Ron Phipps
End Point Corporation
ron at endpoint.com