[ic] [ic]/[mv] Large Product tables

Barney Treadway barney@ecomshare.com
Wed, 10 Jan 2001 09:33:09 -0700

Thanks Mike,

So if I understand, Scenario 1 would be one instance of ic with the 50-100
product tables but searching and all operations needing to look at all
tables would instead be performed through a master indexed table. Once a sku
has been acquired from the indexed table, then we'd drop to the individual
product tables to perform presentation and purchase operations? The master
indexed table would contain the appropriate product table name where the sku
is located, allowing products to be called directly from the appropriate
product table itself rather than scanning all tables. That would keep
performance fine to that stage, but keeping ic from doing a call to the
product tables through the entire cart process seems a major alteration
though. Then I have to consider the corresponding pricing table that might
grow to cumbersome size as well.

Scenario 2 was running a running separate instance of ic for each of the
50 - 100 tables (keeping records there down to a few hundred thousand) and
using a master indexing and search outside of ic to then point the purchase
process into the ic corresponding to the product table. The fault here with
this seems to be when products are purchased from multiple tables, things
would fall apart? Unless..... the "on-fly" directive could be employed in
this case to prevent searching all product tables and rather drop in enough
data to allow checkout without a table access? Theoretically you could then
run checkout through any of the ic instances. Concerns would be passing
enough data to allow modifiers and pricing adjustment to be employed where
appropriate. One caveat would be ic's inherent ease of writing from all
instances to a single orderline and transaction table.

I don't quite follow the "category pointer reference"...

So finding a way to attach the product table name to the sku to facilitate
direct look up through the checkout and other processes with [data table
sku] would seem a good step.

Again, thanks for the insight. I'm going to push ahead with this in some
manner and I'll endeavor to keep folks informed as to load issues and
workarounds for catalogs of this size. I'd appreciate any feedback or words
of warning/encouragement. :-)


> Quoting Barney Treadway (barney@ecomshare.com):
> > Greetings all,
> >
> > We're looking at a mysql backed interchange with estimated 3-5 million
> > product records in short order. There's a natural break up in
> products to
> > allow us to create 50-100 product tables rather than a single
> massive table.
> > This would facilitate quicker updating etc from those editing/adding
> > products (which we do via php). Does anyone have any input (advice,
> > concerns, suggestions, threats, etc) as to interchange's ability to do
> > common loops and search regions on either 1) a large number of tables
> > (50-100) or 2) tables with millions of records?
> >
> > Any responses would be greatly appreciated.
> >
> There is really no way to do this well. It would require 50 queries
> instead of one. And 50-100 is a lot of table objects to maintain. It is
> too many ProductFiles for performance to be maintained in most cases.
> It would be much faster to index a large table on a field and
> display it that
> way. This is a logical break as well -- MySQL can do very fast selects
> on a field = 'string' indexed field.
> One possibility is to place only sku, category/owner pointer, and
> short description in the large table and break the subsidiary stuff
> into separate tables that would then be selected via the category
> pointer. This would be a fast table to update, while the separate
> tables that contain the extended data could be broken into pieces.
> Still, the more tables you have the more overhead. If you want to
> place a lot of load on the catalog you would have to encode your
> display routines in embedded Perl to get real performance.