[ic] How to determine cause of load spikes.

DB DB at m-and-d.com
Tue Nov 7 14:15:51 EST 2006


>> I'm running IC 5.4.1 using a MySQL database of about 500,000 items on a
>> dual Xeon box with 4GB of RAM. Once in a while I'll notice the site
>> become sluggish. During these periods the CPU load is always fairly
>> high. I suspect these events are caused by an inefficient search on the
>> large products database. See below for an example 'top' output during a
>> recent event:
>> 
>> top - 11:26:03 up 31 days, 20:50,  1 user,  load average: 1.13, 0.97, 0.67
>> Tasks: 227 total,   2 running, 223 sleeping,   0 stopped,   2 zombie
>> Cpu(s): 25.1% us,  0.1% sy,  0.0% ni, 74.9% id,  0.0% wa,  0.0% hi,  0.0% si
>> Mem:   4040624k total,  3568748k used,   471876k free,   118736k buffers
>> Swap:  2031608k total,     8664k used,  2022944k free,  1453916k cached
>> 
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 26248 inter     25   0 1367m 1.2g 1880 R  100 32.4   0:47.43 interchange
>> 
>> So I can identify the offending process ID, but how can I determine what
>> this process is doing to cause such a load? If I can determine what
>> search is being run or which of my pages is being accessed then I can
>> probably correct the problem.
>> 
>> Bumping the RAM up to 4GB has drastically reduced the extent of the
>> problem, but I want to find and correct the real cause of the trouble.
>> 
>> DB
> 
> If you suspect a slow query, then try turning on the MySQL slow query
> log. Let it run for a few days, then optimize your most frequent and
> slowest queries.
> 
> Bill Carr
> Bottlenose - Wine & Spirits eBusiness Specialists
> (877) 857-6700

OK, I now have three entries in my slow query log:

Time                 Id Command    Argument
# Time: 061106 14:03:47
# User@Host: interch[interch] @ localhost []
# Query_time: 15  Lock_time: 0  Rows_sent: 530898  Rows_examined: 530898
use MDS_data;
select * from products;
# Time: 061106 16:04:19
# User@Host: interch[interch] @ localhost []
# Query_time: 11  Lock_time: 0  Rows_sent: 530898  Rows_examined: 530898
select * from products;
# Time: 061106 16:04:21
# User@Host: interch[interch] @ localhost []
# Query_time: 12  Lock_time: 0  Rows_sent: 530898  Rows_examined: 530898
select * from products;


There are indeed entries in my httpd access log at these times, but the
server is fairly busy, so many requests are logged within the same few
seconds. I'm not certain whether I should be looking at the exact times
shown in the slow query log or a few seconds on either side.
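
One way to narrow the search is to work backwards from the slow query log. As far as I know, mysqld writes a slow-log entry only after the statement has finished, while Apache's default `%t` timestamp records when the request was received. If so, the request of interest would have arrived roughly Query_time seconds before the time in the slow query log, not at that time. The Python sketch below filters access log lines to a small window around that estimated start. It assumes the common/combined log format, that both logs use the same server clock and time zone, and an English locale for month names. The function name `candidate_requests` and the `before`/`after` slack values are illustrative guesses, not anything provided by Apache or Interchange.

```python
import re
from datetime import datetime, timedelta

# Timestamp as written by Apache's common/combined log formats,
# e.g. [06/Nov/2006:14:03:32 -0500]. The UTC offset is matched but ignored.
STAMP_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def candidate_requests(access_lines, logged_at, query_time, before=5, after=2):
    """Return access log lines received near the estimated start of a slow query.

    logged_at  -- datetime taken from the "# Time:" line of the slow query log
    query_time -- the Query_time value in seconds
    before     -- assumed slack (seconds) for page work done before the query ran
    after      -- assumed slack (seconds) for rounding of the logged times
    """
    estimated_start = logged_at - timedelta(seconds=query_time)
    earliest = estimated_start - timedelta(seconds=before)
    latest = estimated_start + timedelta(seconds=after)
    hits = []
    for line in access_lines:
        m = STAMP_RE.search(line)
        if not m:
            continue  # skip lines without a recognizable timestamp
        stamp = datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S")
        if earliest <= stamp <= latest:
            hits.append(line.rstrip("\n"))
    return hits
```

For the first entry above, `candidate_requests(lines, datetime(2006, 11, 6, 14, 3, 47), 15)` would keep only requests stamped between 14:03:27 and 14:03:34. Widening `before` trades precision for a lower chance of missing a page that does a lot of work before issuing the query.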

Can anyone suggest a better way for me to identify exactly which page
request caused the "select * from products;" query to be run?

DB

