[LON-CAPA-admin] lonsql_errors revisited with more info

Hall, Jon jhall at btcatholic.org
Wed Mar 26 11:19:43 EDT 2008


Problem overview:

For approximately the last year we have had an intermittent issue where
the /home/httpd/perl/logs/lonsql_errors file grows to occupy all
available disk space within a few minutes or hours (i.e. NOT over
several days).  This has occurred on two completely different
systems (different physical hardware, and on both FC6 and FC7).  Recently
we have seen the problem occur while running parse_activity, although we
have been able to run it successfully in the past (it is unknown whether
parse_activity was to blame for any but the two most recent occurrences
of the issue).  The lonsql_errors file contains HUGE SQL statements:
running 'tail' on the file to output only the last 10 lines produces
21 MB of output (that's right, 10 lines of the file contain 21 MEGAbytes
of data).
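
For anyone wanting to look at such a file without flooding the terminal, here is a minimal sketch that prints only the length and the first few characters of each of the last lines. The function name summarize_tail and the default width of 120 characters are made up for illustration; it assumes an awk that copes with very long lines (gawk, the default awk on Fedora, does).

```shell
#!/bin/sh
# Summarize the last COUNT lines of FILE: for each line print its length
# and only its first WIDTH characters, instead of the full (possibly
# multi-megabyte) line.
summarize_tail() {
    file=$1
    count=${2:-10}     # how many trailing lines to inspect (default 10)
    width=${3:-120}    # how many characters of each line to show
    tail -n "$count" "$file" |
        awk -v w="$width" '{ printf "%d chars: %s\n", length($0), substr($0, 1, w) }'
}

# Example (path taken from the report above):
#   summarize_tail /home/httpd/perl/logs/lonsql_errors 10 200
```

The prefix of each line is usually enough to see which kind of SQL statement is being logged and which table it targets.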

 

Here's an example of the last episode:

 

[root@lon-capa ~]# ps aux | grep parse
www      19358 96.5  3.8  87348 80056 ?        R    15:14  33:31 parse_activity_log.pl: 9r136551aae81475cBTl1 at BT loading existing data

 

 

[root@lon-capa ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              19G  2.3G   16G  13% /
/dev/sda6             9.5G  1.3G  7.8G  15% /var
/dev/sda5              36G   34G     0 100% /home
/dev/sda1              99M   17M   77M  18% /boot
tmpfs                1013M     0 1013M   0% /dev/shm

 

 

[root@lon-capa ~]# cd /home/httpd/perl/logs/
[root@lon-capa logs]# ls -lh | grep lonsql_errors
-rw-r--r-- 1 www www  32G 2008-03-25 15:50 lonsql_errors

 

[root@lon-capa logs]# lsof | grep lonsql_errors
lonsql    12846       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12847       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12849       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12851       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12852       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
parse_act 19358       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
sh        19361       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
cat       19362       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
gzip      19363       www    2w      REG        8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
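
In the lsof listing the FD value 2w means file descriptor 2 (stderr) opened for writing, so each listed process, including parse_activity_log.pl and the sh, cat and gzip processes with the adjacent PIDs, has its stderr attached to this one file. A small sketch for condensing such output is below; the function name writers_of is made up for illustration, and it assumes lsof's default column layout (COMMAND first, PID second, FD fourth, NAME last) and a path without spaces.

```shell
#!/bin/sh
# Read lsof output on stdin and print "command pid fd" for every process
# that has TARGET open. Assumes the default lsof columns, where field 1 is
# COMMAND, field 2 is PID, field 4 is FD and the last field is NAME.
writers_of() {
    awk -v target="$1" '$NF == target { print $1, $2, $4 }'
}

# Usage on a live system (needs lsof, and usually root to see all processes):
#   lsof | writers_of /home/httpd/perl/logs/lonsql_errors
```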

 

At this point we did the following to rectify the situation:

 

/etc/init.d/loncontrol stop
/etc/init.d/httpd stop
kill 19358    (parse_activity_log.pl; stopping the services above did not kill it)
rm /home/httpd/perl/logs/lonsql_errors
/etc/init.d/loncontrol start
/etc/init.d/httpd start
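
As a stopgap until the cause is found, a size check along the following lines could be run from cron to flag the runaway growth before /home fills up. This is only a sketch and not part of LON-CAPA: the function name log_too_big and the 1 GiB limit are assumptions chosen for illustration.

```shell
#!/bin/sh
# Return success (0) when FILE exists and is larger than LIMIT bytes,
# and failure (1) otherwise, including when FILE is missing.
log_too_big() {
    file=$1
    limit=$2
    [ -f "$file" ] || return 1
    # $(( ... )) strips any padding that some wc implementations print.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -gt "$limit" ]
}

# Example policy (an assumption): warn once the log passes 1 GiB.
LOG=/home/httpd/perl/logs/lonsql_errors
LIMIT=$((1024 * 1024 * 1024))
if log_too_big "$LOG" "$LIMIT"; then
    echo "WARNING: $LOG is larger than $LIMIT bytes" >&2
fi
```

Note that removing a file which a process still holds open does not return the disk space until that process exits, which is why stopping the services and killing 19358 before the rm matters.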

 

This machine is a P4 2.4 GHz with 2 GB of RAM, 2 GB of swap, and the disk
space noted above (/home normally uses 1.8G of the 36G total, so less
than 10%).  The OS is Fedora (FC7) with all updates as of 3/21/08.

 

A gzip-compressed copy of the last 10 lines of lonsql_errors (1.5 MB
compressed, 21 MB uncompressed) is available at
http://lon-capa.btcatholic.org/bt-lonsql_errors.gz

 

Any help would be greatly appreciated.  Thanks!

 

 
