[LON-CAPA-admin] lonsql_errors revisited with more info
Hall, Jon
jhall at btcatholic.org
Wed Mar 26 11:19:43 EDT 2008
Problem overview:
For approximately the last year we have had an intermittent issue where
the /home/httpd/perl/logs/lonsql_errors file grows to occupy all
available disk space within a few minutes or hours (i.e. NOT over
several days). The problem has occurred on two completely different
systems (different physical hardware, and under both FC6 and FC7).
Recently we have seen it occur while running parse_activity, although we
have been able to run parse_activity successfully in the past. It is
unknown whether parse_activity was to blame for any but the two most
recent occurrences. The lonsql_errors file contains HUGE SQL statements:
running 'tail' on the file to output only the last 10 lines produces
21 MB of output (that's right, 10 lines of the file contain 21 MEGAbytes
of data).
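The per-line sizes behind that 21 MB figure can be checked without dumping the lines to the terminal. The following is only a minimal sketch: the function name tail_line_sizes is made up for illustration, and it assumes a POSIX shell with the standard tail and awk utilities.

```shell
# Print the length in bytes of each of the last N lines of a file.
# tail_line_sizes is a hypothetical helper name, not part of LON-CAPA.
# Usage: tail_line_sizes FILE [N]    (N defaults to 10)
tail_line_sizes() {
    file="$1"
    count="${2:-10}"
    # LC_ALL=C makes awk's length() count bytes rather than
    # multi-byte characters.
    tail -n "$count" "$file" | LC_ALL=C awk '{ print length($0) }'
}
```

For example, tail_line_sizes /home/httpd/perl/logs/lonsql_errors 10 would print ten numbers, one per line, showing whether the bulk of the data sits in one statement or is spread across all ten. Some awk implementations limit the line length they accept, so very long lines may need gawk.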
Here's an example of the last episode:
[root at lon-capa ~]# ps aux | grep parse
www 19358 96.5 3.8 87348 80056 ? R 15:14 33:31 parse_activity_log.pl: 9r136551aae81475cBTl1 at BT loading existing data
[root at lon-capa ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 19G 2.3G 16G 13% /
/dev/sda6 9.5G 1.3G 7.8G 15% /var
/dev/sda5 36G 34G 0 100% /home
/dev/sda1 99M 17M 77M 18% /boot
tmpfs 1013M 0 1013M 0% /dev/shm
[root at lon-capa ~]# cd /home/httpd/perl/logs/
[root at lon-capa logs]# ls -lh | grep lonsql_errors
-rw-r--r-- 1 www www 32G 2008-03-25 15:50 lonsql_errors
[root at lon-capa logs]# lsof | grep lonsql_errors
lonsql    12846 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12847 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12849 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12851 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
lonsql    12852 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
parse_act 19358 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
sh        19361 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
cat       19362 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
gzip      19363 www 2w REG 8,5 34288336896 5066139 /home/httpd/perl/logs/lonsql_errors
At this point we did the following to rectify the situation:
/etc/init.d/loncontrol stop
/etc/init.d/httpd stop
kill 19358 (the parse_activity_log.pl process; stopping the services above did not kill it)
rm /home/httpd/perl/logs/lonsql_errors
/etc/init.d/loncontrol start
/etc/init.d/httpd start
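Because the file fills the disk within minutes to hours, a periodic size check might give earlier warning than a full /home. This is only a sketch under stated assumptions: the function name log_exceeds and the 1 GiB threshold in the usage note are made up for illustration, and nothing here is part of LON-CAPA itself.

```shell
# Succeed (exit status 0) if FILE exists and is larger than LIMIT_BYTES;
# fail (exit status 1) otherwise.
# log_exceeds is a hypothetical helper name.
# Usage: log_exceeds FILE LIMIT_BYTES
log_exceeds() {
    file="$1"
    limit="$2"
    [ -f "$file" ] || return 1
    # Some wc implementations pad the count with spaces; strip them.
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -gt "$limit" ]
}
```

Run from cron every few minutes, a test such as log_exceeds /home/httpd/perl/logs/lonsql_errors 1073741824 could trigger a mail to the administrator before the partition fills. It only detects the growth; it does not address whatever is writing the oversized statements.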
This machine is a P4 2.4 GHz with 2 GB of RAM, 2 GB of swap, and the disk
space noted above (/home normally uses 1.8G of the 36G total, i.e. less
than 10%). The OS is Fedora (FC7) with all updates as of 3/21/08.
A gzip-compressed copy of the last 10 lines of lonsql_errors (1.5 MB
compressed, 21 MB uncompressed) is available at
http://lon-capa.btcatholic.org/bt-lonsql_errors.gz
Any help would be greatly appreciated. Thanks!