[LON-CAPA-admin] Failed to flush log buffer
raeburn at msu.edu
Wed Mar 6 12:50:48 EST 2019
In lonnet.pm, the flushcourselogs() routine is called in two situations.
1. When &courselog() is called *and* the length of the value in the %courselogs hash for the current courseID exceeds 4 kB.
2. When an Apache child is being shutdown (either because it has reached the max requests limit, or Apache is being reloaded/restarted etc.)
If the response received from the course's home server when attempting to send it the log command is anything other than 'ok', you'll see: "Failed to flush log buffer" in lonnet.log.
In case (1) this is not a problem because, the value saved for the current courseID key in the %courselogs hash is left as is, and will accumulate additional information for logging, until the next time &flushcourselogs() is called.
If the course's home server repeatedly responds to the log command with something other than 'ok', eventually the value in the hash will exceed the maximum size (40 kB), and an additional item will be logged in lonnet.log: "WARNING: Buffer for <courseID> exceeded maximum size, deleting".
If that happens, then the 40kB of course activity log data will not be stored in the activity.log file on the course's home server. If you encounter instances where a course's activity log appears to be missing items, which are present in the submissions recorded for a course, then that could be the cause, (e.g., see your post to the admin list from a couple of years ago -- mail.lon-capa.org/pipermail/lon-capa-admin/2017-July/003275.html ).
However, the more likely scenario, is that there was a temporary connection problem, and all the activity log data will eventually be sent soon after the reported failure.
Note: storage of student submissions etc. on the student's homeserver are handled differently. Storage of those are "critical" transactions, which means if they can not be completed they will be queued, and then re-sent once connection between the access node and the library node is re-established. The lonnet.perm.log file in /home/httpd/perl/logs will contain information about that type of delayed transaction ("D" for delayed, and "S" for sent, when later completed), so you might look in that file on your access server to see if there were any delayed transactions around the time of the: "Mar 2 15:33:17" failed flush of course logs from that server.
Looking at lonnet.log.N on the MSU nodes for past few days I see some instances of "Failed to flush log buffer" recorded on 3/5 between 1:50 am and 5:10 am for there courses in the uiuc domain: STAT 100 Spring 2019, CHEM 102C Spring 2019, and CHEM 103 Spring 2018.
Those logged messages are most likely the result of the maintenance on the uiuc domain servers on 3/5, advertised on the log-in page for your library server, which states: "LON-CAPA will be offline from 12:30-1:00 AM the morning of March 5th for maintenance".
LON-CAPA Academic Consortium
From: LON-CAPA-admin <lon-capa-admin-bounces at mail.lon-capa.org> on behalf of Bynum, Lee Hamilton <leebynum at illinois.edu>
Sent: Monday, March 4, 2019 5:03:30 PM
To: 'lon-capa-admin at mail.lon-capa.org'
Subject: [LON-CAPA-admin] Failed to flush log buffer
I’m doing some debugging and came across this in lonnet.log.
/home/httpd/perl/logs/lonnet.log.2:Sat Mar 2 15:33:17 2019 (77990): Failed to flush log buffer for uiuc_9K59659dde42f5c88uiuclibrary1
I’m seeing that it is generated by flushcourselogs() in lonnet.pm and indicates that a course transaction faile to be logged. However, it also seems to be related to last transaction times.
The log buffer not being flushed seems relatively minor, but I’m wondering if the last transaction times that are being interacted with in this function might have something to do with my Sudden Closings bug. Those involve course resources becoming unavailable isolated students across courses. I’m going to start tracing down that rabbit hole, but I wanted to reach out to see if anyone else has thoughts.
More information about the LON-CAPA-admin