[LON-CAPA-admin] Slowness in LonCapa
Stuart Raeburn
raeburn at msu.edu
Mon Jun 22 13:47:23 EDT 2015
Hi Maged,
> Two days ago we got some reports about slowness in LonCapa
> 1. The discussion interface where we are seeing very long duration
> associated with both discussion posts and requests
I switched a session to the library server for the uiuc domain, and
submitted a discussion post to a resource in one of my courses, and
did not find that it was slow. Is this an issue on all servers in
your domain (including the library server)?
> 2. Publishing, although not for all users. The one example I have,
> is that a power user made minor edits on 11 or 12 resources and
> republished them,
For each publication event I would compare two timestamps: the
date/time recorded in the web server log for processing of the
"Phase Two" call to /adm/publish (i.e., after the "Finalize
Publication" button was pressed), and the date/time recorded in the
resource's .log file in the Authoring Space in
/home/httpd/html/priv/domain/user/, within the heading:
================= Publish ||date/time|| Phase Two ================
There is an $r->rflush call to send output to the client after
completion of the following items (logged in the .log file):
Write metadata file for ||filename||
Wrote metadata
Synchronized SQL metadata database
Removing error messages: ok
Creating old version ||number||
Copied old target to ||path||
Copied old target metadata to ||versioned metadata path||
Copied original source to ||resource path||
Copied original metadata to ||metadata path||
The corresponding output sent to the browser would end with:
Copied metadata
Thereafter, actions are added to the PerlCleanupHandler phase to
notify subscribed servers.
That phase occurs after logging to the Apache web server logs, and
after return of the response to the browser, so it should not factor
into the time delays reported. However, you might also look at the
last modification time recorded for the .log file itself (which would
be when the last subscription update response was written to the .log
file).
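
The check on the .log file described above can be sketched in Python.
This is only an illustration under stated assumptions: the heading
pattern is taken from the example heading shown earlier, the date/time
text is returned verbatim because its exact format is not shown here,
and the function names are hypothetical (they are not part of LON-CAPA).

```python
import os
import re
from datetime import datetime

# Assumed heading layout, based on the example heading in this message:
# ================= Publish <date/time> Phase Two ================
PHASE_TWO = re.compile(r"^=+\s*Publish\s+(.*?)\s+Phase Two\s*=+\s*$")


def phase_two_stamps(lines):
    """Return the raw date/time text of every Phase Two heading, in order."""
    stamps = []
    for line in lines:
        match = PHASE_TWO.match(line.strip())
        if match:
            stamps.append(match.group(1))
    return stamps


def summarize_publish_log(path):
    """Pair the Phase Two headings with the .log file's last modification time."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        stamps = phase_two_stamps(handle)
    # The modification time reflects the last write to the .log file, which
    # per the message above would be the last subscription update response.
    modified = datetime.fromtimestamp(os.path.getmtime(path))
    return stamps, modified
```

Calling summarize_publish_log() on a resource's .log file gives the
heading timestamps and the file's modification time side by side; the
web server log entry for /adm/publish still has to be looked up
separately.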
> root at library1:/home/httpd/perl/logs# grep "CRITICAL: Forking server
> for s10.lite.msu.edu" lonc.log
>
> 23 times from around 10am till 5pm
There is nothing unusual in that. It tells me that your particular
server needed to make a connection to the msu library server to
request data. This could be for a number of reasons, including
display of the Roles page by someone with a web session on your
server, who has one or more roles in the msu domain, or browsing the
msu resource space in the cross-institutional content repository.
After the CRITICAL: Forking server for s10.lite.msu.edu you should see:
SUCCESS: Created connection 1 to host s10.lite.msu.edu
INFO: Connected to lond version: 489
SUCCESS: Connection 1 to s10.lite.msu.edu now ready for action
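
If you want to confirm that each fork was followed by a ready
connection, a rough Python sketch along these lines might help. It
assumes the messages appear in lonc.log as substrings of each line, as
in the excerpts above, and that entries for a given host are not
interleaved in confusing ways; the function name is hypothetical.

```python
def unfinished_connections(lines, host):
    """Return 1-based line numbers of 'Forking server' entries for host
    that are not followed by a 'now ready for action' entry before the
    next fork for that host (or the end of the log)."""
    fork_text = f"CRITICAL: Forking server for {host}"
    ready_text = f"to {host} now ready for action"
    unfinished = []
    pending = None  # line number of a fork still waiting for its "ready" line
    for number, line in enumerate(lines, start=1):
        if fork_text in line:
            if pending is not None:
                unfinished.append(pending)
            pending = number
        elif "SUCCESS: Connection" in line and ready_text in line:
            pending = None
    if pending is not None:
        unfinished.append(pending)
    return unfinished
```

An empty list for s10.lite.msu.edu would mean every fork in the log
reached the "now ready for action" state.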
> I am assuming that the balancer should have a lot less needs for
> memory and CPU? As it only directs traffic to one of the access
> servers?
Correct, the LON-CAPA balancer requires relatively little memory and
few CPUs, since it will switch a session for an authenticated user to
another server, as determined by the configuration in your domain.
At MSU, sessions for faculty and users with author/co-author roles in
the msu domain are switched to the library server (s10) whereas other
users are switched from the balancer to the least busy of the four
access servers in the msu domain.
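
As a toy illustration of that rule (not LON-CAPA's actual code or
configuration format), the decision could be sketched in Python as
follows. The role labels, server names, and load numbers are all
hypothetical, and how "least busy" is measured is not specified here.

```python
# Hypothetical role labels for users whose sessions go to the library server.
LIBRARY_ROLES = {"faculty", "author", "co-author"}


def choose_server(roles, library_server, access_load):
    """Pick a target server for an authenticated session.

    roles: set of role labels held by the user
    library_server: name of the library server
    access_load: dict mapping access server name to a load figure
    """
    if roles & LIBRARY_ROLES:
        return library_server
    # Everyone else goes to the access server with the lowest load figure.
    return min(access_load, key=access_load.get)
```

For example, choose_server({"student"}, "s10", {"access1": 0.7,
"access2": 0.2}) picks "access2", while any user holding an author
role is sent to "s10".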
Stuart Raeburn
LON-CAPA Academic Consortium
Quoting "Abdel Messeh, Maged" <mmesseh at illinois.edu>:
> Hi Stuart,
>
> Two days ago we got some reports about slowness in LonCapa. I
> checked the server resources and nothing looked out of the ordinary.
>
> This slowness was exhibited by two behaviors:
>
> 1. The discussion interface where we are seeing very long duration
> associated with both discussion posts and requests (this behavior
> apparently existed before we upgraded to 2.11.1 and deployed our
> lonBalancer).
> 2. Publishing, although not for all users. The one example I have,
> is that a power user made minor edits on 11 or 12 resources and
> republished them, here are the times that he got from the time he
> hit "finalize publication" to the time the return page showed up
> (all in seconds):
>
> 40
> 40
> 130
> 90
> 15
> 15
> 15
> 8
> 90
> 15
> 12
>
> While there is of course some variation in complexity of the prelabs
> he was publishing, we probably should not see that much variation
> in times. The problem published was probably as complicated as any
> of the others.
>
> Looking through the logs I noticed several times with:
>
> root at library1:/home/httpd/perl/logs# grep "CRITICAL: Forking server
> for s10.lite.msu.edu" lonc.log
>
> 23 times from around 10am till 5pm
>
> I would appreciate any insights into where I can look for the cause
> of this problem.
>
>
> Also on the resource allocation, we have:
> Access servers - 12G RAM and 4vCPUs
> Library server - 10G RAM and 8vCPUs
>
> I am assuming that the balancer should have a lot less needs for
> memory and CPU? As it only directs traffic to one of the access
> servers?
>
> Many thanks,
>
> Maged