[LON-CAPA-admin] server goes unresponsive?

Todd Ruskell todd.ruskell at gmail.com
Sun Oct 7 23:10:58 EDT 2012


Hi,

Our library server has been going unresponsive suddenly, without much
warning.  Looking at the logs, I see a couple of things.

First, in lonnet.log I see some entries like the following distributed
throughout the log file.  I don't know how to interpret these entries, but
they seem to indicate something isn't quite right:

Sun Oct  7 12:00:23 2012 (21138): Starting Shut down
Sun Oct  7 12:00:23 2012 (21138): %badServerCache      is 7
Sun Oct  7 12:00:23 2012 (21138): %homecache           is 13510
Sun Oct  7 12:00:23 2012 (21138): %remembered          is 7
Sun Oct  7 12:00:23 2012 (21138): kicks                is 0
Sun Oct  7 12:00:23 2012 (21138): hits                 is 451259
Sun Oct  7 12:00:23 2012 (21138): Flushing log buffers
Sun Oct  7 12:00:23 2012 (21138): Shutting down

When the system seems to go unresponsive, lonnet.log has the following
entries:

Sun Oct  7 12:07:17 2012 (21568): <font color="blue">WARNING: Trying to get
resource data for smarkoe at csm: con_lost</font>
Sun Oct  7 12:07:38 2012 (21871): <font color="blue">WARNING: Trying to get
resource data for gajohnso at csm: con_lost</font>

(The above entry is repeated several times, followed by several messages
like the following.)

Sun Oct  7 12:07:40 2012 (21871): Could not devalidate spreadsheet esease
at csm
 for
uploaded/csm/6925421bc619b4f6bcsml1/default_1316619888.sequence___3___csm/csmphyslib/P200_Materials/StudioActivities/Block2-Circuits/EnergyStoredInCapacitor/readingQuestions.problem: no_such_host con_lost
Sun Oct  7 12:07:41 2012 (21498): Could not devalidate spreadsheet jsingh
at csm
 for
uploaded/csm/6925421bc619b4f6bcsml1/default_1317053874.sequence___15___csm/csmphyslib/P200_Materials/TestBank/Current_Resistance/RCPowerUpdated.problem:
error: 100 tie(GDBM) Failed while attempting del con_lost
Sun Oct  7 12:07:41 2012 (21899): Name loncapa.brs.cgresd.net no IP found
Sun Oct  7 12:07:43 2012 (21920): Name loncapa.brs.cgresd.net no IP found
Sun Oct  7 12:08:17 2012 (21920): Name shs-dunk-lin.shs.sarasota.k12.fl.us no IP found
Sun Oct  7 12:08:18 2012 (21920): Name theoryx2.uwinnipeg.ca no IP found
Sun Oct  7 12:08:22 2012 (21871): <font color="blue">WARNING: Connection
buffer /home/httpd/sockets/delayed/1349633299.0.21871.storecsmesease.csml1:
store:csm:esease:csm_6925421bc619b4f6bcsml1:uploaded%2fcsm%2f6925421bc619b4f6bcsml1%2fdefault_1316619888%2esequence___3___csm%2fcsmphyslib%2fP200_Materials%2fStudioActivities%2fBlock2%2dCircuits%2fEnergyStoredInCapacitor%2freadingQuestions%2eproblem:resource%2echargedischarge%2eregrader=gajohnso%3acsm&resource%2echargedischarge%2eawarded=1&ip=138%2e67%2e159%2e119&resource%2echargedischarge%2esolved=correct_by_override&resource%2eexplodify%2eregrader=gajohnso%3acsm&resource%2eexplodify%2esolved=correct_by_override&resource%2eexplodify%2eawarded=1&host=csml1</font>
Sun Oct  7 12:08:23 2012 (21498): <font color="blue">WARNING: Connection
buffer
/home/httpd/sockets/delayed/1349633297.10.21498.storecsmjsingh.csml1:
store:csm:jsingh:csm_6925421bc619b4f6bcsml1:uploaded%2fcsm%2f6925421bc619b4f6bcsml1%2fdefault_1317053874%2esequence___15___csm%2fcsmphyslib%2fP200_Materials%2fTestBank%2fCurrent_Resistance%2fRCPowerUpdated%2eproblem:resource%2ePower%2esolved=incorrect_attempted&resource%2ePower%2e12%2eawarddetail=INCORRECT&resource%2ePower%2eaward=INCORRECT&ip=138%2e67%2e78%2e192&resource%2ePower%2etries=1&resource%2ePower%2e12%2esubmission=HalfCVSquared%3d2&host=csml1</font>
Sun Oct  7 12:08:25 2012 (21871): <font color="blue">WARNING: Trying to get
resource data for esease at csm: con_lost</font>

In lonc.log, I see:

Sun Oct  7 12:05:07 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:07
2012: loncapa.Mines.EDU Connection count: 7 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 8 to host loncapa.Mines.EDU</font>
Sun Oct  7 12:05:07 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:07
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:05:07 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:07
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:05:16 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:16
2012: loncapa.Mines.EDU Connection count: 7 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:05:16 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:16
2012: loncapa.Mines.EDU Connection count: 7 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:05:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:18
2012: loncapa.Mines.EDU Connection count: 6 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:05:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:18
2012: loncapa.Mines.EDU Connection count: 6 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:05:19 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:19
2012: loncapa.Mines.EDU Connection count: 5 Retries remaining: 4 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:05:19 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:05:19
2012: loncapa.Mines.EDU Connection count: 5 Retries remaining: 4 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:05:41 2012 (21849) [lc1.Mines.EDU] [Sun Oct  7 12:05:41 2012:
lc1.Mines.EDU Connection count: 1 Retries remaining: 5 (insecure)] <font
color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:06:10 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:10
2012: loncapa.Mines.EDU Connection count: 4 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:06:10 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:10
2012: loncapa.Mines.EDU Connection count: 4 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Failing transaction sethost</font>
Sun Oct  7 12:06:10 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:10
2012: loncapa.Mines.EDU Connection count: 4 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:06:10 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:10
2012: loncapa.Mines.EDU Connection count: 3 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:06:10 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:10
2012: loncapa.Mines.EDU Connection count: 3 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Failing transaction sethost</font>
Sun Oct  7 12:06:10 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:10
2012: loncapa.Mines.EDU Connection count: 3 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:06:19 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:16
2012: loncapa.Mines.EDU Connection count: 2 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 3 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:40 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:40
2012: loncapa.Mines.EDU Connection count: 3 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 4 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:40 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:40
2012: loncapa.Mines.EDU Connection count: 3 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 5 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:41 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:41
2012: loncapa.Mines.EDU Connection count: 5 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 6 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:44 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:43
2012: loncapa.Mines.EDU Connection count: 6 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 7 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:46 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:46
2012: loncapa.Mines.EDU Connection count: 7 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 8 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:47 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:47
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 9 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:47 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:47
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 5 (local)]
<font color='green'>SUCCESS: Created connection 10 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:06:49 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:49
2012: loncapa.Mines.EDU Connection count: 10 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:06:49 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:49
2012: loncapa.Mines.EDU Connection count: 10 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:06:53 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:06:53
2012: loncapa.Mines.EDU Connection count: 9 Retries remaining: 4 (local)]
<font color='green'>SUCCESS: Created connection 10 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:07:11 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:11
2012: loncapa.Mines.EDU Connection count: 10 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:07:11 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:11
2012: loncapa.Mines.EDU Connection count: 10 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:07:11 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:11
2012: loncapa.Mines.EDU Connection count: 9 Retries remaining: 5 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:07:11 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:11
2012: loncapa.Mines.EDU Connection count: 9 Retries remaining: 5 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:07:11 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:11
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 3 (local)]
<font color='green'>SUCCESS: Created connection 9 to host loncapa.Mines.EDU
</font>
Sun Oct  7 12:07:12 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:12
2012: loncapa.Mines.EDU Connection count: 9 Retries remaining: 3 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:07:12 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:12
2012: loncapa.Mines.EDU Connection count: 9 Retries remaining: 3 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:07:15 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:15
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 2 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:07:15 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:15
2012: loncapa.Mines.EDU Connection count: 8 Retries remaining: 2 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU Connection count: 7 Retries remaining: 1 (local)]
<font color='blue'>WARNING: A socket timeout was detected</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU Connection count: 7 Retries remaining: 1 (local)]
<font color='blue'>WARNING: Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU Connection count: 6 Retries remaining: 1 (local)]
<font color='red'>CRITICAL: Host marked DEAD: loncapa.Mines.EDU</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:17 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:17
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: A socket
timeout was detected</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: Shutting
down a socket</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >> DEAD <<] <font color='red'>CRITICAL: Host marked
DEAD: loncapa.Mines.EDU</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: A socket
timeout was detected</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: Shutting
down a socket</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >> DEAD <<] <font color='red'>CRITICAL: Host marked
DEAD: loncapa.Mines.EDU</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:18 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:18
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:22 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:22
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:24 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:24
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: A socket
timeout was detected</font>
Sun Oct  7 12:07:24 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:24
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: Shutting
down a socket</font>
Sun Oct  7 12:07:24 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:24
2012: loncapa.Mines.EDU >> DEAD <<] <font color='red'>CRITICAL: Host marked
DEAD: loncapa.Mines.EDU</font>
Sun Oct  7 12:07:27 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:27
2012: loncapa.Mines.EDU >>> DEAD !!!! <<<] <font color='blue'>WARNING:
Shutting down a socket</font>
Sun Oct  7 12:07:35 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:35
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: A socket
timeout was detected</font>
Sun Oct  7 12:07:35 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:35
2012: loncapa.Mines.EDU >> DEAD <<] <font color='blue'>WARNING: Shutting
down a socket</font>
Sun Oct  7 12:07:35 2012 (18608) [loncapa.Mines.EDU] [Sun Oct  7 12:07:35
2012: loncapa.Mines.EDU >> DEAD <<] <font color='red'>CRITICAL: Host marked
DEAD: loncapa.Mines.EDU</font>
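
For anyone wanting to see the pattern in an excerpt like the one above without
reading it line by line, here is a rough Python sketch that tallies lonc.log
messages per host and reports when each host was first marked DEAD. It is not
part of LON-CAPA. The function names (parse_entries, summarize) are made up
for illustration, and the regular expressions only assume that entries look
like the ones pasted in this message, including entries wrapped over several
lines.

```python
import re
from collections import Counter

# Start of an entry as it appears in the excerpt (an assumption about the
# format): "Sun Oct  7 12:05:07 2012 (18608) [loncapa.Mines.EDU] ..."
ENTRY_START = re.compile(
    r"^(?P<stamp>\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) "
    r"\((?P<pid>\d+)\) \[(?P<host>[^\]]+)\]"
)
# Severity and message text inside the <font> wrapper.
MESSAGE = re.compile(r"(SUCCESS|WARNING|CRITICAL): ([^<]*)")


def parse_entries(text):
    """Yield (timestamp, host, severity, message) for each log entry.

    Entries may be wrapped over several lines, so lines are joined until
    the next line that begins with a timestamp.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if ENTRY_START.match(line) and current:
            chunks.append(" ".join(current))
            current = []
        if line.strip():
            current.append(line.strip())
    if current:
        chunks.append(" ".join(current))

    for chunk in chunks:
        start = ENTRY_START.match(chunk)
        msg = MESSAGE.search(chunk)
        if start and msg:
            # Collapse whitespace left over from joining wrapped lines.
            message = " ".join(msg[2].split())
            yield start["stamp"], start["host"], msg[1], message


def summarize(text):
    """Return (message counts per host, first DEAD timestamp per host)."""
    counts = Counter()
    first_dead = {}
    for stamp, host, severity, message in parse_entries(text):
        counts[(host, message)] += 1
        if severity == "CRITICAL" and message.startswith("Host marked DEAD"):
            first_dead.setdefault(host, stamp)
    return counts, first_dead
```

Calling summarize() on the contents of a lonc.log file returns a Counter keyed
by (host, message) and a dict mapping each host to the timestamp of its first
"Host marked DEAD" entry, which can then be compared against load at that time.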


It strikes me as very much not good that my library server has been marked
"DEAD".  Any ideas on what can cause a host to be so marked?  It doesn't
appear to be load-based, as it seems to have happened at a variety of load
levels, including fairly low ones.  Any help you can provide would be greatly
appreciated.

Thanks,
Todd