[LON-CAPA-admin] Access Server Problems

Stuart Raeburn raeburn at msu.edu
Thu Oct 30 02:30:05 EDT 2008


Todd,

My guess is that within your internal network, lc1.mines.edu appears  
to have an IP address of 138.67.1.77 (host 138.67.1.77 reports:  
mica.Mines.EDU) when attempting to set up a connection, whereas  
outside the network it is mapping to 138.67.38.59 (which is what host  
lc1.mines.edu says it should be).
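
One way to confirm this is to run the same forward and reverse lookup from a machine inside the network and from one outside it. The following is a minimal sketch in Python, not part of LON-CAPA; the function name check_dns and its injectable resolve/reverse parameters are assumptions made for illustration and for testing without network access.

```python
import socket


def check_dns(hostname, expected_ip,
              resolve=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Compare forward and reverse DNS for a host against an expected IP.

    resolve and reverse default to the standard library lookups; they are
    parameters only so the check can be exercised with stub functions.
    """
    forward_ip = resolve(hostname)
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list)
        reverse_name = reverse(forward_ip)[0]
    except (socket.herror, socket.gaierror):
        reverse_name = None
    return {
        "forward_ip": forward_ip,
        "reverse_name": reverse_name,
        "matches_expected": forward_ip == expected_ip,
        # Host names are case-insensitive, so compare in lower case.
        "round_trips": (reverse_name is not None
                        and reverse_name.lower() == hostname.lower()),
    }


# Example call (performs real DNS lookups when run):
# print(check_dns("lc1.mines.edu", "138.67.38.59"))
```

If the results differ between the internal and the external machine, that would support the split-DNS explanation above.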

Perhaps this is a consequence of the configuration for your virtualization?
I say this because http://lc2.mines.edu/lon-status/ includes the
following from lond.log:

Wed Oct 29 05:00:50 2008 (6782): WARNING: Rejected client 138.67.1.77, closing connection
Wed Oct 29 05:00:50 2008 (6782): CRITICAL: Disconnect from 138.67.1.77 ()
Wed Oct 29 05:00:50 2008 (9055): Child 6782 died
Wed Oct 29 05:02:50 2008 (9055):  Attempting to start child (IO::Socket::INET=GLOB(0x8d3248c))
Wed Oct 29 05:02:50 2008 (6785): WARNING: Unknown client 138.67.1.77
Wed Oct 29 05:02:50 2008 (6785): WARNING: Rejected client 138.67.1.77, closing connection
Wed Oct 29 05:02:50 2008 (6785): CRITICAL: Disconnect from 138.67.1.77 ()
Wed Oct 29 05:02:50 2008 (9055): Child 6785 died
Wed Oct 29 05:04:50 2008 (9055):  Attempting to start child \CRITICAL: Disconnect from 138.67.1.77 ()
Wed Oct 29 05:08:50 2008 (9055): Unable to determine who caller was, getpeername returned nothing
Wed Oct 29 05:08:50 2008 (9055): Unable to determine clientip
Wed Oct 29 05:08:50 2008 (9055): Child 8939 died
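
To see at a glance which client addresses lond is refusing, the log can be tallied by IP. This is a rough sketch only: the regular expression is based solely on the "Rejected client" and "Unknown client" lines quoted above, and the function name count_refused_clients is hypothetical, not a LON-CAPA utility.

```python
import re
from collections import Counter

# Matches the two WARNING forms seen in the lond.log excerpt above.
REFUSED = re.compile(
    r"WARNING: (?:Rejected|Unknown) client (\d{1,3}(?:\.\d{1,3}){3})"
)


def count_refused_clients(log_text):
    """Return a Counter mapping client IP to the number of refusal warnings."""
    return Counter(REFUSED.findall(log_text))


# Example use, assuming the log has been saved locally as lond.log:
# with open("lond.log") as fh:
#     for ip, n in count_refused_clients(fh.read()).most_common():
#         print(ip, n)
```

An address that dominates the tally and does not appear in hosts.tab is the likely culprit.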

http://loncapa.mines.edu/lon-status/ has the same in lond.log from  
September 18th (the most recent file).

Is loncron not being run nightly by cron at 5:10 am on your library
server, loncapa.mines.edu? The date at the top of this lon-status page
reads:

LON Status Report csml1
Thu Sep 18 05:10:58 2008

By contrast http://lc1.mines.edu/lon-status/ reports connections to  
servers in other domains (binghamton, uiuc) for October 29.

I verified that I could log in to csma1 with a username from the msu
domain, and view resources in the msu domain, so it seems that  
connections to/from other servers are functional, just not the  
connection to csml1.  The "Userfile repcopy failed .." entries in  
lonnet.log are expected in cases where LON-CAPA is attempting to  
retrieve a non-existent .meta file (e.g., for a .sequence file in  
uploaded/$dom/...).

If the strange IP address (138.67.1.77) is not relevant to this issue,
then I think you may need to set $DebugLevel initially to 1 in
/home/httpd/perl/loncnew on csma1 and then restart loncontrol. Look in
lonc.log for the debug messages, and increment the level by 1 up to a
maximum of 8 or 9 (with a loncontrol restart each time) until you see
something useful in lonc.log where a connection to csml1 is attempted
and fails.

You might also want to set LoadLim to a value less than 1 in
/etc/httpd/conf/loncapa.conf so that your server is not hosting
sessions from other domains while you attempt to determine why csma1
and csml1 cannot sustain a TCP/IP connection via port 5663.
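
A basic reachability test of port 5663 can be run in each direction between the two servers. The sketch below is an illustration in Python, with the helper name can_connect assumed for the example; it only checks that a TCP connection can be opened and does not speak the lond protocol.

```python
import socket


def can_connect(host, port=5663, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example, run on csma1 (and the reverse on the library server):
# print(can_connect("loncapa.mines.edu"))
```

Note that a successful connect only shows that the port is reachable. lond can still drop the connection afterwards if it does not recognize the client's IP address, as in the "Rejected client" entries above.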

The only other discrepancy I found is that csma1 has a hostname of
lc1.Mines.EDU according to dns_hosts.tab info retrieved from the
LON-CAPA DNS servers at SFU, UIUC and MSU, but it looks as though the
hostname is lc1.mines.edu in the hosts.tab file on csma1.  It doesn't
seem that this case difference should have any effect though.

Stuart Raeburn
MSU LON-CAPA group



Quoting Todd Ruskell <truskell at mines.edu>:

> Mark,
>
> Thanks for the ideas.  Totally clean start.  For better or worse, our
> server guys are experimenting with VMs running on top of VMWare's ESXi
> hypervisor, and they've started the LON-CAPA experiment with this access
> server.  As far as I know, that extra layer shouldn't cause this type of
> problem, though.
>
> Once upon a time I had attempted the security certificate setup, and
> should likely do that again, but as of right now my servers allow both
> secure and insecure inter-server communications.
>
> Todd
>
> Mark Lucas wrote:
>> Todd,
>>
>> from a non-guru, did you rebuild totally from scratch, or did you
>> restore parts of the /home/httpd directory? (I always start access
>> servers totally fresh). Did you change machines underneath?
>>
>> I've confirmed that I can log in, access resources on oucapa2, but not
>> under the csm domain. Do you have security certificates set up that
>> might have caused grief?
>>
>> Mark
>>
>>
>> On Mon, 2008-10-27 at 23:45 -0600, Todd Ruskell wrote:
>>> So, this is a strange one for me.  I'd appreciate any advice.
>>>
>>> We just rebuilt an access server, lc1.mines.edu (csma1) with centOS 5.
>>> As far as I can tell, no errors on install, no missing dependencies.
>>> loncontrol and httpd start just fine.  I *think* that ports 80, 8080,
>>> and 5663 should all be open on both machines.  Our other access server,
>>> csma2 still works fine.
>>>
>>> Problem is, my library server can't find csma1.  I'm also pretty sure
>>> that csma1 can't find the library server, either.  I've included
>>> relevant output from the access server logs, below, for a representative
>>> attempt to log in with a kerberos account.  I can't log in with an
>>> internally authenticated account, either.  I don't get anything on the
>>> library server logs.
>>>
>>> Whenever I attempt to log in, I get "Username and/or password could not
>>> be authenticated" from my browser.
>>>
>>> Now, the funny thing is that csma1 is also in spare.tab, and as soon as
>>> it came back online it started accepting logins from binghamton (I
>>> included a couple lines in lonnet.log to show this), including
>>> authorization.  So csma1 will, in fact, talk to some members of the
>>> network in some fashion.  Now, I also got a *lot* of "Userfile repcopy
>>> failed" messages appearing in lonnet.log.  Perhaps that means that I
>>> don't have full communication?
>>>
>>> But, /home/httpd/html/res is populated by resources, presumably from the
>>> chemistry courses the binghamton users were attempting to access.
>>>
>>> I patiently await the word of the almighty gurus.
>>>
>>> Thanks,
>>>
>>> Todd
>>>
>>>
>>> lonc.log:
>>> Mon Oct 27 22:58:09 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='red'>CRITICAL: Failed to make a connection with
>>> lond.</font>
>>> Mon Oct 27 22:58:09 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='blue'>WARNING: Failing transaction sethost</font>
>>> Mon Oct 27 22:58:12 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='red'>CRITICAL: Failed to make a connection with
>>> lond.</font>
>>> Mon Oct 27 22:58:16 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:58:12 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='red'>CRITICAL: Failed to make a connection with
>>> lond.</font>
>>> Mon Oct 27 22:58:16 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:58:12 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='blue'>WARNING: Failing transaction sethost</font>
>>>
>>> lonhttpd.log:
>>> 138.67.129.10 - - [27/Oct/2008:22:58:17 +0000] "GET
>>> /adm/lonInterFace/student.jpg HTTP/1.1" 200 32041
>>> "http://lc1.mines.edu/adm/authenticate" "
>>> Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.17) Gecko/20080924
>>> Ubuntu/8.04 (hardy) Firefox/2.0.0.17"
>>>
>>> lonnet.log
>>> Mon Oct 27 14:51:58 2008 (5160): Userfile repcopy failed for
>>> uploaded/binghamton/1A2474087a384482fbinghamtonl1/group_allfolders.sequence.meta
>>> Mon Oct 27 19:34:26 2008 (27182): User xu7418 at binghamton authorized
>>> by binghamtonl1
>>> Mon Oct 27 22:58:12 2008 (10233): Trying to reconnect lonc
>>> Mon Oct 27 22:58:12 2008 (10233): lonc at pid 26832 responding,   
>>> sending USR1
>>> Mon Oct 27 22:58:16 2008 (10233): User truskell at csm is unknown in
>>> authenticate
>>>
>>>
>>
>> _______________________________________________
>> LON-CAPA-admin mailing list
>> LON-CAPA-admin at mail.lon-capa.org
>> http://mail.lon-capa.org/mailman/listinfo/lon-capa-admin
>
> --
> Dr. Todd Ruskell
> Senior Lecturer, Department of Physics       Office:  Meyer Hall 326
> Colorado School of Mines                     Phone: 303-384-2080
> 1523 Illinois Street                         Fax: 303-273-3919
> Golden, CO 80401
>




