[LON-CAPA-admin] Access Server Problems
Stuart Raeburn
raeburn at msu.edu
Thu Oct 30 02:30:05 EDT 2008
Todd,
My guess is that within your internal network, lc1.mines.edu appears
to have an IP address of 138.67.1.77 (host 138.67.1.77 reports:
mica.Mines.EDU) when attempting to set up a connection, whereas
outside the network it is mapping to 138.67.38.59 (which is what host
lc1.mines.edu says it should be).
Perhaps this is a consequence of the configuration for your virtualization?
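One way to confirm this is to run the same forward and reverse lookup from a machine inside your network and from one outside, and compare the results. The following is only a rough sketch of that check: the function names are my own invention, not part of LON-CAPA, and the resolver functions are passed in as parameters purely so the logic can be exercised without touching DNS.

```python
import socket


def forward_and_reverse(hostname, resolve=socket.gethostbyname,
                        reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name) for hostname.

    reverse_name is None when the reverse lookup fails.
    """
    ip = resolve(hostname)
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list)
        reverse_name = reverse(ip)[0]
    except (socket.herror, socket.gaierror):
        reverse_name = None
    return ip, reverse_name


def names_agree(hostname, reverse_name):
    """Compare two DNS names, ignoring case and any trailing dot."""
    if reverse_name is None:
        return False
    return hostname.rstrip(".").lower() == reverse_name.rstrip(".").lower()
```

Calling forward_and_reverse("lc1.mines.edu") on csml1 and again on an outside host, then passing the result to names_agree, should show whether the two views of DNS differ.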
I say this because http://lc2.mines.edu/lon-status/ includes the
following from lond.log:
Wed Oct 29 05:00:50 2008 (6782): WARNING: Rejected client 138.67.1.77, closing connection
Wed Oct 29 05:00:50 2008 (6782): CRITICAL: Disconnect from 138.67.1.77 ()
Wed Oct 29 05:00:50 2008 (9055): Child 6782 died
Wed Oct 29 05:02:50 2008 (9055): Attempting to start child (IO::Socket::INET=GLOB(0x8d3248c))
Wed Oct 29 05:02:50 2008 (6785): WARNING: Unknown client 138.67.1.77
Wed Oct 29 05:02:50 2008 (6785): WARNING: Rejected client 138.67.1.77, closing connection
Wed Oct 29 05:02:50 2008 (6785): CRITICAL: Disconnect from 138.67.1.77 ()
Wed Oct 29 05:02:50 2008 (9055): Child 6785 died
Wed Oct 29 05:04:50 2008 (9055): Attempting to start child \CRITICAL: Disconnect from 138.67.1.77 ()
Wed Oct 29 05:08:50 2008 (9055): Unable to determine who caller was, getpeername returned nothing
Wed Oct 29 05:08:50 2008 (9055): Unable to determine clientip
Wed Oct 29 05:08:50 2008 (9055): Child 8939 died
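If you want to see how often, and from which addresses, lond is refusing connections, the log can be tallied with a short script. This is a minimal sketch that assumes the "WARNING: Unknown client" and "WARNING: Rejected client" wording shown in the excerpt above. The function name is hypothetical, and the pattern tolerates line wrapping between words.

```python
import re
from collections import Counter

# Matches the two refusal messages seen in lond.log; \s+ allows wrapped lines.
REFUSED_RE = re.compile(
    r"WARNING:\s+(Unknown|Rejected)\s+client\s+(\d{1,3}(?:\.\d{1,3}){3})"
)


def tally_refused_clients(log_text):
    """Count refusal messages per (ip, kind) pair in the given log text."""
    counts = Counter()
    for kind, ip in REFUSED_RE.findall(log_text):
        counts[(ip, kind)] += 1
    return counts
```

Reading the contents of lond.log on the library server into a string and passing it to tally_refused_clients would show whether 138.67.1.77 is the only address being turned away.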
http://loncapa.mines.edu/lon-status/ has the same in lond.log from
September 18th (the most recent file).
Is loncron not being run nightly by cron at 5:10 am on your library
server, loncapa.mines.edu? The date at the top of this lon-status page
reads:
LON Status Report csml1
Thu Sep 18 05:10:58 2008
By contrast http://lc1.mines.edu/lon-status/ reports connections to
servers in other domains (binghamton, uiuc) for October 29.
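To judge whether a lon-status report is stale, the date line at the top of the page can be compared against the current time. The sketch below assumes the timestamp format shown above (e.g. "Thu Sep 18 05:10:58 2008") and an English locale for the day and month names. The 36-hour threshold is my own arbitrary allowance for a nightly job, not a LON-CAPA setting.

```python
from datetime import datetime, timedelta

# Format of the date line at the top of the lon-status page.
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def report_age(stamp, now):
    """Return how long ago the report timestamp was, as a timedelta."""
    return now - datetime.strptime(stamp, STAMP_FORMAT)


def looks_stale(stamp, now, max_age=timedelta(hours=36)):
    """True if the report is older than max_age (assumed threshold)."""
    return report_age(stamp, now) > max_age
```

For example, looks_stale("Thu Sep 18 05:10:58 2008", datetime.now()) would flag the csml1 report, which suggests loncron has not completed a run on that server since that date.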
I verified that I could login to csma1 with a username from the msu
domain, and view resources in the msu domain, so it seems that
connections to/from other servers are functional, just not the
connection to csml1. The "Userfile repcopy failed .." entries in
lonnet.log are expected in cases where LON-CAPA is attempting to
retrieve a non-existent .meta file (e.g., for a .sequence file in
uploaded/$dom/...).
If the strange IP address (138.67.1.77) is not relevant to this issue
then I think you may need to set the $DebugLevel initially to 1 in
/home/httpd/perl/loncnew on csma1 and then restart loncontrol. Look in
lonc.log for the debug messages, and increment the level by 1 up to a
maximum of 8 or 9 (with loncontrol restarts) until you see something
useful in lonc.log where a connection is attempted with csml1 and
fails.
You might also want to set LoadLim to a value less than 1 in
/etc/httpd/conf/loncapa.conf so that your server is not hosting
sessions from other domains while you attempt to determine why csma1
and csml1 cannot sustain a TCP/IP connection via port 5663.
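A basic reachability test on port 5663 can be run from each server toward the other before digging into the debug output. This is a minimal sketch with a hypothetical function name. Note that a successful TCP connect only shows the port is open: lond may still drop the connection afterwards if it does not recognize the client's IP address.

```python
import socket


def can_connect(host, port=5663, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running can_connect("loncapa.mines.edu") on csma1 and can_connect("lc1.mines.edu") on csml1 would at least rule a firewall or routing problem in or out.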
The only other discrepancy I found is that csma1 has a hostname of
lc1.Mines.EDU according to dns_hosts.tab info retrieved from the
LON-CAPA DNS servers at SFU, UIUC and MSU, but it looks as though the
hostname is lc1.mines.edu in the hosts.tab file on csma1. It doesn't seem
that this case difference should have any effect though.
Stuart Raeburn
MSU LON-CAPA group
Quoting Todd Ruskell <truskell at mines.edu>:
> Mark,
>
> Thanks for the ideas. Totally clean start. For better or worse, our
> server guys are experimenting with VMs running on top of VMWare's ESXi
> hypervisor, and they've started the LON-CAPA experiment with this access
> server. As far as I know, that extra layer shouldn't cause this type of
> problem, though.
>
> Once upon a time I had attempted the security certificate setup, and
> should likely do that again, but as of right now my servers allow both
> secure and insecure inter-server communications.
>
> Todd
>
> Mark Lucas wrote:
>> Todd,
>>
>> from a non-guru, did you rebuild totally from scratch, or did you
>> restore parts of the /home/httpd directory? (I always start access
>> servers totally fresh). Did you change machines underneath?
>>
>> I've confirmed that I can log in, access resources on oucapa2, but not
>> under the csm domain. Do you have security certificates set up that
>> might have caused grief?
>>
>> Mark
>>
>>
>> On Mon, 2008-10-27 at 23:45 -0600, Todd Ruskell wrote:
>>> So, this is a strange one for me. I'd appreciate any advice.
>>>
>>> We just rebuilt an access server, lc1.mines.edu (csma1) with centOS 5.
>>> As far as I can tell, no errors on install, no missing dependencies.
>>> loncontrol and httpd start just fine. I *think* that ports 80, 8080,
>>> and 5663 should all be open on both machines. Our other access server,
>>> csma2 still works fine.
>>>
>>> Problem is, my library server can't find csma1. I'm also pretty sure
>>> that csma1 can't find the library server, either. I've included
>>> relevant output from the access server logs, below, for a representative
>>> attempt to log in with a kerberos account. I can't log in with an
>>> internally authenticated account, either. I don't get anything on the
>>> library server logs.
>>>
>>> Whenever I attempt to log in, I get "Username and/or password could not
>>> be authenticated" from my browser.
>>>
>>> Now, the funny thing is that csma1 is also in spare.tab, and as soon as
>>> it came back online it started accepting logins from binghamton (I
>>> included a couple lines in lonnet.log to show this), including
>>> authorization. So csma1 will, in fact, talk to some members of the
>>> network in some fashion. Now, I also got a *lot* of "Userfile repcopy
>>> failed" messages appearing in lonnet.log. Perhaps that means that I
>>> don't have full communication?
>>>
>>> But, /home/httpd/html/res is populated by resources, presumably from the
>>> chemistry courses the binghamton users were attempting to access.
>>>
>>> I patiently await the word of the almighty gurus.
>>>
>>> Thanks,
>>>
>>> Todd
>>>
>>>
>>> lonc.log:
>>> Mon Oct 27 22:58:09 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='red'>CRITICAL: Failed to make a connection with
>>> lond.</font>
>>> Mon Oct 27 22:58:09 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='blue'>WARNING: Failing transaction sethost</font>
>>> Mon Oct 27 22:58:12 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='red'>CRITICAL: Failed to make a connection with
>>> lond.</font>
>>> Mon Oct 27 22:58:16 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:58:12 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='red'>CRITICAL: Failed to make a connection with
>>> lond.</font>
>>> Mon Oct 27 22:58:16 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27
>>> 22:58:12 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining:
>>> 5 ()] <font color='blue'>WARNING: Failing transaction sethost</font>
>>>
>>> lonhttpd.log:
>>> 138.67.129.10 - - [27/Oct/2008:22:58:17 +0000] "GET
>>> /adm/lonInterFace/student.jpg HTTP/1.1" 200 32041
>>> "http://lc1.mines.edu/adm/authenticate" "
>>> Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.17) Gecko/20080924
>>> Ubuntu/8.04 (hardy) Firefox/2.0.0.17"
>>>
>>> lonnet.log
>>> Mon Oct 27 14:51:58 2008 (5160): Userfile repcopy failed for
>>> uploaded/binghamton/1A2474087a384482fbinghamtonl1/group_allfolders.sequence.meta
>>> Mon Oct 27 19:34:26 2008 (27182): User xu7418 at binghamton authorized
>>> by binghamtonl1
>>> Mon Oct 27 22:58:12 2008 (10233): Trying to reconnect lonc
>>> Mon Oct 27 22:58:12 2008 (10233): lonc at pid 26832 responding,
>>> sending USR1
>>> Mon Oct 27 22:58:16 2008 (10233): User truskell at csm is unknown in
>>> authenticate
>>>
>>>
>>
>> _______________________________________________
>> LON-CAPA-admin mailing list
>> LON-CAPA-admin at mail.lon-capa.org
>> http://mail.lon-capa.org/mailman/listinfo/lon-capa-admin
>
> --
> Dr. Todd Ruskell
> Senior Lecturer, Department of Physics Office: Meyer Hall 326
> Colorado School of Mines Phone: 303-384-2080
> 1523 Illinois Street Fax: 303-273-3919
> Golden, CO 80401