[LON-CAPA-admin] Access Server Problems

Yuri Csapo ycsapo at mines.edu
Fri Oct 31 10:57:06 EDT 2008


Oops, sorry!

Todd Ruskell wrote:
> Stuart,
> 
> Yes, I did figure it out yesterday, but didn't have a chance to post back.
> 
> Boy, am I embarrassed, though. . . it turned out that the folks who
> built the box had put an entry for our library server in /etc/hosts, but
> had accidentally exchanged an 8 for a 6 in the IP address, which
> depending on fonts is really easy to miss when trying to troubleshoot,
> especially when the two access servers *do* have a 6 for that digit. :(
> 
> Thanks for your help.  I now know a bit more about trying to
> troubleshoot loncapa connections, should I have to do it again in the
> future.
> 
> Todd
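
A one-digit slip in /etc/hosts like the one described above is hard to see by eye, so a mechanical comparison can help. The following is only a minimal sketch: the function names, the hostname library.example.edu and the 192.0.2.x addresses are hypothetical placeholders (the thread does not give the real library server address), and it assumes the usual hosts-file layout of an IP followed by one or more names.

```python
def parse_hosts(text):
    """Return {lowercased hostname: ip} from hosts-file style text."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            entries[name.lower()] = ip
    return entries


def find_mismatches(hosts_text, expected):
    """Compare hosts-file entries with the IPs we expect.

    `expected` maps hostname -> IP believed to be correct (for example,
    taken from DNS or from a server that is known to work).
    Returns a list of (hostname, ip_in_hosts_file, expected_ip).
    """
    hosts = parse_hosts(hosts_text)
    mismatches = []
    for name, want in expected.items():
        have = hosts.get(name.lower())  # hostnames are case-insensitive
        if have is not None and have != want:
            mismatches.append((name, have, want))
    return mismatches


# Hypothetical sample data; in practice read the text from /etc/hosts.
SAMPLE_HOSTS = """\
127.0.0.1    localhost
192.0.2.18   library.example.edu library   # note the 8
"""
EXPECTED = {"library.example.edu": "192.0.2.16"}

for name, have, want in find_mismatches(SAMPLE_HOSTS, EXPECTED):
    print(f"{name}: hosts file says {have}, expected {want}")
```

Running the same check on a working access server and on the broken one would make a difference of this kind stand out.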
> 
> Stuart Raeburn wrote:
>> Todd,
>>
>> I can now browse /res/csm when logged into http://lc1.mines.edu so you
>> were evidently able to resolve the connection issue between your access
>> server and the csm library server.
>>
>> Stuart Raeburn
>> MSU LON-CAPA group
>>
>> Quoting Stuart Raeburn <raeburn at msu.edu>:
>>
>>> Todd,
>>>
>>> My guess is that within your internal network, lc1.mines.edu appears to
>>> have an IP address of 138.67.1.77 (host 138.67.1.77 reports:
>>> mica.Mines.EDU) when attempting to set up a connection, whereas outside
>>> the network it is mapping to 138.67.38.59 (which is what host
>>> lc1.mines.edu says it should be).
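
The forward/reverse comparison described above (what the host command reports in each direction) can be sketched in a few lines. This is an illustration only, assuming the standard library resolver calls socket.gethostbyname and socket.gethostbyaddr; the function name check_forward_reverse is made up for this example. A mismatch is not always an error, since aliases are common, but it is worth a look.

```python
import socket


def check_forward_reverse(hostname, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, reverse-resolve that IP, compare names.

    Returns (ip, reverse_name, consistent). The resolver functions are
    parameters so the check can be exercised without network access.
    """
    ip = resolve(hostname)
    reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, ips)
    consistent = (reverse_name.lower().rstrip(".")
                  == hostname.lower().rstrip("."))
    return ip, reverse_name, consistent


if __name__ == "__main__":
    import sys
    for host in sys.argv[1:]:
        try:
            ip, rname, ok = check_forward_reverse(host)
        except OSError as err:  # lookup failures (gaierror, herror)
            print(f"{host}: lookup failed: {err}")
        else:
            print(f"{host} -> {ip} -> {rname} ({'ok' if ok else 'MISMATCH'})")
```

Because the answer depends on which resolver is asked, the script would need to be run both on the server itself and from a machine outside the network to reproduce the difference discussed here.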
>>>
>>> Perhaps this is a consequence of the configuration for your
>>> virtualization? I say this because http://lc2.mines.edu/lon-status/
>>> includes the following from lond.log:
>>>
>>> Wed Oct 29 05:00:50 2008 (6782): WARNING: Rejected client 138.67.1.77, closing connection
>>> Wed Oct 29 05:00:50 2008 (6782): CRITICAL: Disconnect from 138.67.1.77 ()
>>> Wed Oct 29 05:00:50 2008 (9055): Child 6782 died
>>> Wed Oct 29 05:02:50 2008 (9055):  Attempting to start child (IO::Socket::INET=GLOB(0x8d3248c))
>>> Wed Oct 29 05:02:50 2008 (6785): WARNING: Unknown client 138.67.1.77
>>> Wed Oct 29 05:02:50 2008 (6785): WARNING: Rejected client 138.67.1.77, closing connection
>>> Wed Oct 29 05:02:50 2008 (6785): CRITICAL: Disconnect from 138.67.1.77 ()
>>> Wed Oct 29 05:02:50 2008 (9055): Child 6785 died
>>> Wed Oct 29 05:04:50 2008 (9055):  Attempting to start child \CRITICAL: Disconnect from 138.67.1.77 ()
>>> Wed Oct 29 05:08:50 2008 (9055): Unable to determine who caller was, getpeername returned nothing
>>> Wed Oct 29 05:08:50 2008 (9055): Unable to determine clientip
>>> Wed Oct 29 05:08:50 2008 (9055): Child 8939 died
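
To see at a glance which addresses lond is turning away, the "Rejected client" entries quoted above can be tallied. This is a rough sketch, assuming only the message wording visible in this excerpt; the function name is invented for the example, and the log file location is not given in this thread, so the text is passed in as a string.

```python
import re
from collections import Counter

# Matches the wording seen in the lond.log excerpt: "Rejected client <IPv4>"
REJECTED_RE = re.compile(r"Rejected client (\d{1,3}(?:\.\d{1,3}){3})")


def count_rejected_clients(log_text):
    """Count 'Rejected client' entries per IP address.

    Whitespace is collapsed first so that entries re-wrapped by a mail
    client or a status page are still matched.
    """
    flat = " ".join(log_text.split())
    return Counter(REJECTED_RE.findall(flat))


# Lines taken from the excerpt above.
SAMPLE_LOG = """\
Wed Oct 29 05:00:50 2008 (6782): WARNING: Rejected client 138.67.1.77, closing connection
Wed Oct 29 05:02:50 2008 (6785): WARNING: Unknown client 138.67.1.77
Wed Oct 29 05:02:50 2008 (6785): WARNING: Rejected client 138.67.1.77, closing connection
"""

for ip, count in count_rejected_clients(SAMPLE_LOG).most_common():
    print(f"{ip}: rejected {count} time(s)")
```

An address in this tally that does not belong to any known LON-CAPA host is a hint that the peer is arriving from an unexpected IP.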
>>>
>>> http://loncapa.mines.edu/lon-status/ has the same in lond.log from
>>> September 18th (the most recent file).
>>>
>>> Is loncron not being run nightly by cron at 5:10 am on your library
>>> server, loncapa.mines.edu? The date at the top of this lon-status page
>>> reads:
>>>
>>> LON Status Report csml1
>>> Thu Sep 18 05:10:58 2008
>>>
>>> By contrast http://lc1.mines.edu/lon-status/ reports connections to
>>> servers in other domains (binghamton, uiuc) for October 29.
>>>
>>> I verified that I could login to csma1 with a username from the msu
>>> domain, and view resources in the msu domain, so it seems that
>>> connections to/from other servers are functional, just not the
>>> connection to csml1.  The "Userfile repcopy failed .." entries in
>>> lonnet.log are expected in cases where LON-CAPA is attempting to
>>> retrieve a non-existent .meta file (e.g., for a .sequence file in
>>> uploaded/$dom/...).
>>>
>>> If the strange IP address (138.67.1.77) is not relevant to this issue,
>>> then I think you may need to set $DebugLevel initially to 1 in
>>> /home/httpd/perl/loncnew on csma1 and then restart loncontrol. Look in
>>> lonc.log for the debug messages, and increment the level by 1 up to a
>>> maximum of 8 or 9 (with loncontrol restarts) until you see something
>>> useful in lonc.log where a connection is attempted with csml1 and
>>> fails.
>>>
>>> You might also want to set LoadLim to a value less than 1 in
>>> /etc/httpd/conf/loncapa.conf so that your server is not hosting
>>> sessions from other domains while you attempt to determine why csma1
>>> and csml1 can not sustain a TCP/IP connection via port 5663.
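
Before raising debug levels, it may be worth confirming that a plain TCP connection to port 5663 can be opened at all from each side. The sketch below uses only the standard socket module; the function name can_connect and the 5-second timeout are arbitrary choices for this example. A successful connect only shows that the port is reachable: lond may still reject the client afterwards, as in the log excerpt earlier in this thread.

```python
import socket


def can_connect(host, port=5663, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name lookup failed
        return False


if __name__ == "__main__":
    import sys
    for host in sys.argv[1:]:
        state = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:5663 {state}")
```

Run on csma1 with the library server's hostname as the argument, and on the library server with lc1.mines.edu, this would show whether the failure is at the network level or later in the exchange.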
>>>
>>> The only other discrepancy I found is that csma1 has a hostname of
>>> lc1.Mines.EDU according to dns_hosts.tab info retrieved from the
>>> LON-CAPA DNS servers at SFU, UIUC and MSU, but it looks as though the
>>> hostname is lc1.mines.edu in the hosts.tab file on csma1.  It doesn't
>>> seem that this case difference should have any effect though.
>>>
>>> Stuart Raeburn
>>> MSU LON-CAPA group
>>>
>>>
>>>
>>> Quoting Todd Ruskell <truskell at mines.edu>:
>>>
>>>> Mark,
>>>>
>>>> Thanks for the ideas.  Totally clean start.  For better or worse, our
>>>> server guys are experimenting with VMs running on top of VMWare's ESXi
>>>> hypervisor, and they've started the LON-CAPA experiment with this access
>>>> server.  As far as I know, that extra layer shouldn't cause this type of
>>>> problem, though.
>>>>
>>>> Once upon a time I had attempted the security certificate setup, and
>>>> should likely do that again, but as of right now my servers allow both
>>>> secure and insecure inter-server communications.
>>>>
>>>> Todd
>>>>
>>>> Mark Lucas wrote:
>>>>> Todd,
>>>>>
>>>>> from a non-guru, did you rebuild totally from scratch, or did you
>>>>> restore parts of the /home/httpd directory? (I always start access
>>>>> servers totally fresh). Did you change machines underneath?
>>>>>
>>>>> I've confirmed that I can log in, access resources on oucapa2, but not
>>>>> under the csm domain. Do you have security certificates set up that
>>>>> might have caused grief?
>>>>>
>>>>> Mark
>>>>>
>>>>>
>>>>> On Mon, 2008-10-27 at 23:45 -0600, Todd Ruskell wrote:
>>>>>> So, this is a strange one for me.  I'd appreciate any advice.
>>>>>>
>>>>>> We just rebuilt an access server, lc1.mines.edu (csma1), with CentOS 5.
>>>>>> As far as I can tell, no errors on install, no missing dependencies.
>>>>>> loncontrol and httpd start just fine.  I *think* that ports 80, 8080,
>>>>>> and 5663 should all be open on both machines.  Our other access server,
>>>>>> csma2, still works fine.
>>>>>>
>>>>>> Problem is, my library server can't find csma1.  I'm also pretty sure
>>>>>> that csma1 can't find the library server, either.  I've included
>>>>>> relevant output from the access server logs, below, for a representative
>>>>>> attempt to log in with a kerberos account.  I can't log in with an
>>>>>> internally authenticated account, either.  I don't get anything on the
>>>>>> library server logs.
>>>>>>
>>>>>> Whenever I attempt to log in, I get "Username and/or password could not
>>>>>> be authenticated" from my browser.
>>>>>>
>>>>>> Now, the funny thing is that csma1 is also in spare.tab, and as soon as
>>>>>> it came back online it started accepting logins from binghamton (I
>>>>>> included a couple lines in lonnet.log to show this), including
>>>>>> authorization.  So csma1 will, in fact, talk to some members of the
>>>>>> network in some fashion.  Now, I also got a *lot* of "Userfile repcopy
>>>>>> failed" messages appearing in lonnet.log.  Perhaps that means that I
>>>>>> don't have full communication?
>>>>>>
>>>>>> But, /home/httpd/html/res is populated by resources, presumably from the
>>>>>> chemistry courses the binghamton users were attempting to access.
>>>>>>
>>>>>> I patiently await the word of the almighty gurus.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Todd
>>>>>>
>>>>>>
>>>>>> lonc.log:
>>>>>> Mon Oct 27 22:58:09 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining: 5 ()] <font color='red'>CRITICAL: Failed to make a connection with lond.</font>
>>>>>> Mon Oct 27 22:58:09 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining: 5 ()] <font color='blue'>WARNING: Failing transaction sethost</font>
>>>>>> Mon Oct 27 22:58:12 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27 22:29:21 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining: 5 ()] <font color='red'>CRITICAL: Failed to make a connection with lond.</font>
>>>>>> Mon Oct 27 22:58:16 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27 22:58:12 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining: 5 ()] <font color='red'>CRITICAL: Failed to make a connection with lond.</font>
>>>>>> Mon Oct 27 22:58:16 2008 (26966) [loncapa.Mines.EDU] [Mon Oct 27 22:58:12 2008: loncapa.Mines.EDU Connection count: 0 Retries remaining: 5 ()] <font color='blue'>WARNING: Failing transaction sethost</font>
>>>>>>
>>>>>> lonhttpd.log:
>>>>>> 138.67.129.10 - - [27/Oct/2008:22:58:17 +0000] "GET
>>>>>> /adm/lonInterFace/student.jpg HTTP/1.1" 200 32041
>>>>>> "http://lc1.mines.edu/adm/authenticate" "
>>>>>> Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.17) Gecko/20080924
>>>>>> Ubuntu/8.04 (hardy) Firefox/2.0.0.17"
>>>>>>
>>>>>> lonnet.log:
>>>>>> Mon Oct 27 14:51:58 2008 (5160): Userfile repcopy failed for uploaded/binghamton/1A2474087a384482fbinghamtonl1/group_allfolders.sequence.meta
>>>>>> Mon Oct 27 19:34:26 2008 (27182): User xu7418 at binghamton authorized by binghamtonl1
>>>>>> Mon Oct 27 22:58:12 2008 (10233): Trying to reconnect lonc
>>>>>> Mon Oct 27 22:58:12 2008 (10233): lonc at pid 26832 responding, sending USR1
>>>>>> Mon Oct 27 22:58:16 2008 (10233): User truskell at csm is unknown in authenticate
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> LON-CAPA-admin mailing list
>>>>> LON-CAPA-admin at mail.lon-capa.org
>>>>> http://mail.lon-capa.org/mailman/listinfo/lon-capa-admin
>>>> -- 
>>>> Dr. Todd Ruskell
>>>> Senior Lecturer, Department of Physics       Office:  Meyer Hall 326
>>>> Colorado School of Mines                     Phone: 303-384-2080
>>>> 1523 Illinois Street                         Fax: 303-273-3919
>>>> Golden, CO 80401
>>>>
>>
>>
> 

-- 
Yuri Csapo
Academic Computing & Networking
Colorado School of Mines
CT-256
Phone:  (303) 273-3503
Fax:      (303) 273-3475
Email:   ycsapo at mines.edu

Please use the following link to open a service request:
http://helpdesk.mines.edu
===========================================
With a PC, I always felt limited
by the software available.
On Unix, I am limited only by my knowledge.
--Peter J. Schoenster