[LON-CAPA-cvs] cvs: loncom / lond

foxr lon-capa-cvs-allow@mail.lon-capa.org
Tue, 07 Oct 2008 10:08:12 -0000


This is a MIME encoded message

--foxr1223374092
Content-Type: text/plain

foxr		Tue Oct  7 06:08:12 2008 EDT

  Modified files:              
    /loncom	lond 
  Log:
  Documented log messages that can be emitted as POD entries.
  
  
  
--foxr1223374092
Content-Type: text/plain
Content-Disposition: attachment; filename="foxr-20081007060812.txt"

Index: loncom/lond
diff -u loncom/lond:1.408 loncom/lond:1.409
--- loncom/lond:1.408	Fri Sep  5 20:47:13 2008
+++ loncom/lond	Tue Oct  7 06:08:06 2008
@@ -2,7 +2,7 @@
 # The LearningOnline Network
 # lond "LON Daemon" Server (port "LOND" 5663)
 #
-# $Id: lond,v 1.408 2008/09/06 00:47:13 raeburn Exp $
+# $Id: lond,v 1.409 2008/10/07 10:08:06 foxr Exp $
 #
 # Copyright Michigan State University Board of Trustees
 #
@@ -59,7 +59,7 @@
 my $status='';
 my $lastlog='';
 
-my $VERSION='$Revision: 1.408 $'; #' stupid emacs
+my $VERSION='$Revision: 1.409 $'; #' stupid emacs
 my $remoteVERSION;
 my $currenthostid="default";
 my $currentdomainid;
@@ -7118,3 +7118,406 @@
 Server/Process
 
 =cut
+
+
+=pod
+
+=head1 LOG MESSAGES
+
+The messages below can be emitted in the lond log.  This log is located
+in ~httpd/perl/logs/lond.log.  Many log messages have HTML encapsulation
+to provide coloring if examined from inside a web page. Some do not.
+Where color is used, the colors are: Red for something to get excited
+about and to follow up on, Yellow for something to keep an eye on to
+be sure it does not get worse, and Green and Blue for informational items.
+
+In the discussions below, sometimes reference is made to ~httpd
+when describing file locations.  There isn't really an httpd
+user; however, there is an httpd directory that gets installed in the
+place that user home directories go.  On Linux, this is usually
+(always?) /home/httpd.
+
+
+Some messages are colorless.  These are usually (not always)
+Green/Blue color level messages.
+
+=over 2
+
+=item (Red)  LocalConnection rejecting non local: <ip> ne 127.0.0.1
+
+A local connection negotiation was attempted by
+a host whose IP address was not 127.0.0.1.
+The socket is closed and the child will exit.
+lond has three ways to establish an encryption
+key with a client:
+
+=over 2
+
+=item local 
+
+The key is written and read from a file.
+This is only valid for connections from localhost.
+
+=item insecure 
+
+The key is generated by the server and
+transmitted to the client.
+
+=item  ssl (secure)
+
+An ssl connection is negotiated with the client,
+the key is generated by the server and sent to the 
+client across this ssl connection before the
+ssl connection is terminated and clear text
+transmission resumes.
+
+=back
+
+=item (Red) LocalConnection: caller is insane! init = <init> and type = <type>
+
+The client is local but has not sent an initialization
+string that is the literal "init:local".  The connection
+is closed and the child exits.
+
+=item (Red) CRITICAL Can't get key file <error>
+
+SSL key negotiation is being attempted but the call to
+lonssl::KeyFile failed.  This usually means that the
+configuration file is not correctly defining or protecting
+the directories/files lonCertificateDirectory or
+lonnetPrivateKey.
+<error> is a string that describes the reason that
+the key file could not be located.
+
+=item (Red) CRITICAL  Can't get certificates <error>  
+
+SSL key negotiation failed because we were not able to retrieve our certificate
+or the CA's certificate in the call to lonssl::CertificateFile.
+<error> is the textual reason this failed.  Usual reasons:
+
+=over 2
+       
+=item Apache config file for loncapa  incorrect:
+ 
+One of the variables
+lonCertificateDirectory, lonnetCertificateAuthority, or lonnetCertificate
+is undefined or incorrect.
+
+=item Permission error:
+
+The directory pointed to by lonCertificateDirectory is not readable by lond.
+
+=item Permission error:
+
+Files in the directory pointed to by lonCertificateDirectory are not readable by lond.
+
+=item Installation error:
+
+Either the certificate authority file or the certificate has not
+been installed in lonCertificateDirectory.
+
+=back
+
+=item (Red) CRITICAL SSL Socket promotion failed:  <err> 
+
+The promotion of the connection from plaintext to SSL failed.
+<err> is the reason for the failure.  There are two calls involved
+in the promotion (one of which failed): a dup to produce a second fd
+on the raw socket over which the encrypted data will flow, and
+IO::Socket::SSL->new_from_fd, which creates the SSL connection on the duped fd.
+
+=item (Blue)   WARNING client did not respond to challenge 
+
+This occurs on an insecure (non SSL) connection negotiation request.
+lond generates a number from the time and the PID and sends it to
+the client.  The client must respond by echoing this information back.
+If the client does not do so, that's a violation of the challenge
+protocol and the connection attempt fails.
+
+=item (Red) No manager table. Nobody can manage!!    
+
+lond has the concept of privileged hosts that
+can perform remote management functions such
+as updating hosts.tab.  The manager hosts
+are described in the
+~httpd/lonTabs/managers.tab file.
+This message is logged if this file is missing.
+
+
+=item (Green) Registering manager <dnsname> as <cluster_name> with <ipaddress>
+
+Reports the successful parse and registration
+of a specific manager. 
+
+=item (Green) existing host <clustername:dnsname>
+
+The manager host is already defined in hosts.tab.
+The information in that table, rather than the info in the
+manager table, will be used to determine the manager's IP.
+
+=item (Red) Unable to create <filename>
+
+lond has been asked to create new versions of an administrative
+file (by a manager).  When this is done, the new file is created
+in a temp file and then renamed into place so that there are always
+usable administrative files, even if the update fails.  This failure
+message means that the temp file could not be created.
+The update is abandoned, and the old file is available for use.
+
+=item (Green) CopyFile from <oldname> to <newname> failed
+
+In an update of administrative files, the copy of the existing file to a
+backup file failed.  The installation of the new file may still succeed,
+but there will not be a backup file to revert to (this should probably
+be yellow).
+
+=item (Green) Pushfile: backed up <oldname> to <newname>
+
+See above, the backup of the old administrative file succeeded.
+
+=item (Red)  Pushfile: Unable to install <filename> <reason>
+
+The new administrative file could not be installed.  In this case,
+the old administrative file is still in use.
+
+=item (Green) Installed new <filename>.
+
+The new administrative file was successfully installed.
+
+=item (Red) Reinitializing lond pid=<pid>                    
+
+The lonc child process <pid> will be sent a USR2 
+signal.
+
+=item (Red) Reinitializing self                                    
+
+We've been asked to re-read our administrative files, and
+are doing so.
+
+=item (Yellow) error:Invalid process identifier <ident>  
+
+A reinit command was received, but the target part of the 
+command was not valid.  It must be either
+'lond' or 'lonc' but was <ident>
+
+=item (Green) isValideditCommand checking: Command = <command> Key = <key> newline = <newline>
+
+Checking to see if lond has been handed a valid edit
+command.  It is possible that the edit command is not valid;
+in that case there are no log messages to indicate that.
+
+=item Result of password change for  <username> pwchange_success
+
+The password for <username> was
+successfully changed.
+
+=item Unable to open <user> passwd to change password
+
+Could not rewrite the
+internal password file for a user.
+
+=item Result of password change for <user> : <result>
+                                                                     
+A Unix password change for <user> was attempted
+and the pipe returned <result>.
+
+=item LWP GET: <message> for <fname> (<remoteurl>)
+
+The LWP (libwww-perl) fetch for a resource failed
+with <message>.  The local filename that should
+have existed/been created was <fname>; the
+corresponding URI was <remoteurl>.  This is emitted in several
+places.
+
+=item Unable to move <transname> to <destname>     
+
+From fetch_user_file_handler - the user file was replicated but could not
+be mv'd to its final location.
+
+=item Looking for <domain> <username>              
+
+From user_has_session_handler - This should be a Debug call instead.
+It indicates lond is about to check whether the specified user has a
+session active on the specified domain on the local host.
+
+=item Client <ip> (<name>) hanging up: <input>     
+
+lond has been asked to exit by its client.  The <ip> and <name> identify the
+client system and <input> is the full exit command sent to the server.
+
+=item (Red) CRITICAL: ABNORMAL EXIT. child <pid> for server <hostname> died through a crash with this error->[<message>].
+
+A lond child terminated.  Note that this termination can also occur when the
+child receives the QUIT or DIE signals.  <pid> is the process id of the child,
+<hostname> the host lond is working for, and <message> the reason the child died
+to the best of our ability to get it (I would guess that any numeric value
+represents an errno value).  This is immediately followed by:
+
+=item  Famous last words: Catching exception - <log> 
+
+Where <log> is some recent information about the state of the child.
+
+=item (Red) CRITICAL: TIME OUT <pid>
+
+Some timeout occurred for server <pid>.  This is normally a timeout on an LWP
+doing an HTTP::GET.
+
+=item child <pid> died                              
+
+The reaper caught a SIGCHLD for the lond child process <pid>.
+This should be modified to also display the IP of the dying child,
+$children{$pid}.
+
+=item Unknown child 0 died
+
+A child died but the wait for it returned a pid of zero, which really should not ever happen.
+
+=item Child <which> - <pid> looks like we missed it's death 
+
+When a SIGCHLD is received, the reaper process checks all children to see if they are
+alive.  If children are dying quite quickly, the lack of signal queuing can mean
+that a signal heralds the death of more than one child.  If so, this message indicates
+which other one died. <which> is the IP of a dead child.
+
+=item Free socket: <shutdownretval>                
+
+The HUNTSMAN sub was called due to a SIGINT in a child process.  The socket is being shut down.
+For whatever reason, <shutdownretval> is printed but in fact shutdown() is not documented
+to return anything. This is followed by:
+
+=item (Red) CRITICAL: Shutting down
+
+Just prior to exit.
+
+=item Free socket: <shutdownretval>                 
+
+The HUPSMAN sub was called due to a SIGHUP.  All children get killed, and lond execs itself.
+This is followed by:
+
+=item (Red) CRITICAL: Restarting                         
+
+lond is about to exec itself to restart.
+
+=item (Blue) Updating connections                        
+
+(In response to a USR2).  All the children (except the one for localhost)
+are about to be killed, the hosts tab reread, and Apache reloaded via apachereload.
+
+=item (Blue) UpdateHosts killing child <pid> for ip <ip>   
+
+Due to USR2 as above.
+
+=item (Green) keeping child for ip <ip> (pid = <pid>)    
+
+In response to USR2 as above, the child indicated is not being restarted because
+it's assumed that we'll always need a child for the localhost.
+
+
+=item Going to check on the children                
+
+Parent is about to check on the health of the child processes.
+Note that this is in response to a USR1 sent to the parent lond.
+There may be one or more of the next two messages:
+
+=item <pid> is dead                                 
+
+A child that we have in our child hash as alive has evidently died.
+
+=item  Child <pid> did not respond                   
+
+In the health check the child <pid> did not update/produce a pid_.txt
+file when sent its USR1 signal.  That process is killed with a 9 signal, as it's
+assumed to be hung in some un-fixable way.
+
+=item Finished checking children                   
+ 
+Master process's USR1 processing is complete.
+
+=item (Red) CRITICAL: ------- Starting ------            
+
+(There are more '-'s on either side).  lond has forked itself off to
+form a new session and is about to start actual initialization.
+
+=item (Green) Attempting to start child (<client>)       
+
+Started a new child process for <client>.  <client> is the IO::Socket object
+connected to the child.  This was as a result of a TCP/IP connection from a client.
+
+=item Unable to determine who caller was, getpeername returned nothing
+                                                  
+In child process initialization.  Either getpeername returned undef or
+a zero sized object was returned.  Processing continues, but in my opinion,
+this should be cause for the child to exit.
+
+=item Unable to determine clientip                  
+
+In child process initialization.  The peer address from getpeername was not defined.
+The client address is stored as "Unavailable" and processing continues.
+
+=item (Yellow) INFO: Connection <ip> <name> connection type = <type>
+                                                  
+In child initialization.  A good connection was received from <ip>.
+
+=over 2
+
+=item <name> 
+
+is the name of the client from hosts.tab.
+
+=item <type> 
+
+is the connection type, which is one of:
+
+=over 2
+
+=item manager 
+
+The connection is from a manager node, not in hosts.tab.
+
+=item client  
+
+The connection is from a non-manager in the hosts.tab.
+
+=item both
+
+The connection is from a manager in the hosts.tab.
+
+=back
+
+=back
+
+=item (Blue) Certificates not installed -- trying insecure auth
+
+One of the certificate file, key file, or
+certificate authority file could not be found for a client attempting
+SSL connection initiation.  Connection will be attempted in insecure mode.
+(This would be a system with an up-to-date lond that has not gotten a
+certificate from us.)
+
+=item (Green)  Successful local authentication            
+
+A local connection successfully negotiated the encryption key. 
+In this case the IDEA key is in a file (that is hopefully well protected).
+
+=item (Green) Successful ssl authentication with <client>  
+
+The client (<client> is the peer's name in hosts.tab), has successfully
+negotiated an SSL connection with this child process.
+
+=item (Green) Successful insecure authentication with <client>
+                                                   
+
+The client has successfully negotiated an insecure connection with the child process.
+
+=item (Yellow) Attempted insecure connection disallowed    
+
+The client attempted and failed to negotiate an insecure
+connection.  This can happen either because the variable londAllowInsecure is false
+or undefined, or because the child did not successfully echo back the challenge
+string.
+
+
+=back
+
+
+=cut

--foxr1223374092--