[LON-CAPA-dev] lond..non-preforking version.
Wed, 15 Jan 2003 07:59:33 -0500
-----BEGIN PGP SIGNED MESSAGE-----
I have committed a non-preforking version of lond: a very
preliminary test version. Much still needs to be cleaned up as well
as tested. It's currently running on lonkashy (otherwise known as
nscll1). I would appreciate any bashing people can do against it.
A little history about why non-preforking:
Preforking daemons are generally used, and useful, when the
connection rate is high. The idea is to amortize process creation
overhead over several connections, and so turn it into a
small effect in the total connection timing. In LonCAPA, as we all
know, lonc is there to maintain a (mostly) persistent connection.
This means that the time-averaged connection rate is in rough numbers
Ok, so that may mean that preforking was an unnecessary design
choice, but why fix it if it wasn't broken? Two reasons:
- - It is broken. There are evidently errors in the maintenance of the
pre-forked servers, as shown by Guy's observation that the ELHS lond
at some point got down to a single child and refused to spawn any more.
The maintenance of a child population, while seemingly simple, has
a few subtle issues related to signals and how (un)reliable they may
be under some circumstances. Fixing the problem is probably harder
than just removing the issue.
- - The work I have next involves removing the request
serialization that occurs
on the single lonc/lond connection that each system with
To do this will require at some point that lond not know a priori
how many children it will spawn off. Modifications to a preforking
server along those lines are possible, but are even more complex than
- - I like simple designs where possible: they are more understandable,
more maintainable, and more likely to work. Complexity can always come
later (ok, that's three reasons).
A simple rundown of the changes required:
- - lond's main loop of sleeping and forking children into the prefork
- - the main loop now consists of accepting connections and passing
them to a thinly modified make_child.
- - the thinly modified make_child forks, and captures the child
before so child exits can be logged.
- - The child process is lightly modified: instead of accepting
connections and then validating them, it validates the single
connection it is handed by the parent and does transactions along that
connection until the peer exits, at which time it too exits...
resulting in logs (after all at this
time lonc's hold completely persistent and completely reliable
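
The changes above can be sketched roughly as follows. This is a minimal
illustration in Python, not the actual lond code (lond is Perl): the names
handle_connection, serve and reap are made up for the example, and a plain
echo stands in for "validate the connection, then do transactions".

```python
import os
import socket


def handle_connection(conn):
    """Child side: serve the one connection handed over by the parent."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:            # peer closed, so this child is done
                break
            conn.sendall(data)      # stand-in for validation + transactions


def reap(children, exit_log, block=False):
    """Collect exited children and record (pid, peer, exit code)."""
    while children:
        try:
            pid, status = os.waitpid(-1, 0 if block else os.WNOHANG)
        except ChildProcessError:   # no child processes left at all
            break
        if pid == 0:                # children exist, none has exited yet
            break
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
        exit_log.append((pid, children.pop(pid, None), code))


def serve(listener, max_connections=None):
    """Parent side: accept, fork one child per connection, log exits.

    max_connections exists only so the sketch can terminate; a real
    daemon would loop forever.
    """
    children = {}                   # pid -> peer address
    exit_log = []
    accepted = 0
    while max_connections is None or accepted < max_connections:
        conn, peer = listener.accept()
        accepted += 1
        pid = os.fork()
        if pid == 0:                # child process
            code = 1
            try:
                listener.close()    # the child never accepts
                handle_connection(conn)
                code = 0
            finally:
                os._exit(code)
        conn.close()                # parent keeps only the listener
        children[pid] = peer        # remember the pid so the exit can be logged
        reap(children, exit_log)
    return children, exit_log
```

Note that nothing here caps the number of children: every accepted
connection costs one process for as long as the peer holds it open.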
As long as the parent process runs, new children will be created on
demand with no limit. Note that this implies, technically, a
hole for a DOS attack:
If I want to bring a Lon-CAPA server down, all I really need to do is
write a program/script that keeps making connections to it on the
lond port, and holds them without sending any data. Since the
challenge/response sequence has no timeout associated with it, each
lond will stall. Eventually I'll use up either the number of sockets
or the number of processes the system is allowed to create, and the
system as a whole will stall. Note that this attack must
either come from a host that is in the hosts.tab file or
alternatively from a system that is spoofing such a host's IP.
A remedy to consider for later would be:
- - Put a timeout on the challenge/response dialog.
- - Log timeouts globally.
- - Feed back the timeout information to lond the master.
- - Refuse connections, for some long time, from a host that has
timed out more than x times in a row without success.
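
One way the remedy could look, as a minimal Python sketch under stated
assumptions rather than a design for lond: TimeoutTracker and read_response
are hypothetical names, the thresholds are arbitrary, and how a child would
feed a timeout back to the master (for example via its exit status) is left
open.

```python
import socket
import time


class TimeoutTracker:
    """Master-side count of consecutive challenge/response timeouts per host."""

    def __init__(self, max_timeouts=3, ban_seconds=600.0, clock=time.monotonic):
        self.max_timeouts = max_timeouts    # the "x" in the list above
        self.ban_seconds = ban_seconds      # the "some long time"
        self.clock = clock
        self.consecutive = {}               # host -> timeouts in a row
        self.banned_until = {}              # host -> time the refusal ends

    def allowed(self, host):
        """True if connections from this host should be accepted."""
        until = self.banned_until.get(host)
        if until is None:
            return True
        if self.clock() >= until:           # refusal period is over
            del self.banned_until[host]
            self.consecutive.pop(host, None)
            return True
        return False

    def record_timeout(self, host):
        count = self.consecutive.get(host, 0) + 1
        self.consecutive[host] = count
        if count > self.max_timeouts:       # more than x times in a row
            self.banned_until[host] = self.clock() + self.ban_seconds

    def record_success(self, host):
        self.consecutive.pop(host, None)    # a success breaks the streak


def read_response(conn, timeout_seconds):
    """Return the peer's reply, or None if nothing arrives in time."""
    conn.settimeout(timeout_seconds)
    try:
        return conn.recv(4096)
    except socket.timeout:
        return None
```

The master would check allowed(host) before forking a child, and call
record_timeout or record_success once it learns how the dialog ended.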
Since in theory there's already a lond/lonc connection that's
legitimate, this fix does not allow a node-node DOS attack to
succeed... esp. if the rewritten lonc code only times out and closes
>additional< connections, and keeps at least one connection always
alive... trade-off here against scalability... hmph.
-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.3 for non-commercial use <http://www.pgp.com>
-----END PGP SIGNATURE-----