[LON-CAPA-admin] Not Open To Be Viewed Troubles

Bynum, Lee Hamilton leebynum at illinois.edu
Tue Nov 6 13:06:37 EST 2018


Hi Stuart,

This is some very helpful information, especially regarding where firstaccess time issues pop up.  It is going to make diagnosing any future issues much easier.  Fortunately, they are very rare.

> When you say you removed the local files as a precaution, do you mean you
> removed the replicated resource files and/or replicated metadata files for
> the resource?

When we see these issues we do remove those files.  I consider this to be something of an overkill, though.  As you pointed out, we have not encountered the "Unable to find <filename>" error in any of the "Not Open To Be Viewed Troubles" reports.  We started to do this because of the possibility of the metadata being behind the issue.  My current leading theory doesn't support metadata as the problem, but it is still a precaution being taken.

> But perhaps by "local files" you mean something else, such as a user's course
> session files in /home/httpd/perl/tmp, removal of which would force a user
> with an expired session to re-select the course.

We remove these as well.  As above, this is to cover all possible bases.

Clearing out the local files (replicated resource files, replicated metadata files, and session files) along with a server restart has reliably stopped students' resources from becoming not open to be viewed.  I believe the server restart is what is actually doing the trick as that includes restarting Memcached.

My leading theory is that something is being corrupted in Memcached after Memcached is allowed to run too long.  Your previous emails have been extremely helpful in diagnosing Memcached-related issues.  The trouble is collecting data when the isolated issues pop up without cluttering our storage with overly large logs.  Each time this pops up I am able to better define the monitoring done at five-minute intervals, but I'm not quite at a solution yet.
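
One way to keep five-minute monitoring from producing large logs is to record only what changed between two snapshots of Memcached's "stats" output.  The following is a minimal sketch in Python, not LON-CAPA code.  It assumes the raw text reply of the memcached "stats" command has already been captured by some other means; the function names and the choice of watched counters (evictions, get_misses, curr_items) are assumptions made for illustration, and other counters may turn out to be more relevant to this issue.

```python
# Hypothetical helper: compare two captured replies of memcached's "stats"
# command and keep only the counters that changed between them.

# Counters to watch; this selection is an assumption for illustration.
WATCHED = ("evictions", "get_misses", "curr_items")


def parse_stats(raw):
    """Parse 'STAT <name> <value>' lines into a dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        # Skip the terminating 'END' line and anything malformed.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def counter_deltas(previous, current, names=WATCHED):
    """Return {name: change} for watched counters whose value changed."""
    deltas = {}
    for name in names:
        if name in previous and name in current:
            delta = int(current[name]) - int(previous[name])
            if delta != 0:
                deltas[name] = delta
    return deltas


def restarted(previous, current):
    """Uptime going backwards suggests memcached restarted in between."""
    return int(current.get("uptime", 0)) < int(previous.get("uptime", 0))
```

Writing a log line only when counter_deltas() returns something, or when restarted() is true, would keep quiet intervals out of the log entirely.
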

Thanks Again,

Lee

> -----Original Message-----
> From: LON-CAPA-admin <lon-capa-admin-bounces at mail.lon-capa.org> On
> Behalf Of Raeburn, Stuart
> Sent: Friday, October 19, 2018 12:02 PM
> To: list about administration and system updating
> <lon-capa-admin at mail.lon-capa.org>
> Subject: Re: [LON-CAPA-admin] Not Open To Be Viewed Troubles
> 
> Lee,
> 
> The two additional types of behavior you report relate to resources for which
> a timed (interval) parameter applies.  In such a case, determining whether
> the resource will be available to a particular student for making submissions
> depends on parameters set for the course (or section), i.e., data from
> resourcedata.db for the course (cached on an access server for 10 minutes),
> and also data stored for the specific user, i.e., data from firstaccesstimes.db
> for the user (on the library server).
> 
> When a student accesses a resource for which a timed (interval) parameter
> applies, the scope of the interval in effect can be resource, map or course. If
> no firstaccess time is retrieved for a student for the appropriate identifier for
> that scope, then the status returned by lonhomework::check_access() will
> be "NOT_YET_VIEWED", assuming other parameters -- opendate, duedate,
> answerdate or acc -- or slot control don't prevent availability.
> 
> If status is "NOT_YET_VIEWED" then the "Show Resource" button will be
> available for the student to start the timer.
> If the button is pushed, but the firstaccess time is not recorded on the user's
> home server (i.e., your library server), then the page sent back to the
> browser will again contain the "Show Resource" button.  Repeated pushes of
> the button (with no recording of the data) will result in repeated markaccess
> entries in the student's activity log.
> 
> Note: if the result of the "put" for storage of the firstaccess time is "refused",
> then an item will be logged in /home/httpd/perl/logs/lonnet.log. If the
> failure is for some other reason -- e.g., lond::put_user_profile_entry() could
> not tie the student's firstaccesstimes hash or there was a con_lost response --
> then nothing will be logged in lonnet.log.
> 
> As regards the report from a student of receiving additional time midway
> through an exam, one legitimate way that could happen is if a Course
> Coordinator used: Settings > Content Settings >  "Reset Student Access
> Times" to reset that particular student's access time during the exam.
> 
> Another way a student could receive additional time would be if the call to
> &EXT('resource.0.interval') when accessing a resource in the exam folder did
> not retrieve the interval setting (e.g., because of a caching issue or a problem
> with data retrieval from the library server when populating the cache). Note:
> in this case this would be expected to impact other students as well.  In the
> absence of an interval parameter, the time available will be determined from
> the due date in effect.
> 
> A further way a student could receive additional time (just for
> himself/herself) would be if the previously set firstaccess time (set by the
> student, and which applied to the exam folder) was not retrieved when
> accessing a resource in the folder (and the scope of the time limit was the
> folder). In that instance, the "Show Resource" button would be displayed,
> and if the button were pushed, a new (later) firstaccess time would be set, if
> the previously set firstaccess time could not be retrieved a second time (this
> time on a different request), when processing the markaccess submission.
> The
> /home/httpd/lonUsers/uiuc/<1>/<2>/<3>/<username>/firstaccesstimes.hist
> file for the student (where <1>, <2>, and <3> are the first three characters
> of the username) would contain a record of all stored firstaccess times.
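
The path layout described above can be sketched as a small helper for locating a student's history file when investigating an incident.  This is an illustrative Python sketch, not LON-CAPA code; the function name and defaults are hypothetical, and the handling of usernames shorter than three characters is not covered by the description above, so the sketch simply refuses them.

```python
import posixpath


def firstaccess_history_path(username, domain="uiuc",
                             users_root="/home/httpd/lonUsers"):
    """Build the firstaccesstimes.hist path for a user on the library server.

    The three directory levels under the domain are the first three
    characters of the username, as described in the message above.
    """
    if len(username) < 3:
        # Assumption: short usernames are out of scope for this sketch.
        raise ValueError("username must have at least three characters")
    return posixpath.join(users_root, domain,
                          username[0], username[1], username[2],
                          username, "firstaccesstimes.hist")
```

For a user literally named "username" in the uiuc domain this yields /home/httpd/lonUsers/uiuc/u/s/e/username/firstaccesstimes.hist.
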
> 
> One way for LON-CAPA to be more robust against intermittent problems
> with retrieval of firstaccesstimes from a library server, would be to make use
> of $env{'course.'.$courseid.'.firstaccess.'.$res}, which is stored in the user's
> session file on the access server. Any existing firstaccess times for a user are
> added to %env as part of the user session initialization following log-in. New
> items of this type will be added to %env whenever a firstaccess time is
> successfully sent to the library server after the "Show Resource" button had
> been pushed.
> 
> Note: if use were made of these firstaccess items in %env in place of (or in
> addition to) the current call to lonnet::dump() in
> lonnet::load_all_first_access(), then once a Course Coordinator used "Reset
> Student Access Times", a student affected by that would need to log out of
> an active session, and log in again to see the change. (The alternative would
> be to identify all nodes with active session files for an affected user, and
> delete any 'course.'.$courseid.'.firstaccess.'.$res items from each session
> file, when executing the reset for the particular $res context).
> 
> When you say you removed the local files as a precaution, do you mean you
> removed the replicated resource files and/or replicated metadata files for
> the resource?  The message which would be displayed to a user if a resource
> file was missing from the access server (and could not be replicated) would
> be: Unable to find <filename>.  I have not seen that error reported in any of
> your posts to this "Not Open To Be Viewed Troubles" thread over the past 18
> months when you have seen these occasional problems with unavailability of
> resources.  The premise has been that the issue you have seen has been
> with parameters controlling availability of the resource to the student, and
> not with access to the content of the file itself on the filesystem. Accordingly,
> I would not expect that removal of copies of the resource(s) from an access
> server (which will force replication of the content) would have any bearing
> on a "Not Open To Be Viewed" problem.  One case where
> removing files might influence this would be if the resource's metadata file
> itself contained the parameters controlling availability (and you'd removed
> the .metadata file).  It would be unusual to include a parameter such as
> interval in a resource itself, but there might be use cases where it makes
> sense to do that (e.g., see:
> mail.lon-capa.org/pipermail/lon-capa-users/2018-September/005351.html
> prior to the availability of recursive parameters).
> 
> But perhaps by "local files" you mean something else, such as a user's course
> session files in /home/httpd/perl/tmp, removal of which would force a user
> with an expired session to re-select the course.
> 
> I have just committed lonnet.pm rev. 1.1386 which includes changes so
> failure to store a firstaccess time will be logged in more cases.  See:
> mail.lon-capa.org/pipermail/lon-capa-cvs/Week-of-Mon-20181015/028316.html
> I'll also make a change to structuretags.pm so a message will be displayed
> on screen after the "Show Resource" button was pushed (and submitted), if
> the firstaccess time could not be stored.
> 
> Stuart Raeburn
> LON-CAPA Academic Consortium
> 
> ________________________________________
> From: LON-CAPA-admin <lon-capa-admin-bounces at mail.lon-capa.org> on
> behalf of Bynum, Lee Hamilton <leebynum at illinois.edu>
> Sent: Tuesday, October 16, 2018 3:13:47 PM
> To: list about administration and system updating
> Subject: Re: [LON-CAPA-admin] Not Open To Be Viewed Troubles
> 
> I have another batch of these unavailable resources.  While I dig into the new
> data points, I figure I should give an update.
> 
> It is our suspicion that removing the local files from the access server helps
> prevent this from occurring.  We had an incident on Friday on a single access
> server.  We removed the local files on that server as a precaution.  Over the
> weekend, more incidents occurred, but not on the access server we cleared
> the files from.
> 
> During the same period we had two additional behaviors that seem related
> to access server files not syncing with the library properly.  One student
> reported being given additional time on an exam midway through.  This
> behavior is less likely to be reported, so it is unclear if there are more
> incidents.  There is also a behavior in which a student repeatedly needed to
> log in to answer the same questions over again.  This is reflected in the log as
> an answer submitted but not evaluated as the access was removed.  The odd
> part of that is that it results in multiple markaccess entries.
> 
> With this newest batch we were able to demonstrate this behavior isolated
> to individual students despite other students using the same resources at
> the same time on the same server.
> 
> It is also worth noting that this is the first batch of incidents this semester.  It
> is unclear what might have changed on our servers to prompt it, but that is
> definitely something we are looking into.
> 
> In short, suddenly unavailable resources continue to pop up and we are still
> investigating.
> 
> Lee
> 
> > -----Original Message-----
> > From: lon-capa-admin-bounces at mail.lon-capa.org
> > <lon-capa-admin-bounces at mail.lon-capa.org> On Behalf Of Bynum, Lee Hamilton
> > Sent: Wednesday, February 21, 2018 9:50 AM
> > To: list about administration and system updating
> > <lon-capa-admin at mail.lon-capa.org>
> > Subject: Re: [LON-CAPA-admin] Not Open To Be Viewed Troubles
> >
> > Stuart,
> >
> > This has been very helpful for digging into what's going on.  In case
> > I'm barking up the wrong tree, though, I'd like to sanity check some
> > potential state values.
> > I'm seeing values that are either single digit integers or a series of
> > integers, which I suspect come from the sequence based classes.  For
> > example, "2" or "2002202".  Are those the sort of values to be expected in
> > user.state?
> >
> > Lee
> > ________________________________________
> > From: lon-capa-admin-bounces at mail.lon-capa.org
> > [lon-capa-admin-bounces at mail.lon-capa.org] on behalf of Stuart Raeburn
> > [raeburn at msu.edu]
> > Sent: Wednesday, October 04, 2017 10:28 PM
> > To: lon-capa-admin at mail.lon-capa.org
> > Subject: Re: [LON-CAPA-admin] Not Open To Be Viewed Troubles
> >
> > Lee,
> >
> > Reviewing my previous answer, I find that I provided an answer to a
> > different question than the one you had actually asked.
> >
> > Accordingly, I am going to revise my answer from: "it is in memory on
> > the access server", to: "it is in memory (in %env) and also in the
> > user's session file, on the filesystem on the access server".
> >
> > As you noted the username_courseid.state file in /home/httpd/perl/tmp
> > contains the conditional statement:
> >
> > &EXT('user.resource.resource.200.1.awarddetail','BuffersLab') eq 'APPROX_ANS'
> >
> > The user's session file (a GDBM file):
> >
> > /home/httpd/lonIDs/username_sessionnum_domain_authhost.id
> >
> > will contain a key of: user.state.uiuc_3g9196182397259fcuiuclibrary1,
> > with the corresponding value containing the results of evaluated
> > conditions.
> >
> > The routine: &Apache::lonuserstate::evalstate() is used to determine
> > (and save) the value for:
> > user.state.uiuc_3g9196182397259fcuiuclibrary1, which is appended to
> > %env, and also to the session file.
> >
> > The evalstate() routine will evaluate the conditions in perl Safe
> > space, which will include calling lonnet::EXT(), which in turn calls
> > lonnet::restore() to retrieve the user's submission history for the
> > resource to which the alias: 'BuffersLab' points.
> >
> > The routine: &Apache::lonnet::devalidate() will remove
> > user.state.uiuc_3g9196182397259fcuiuclibrary1 from %env (and from the
> > user's session file).
> >
> > The devalidate() routine is called whenever lonnet::store() or
> > lonnet::cstore() is called in course context to store the user's
> > submissions, points etc.
> >
> > Note: the question I answered with my previous response to this thread
> > was where can you find values for items such as:
> > &EXT('user.resource.resource.200.1.awarddetail','BuffersLab') when a
> > problem is being rendered, and in that case the answer is in memory.
> >
> > Stuart Raeburn
> > LON-CAPA Academic Consortium
> >
> > >>
> > >> Does this live entirely in memory or am I missing something?
> > >>
> > >
> > > It is in memory on the access server.
> > >
> > > When the problem is rendered, the &initialize_storage() routine is
> > > called in structuretags.pm within start_problem() which will populate
> > > the (global) %history hash in Apache::lonhomework with submission and
> > > award history etc. for the current resource (in the current course
> > > context and instance for the current user) by calling
> > > lonnet::restore().
> > >
> > > See lines 1024-1025 in the version of structuretags.pm included in
> > > loncapa-2.11.2.
> > >
> > > Later during rendering, the global %history hash is cleared with
> > > undef (within the end_problem routine).
> > >
> > >
> > > Stuart Raeburn
> > > LON-CAPA Academic Consortium
> > >
> > >> I'm tracing the path of where a condition is stored against the files
> > >> on our servers.
> > >>
> > >> Condition to be met:
> > >>
> > >> &EXT('user.resource.resource.200.1.awarddetail','BuffersLab') eq 'APPROX_ANS'
> > >>
> > >>
> > >> Library1:/home/httpd/lonUsers/uiuc/u/s/e/username/uiuc_3g9196182397259fcuiuclibrary1.db
> > >>
> > >> contains:
> > >>
> > >> 2:uiuc/dmills/CHEM105/Buffers/Buffers.sequence___5___uiuc/dmills/CHEM105/Buffers/Lab.problem:resource.200.1.awarddetail = APPROX_ANS
> > >>
> > >> I'm having a difficult time locating that condition on the access
> > >> servers, though.  I've found the conditional statement in the .state
> > >> file, but I'm not seeing it in any of the user files.  Does this live
> > >> entirely in memory or am I missing something?
> > >>
> > >> Thanks,
> > >>
> > >> Lee
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: lon-capa-admin-bounces at mail.lon-capa.org
> > >> [mailto:lon-capa-admin-bounces at mail.lon-capa.org] On Behalf Of
> > >> Stuart Raeburn
> > >> Sent: Tuesday, October 3, 2017 7:14 PM
> > >> To: lon-capa-admin at mail.lon-capa.org
> > >> Subject: Re: [LON-CAPA-admin] Not Open To Be Viewed Troubles
> > >>
> > >> Hello Lee,
> > >>
> > >> There is no log file of access server parameter file changes, but you
> > >> could make your own (see below).
> > >>
> > >> Looking back through the archives, it seems you ran into this issue in
> > >> March.
> > >>
> > >> I posted a response to the list to that earlier post. See:
> > >>
> > >> mail.lon-capa.org/pipermail/lon-capa-admin/2017-March/003273.html
> > >>
> > >> which discussed how to view cached parameters in memcache.
> > >>
> > >> Assuming the open date parameter in effect is for all students, or
> > >> for a specific section, and not for individual student(s), then
> > >> parameters will be retrieved using lonnet::get_courseresdata().  That
> > >> will, in turn, call &lonnet::dump(), when the cache on the access
> > >> server has expired (it's valid for 10 minutes).
> > >>
> > >> You could modify lonnet::get_courseresdata() to write the hash
> > >> returned by dump() for a particular course to a file (e.g.,
> > >> /home/httpd/perl/tmp/debug/$coursenum on the access server).
> > >>
> > >> Your code could check if the file already exists, and if it does, move
> > >> the existing file to /home/httpd/perl/tmp/debug/$coursenum.old,
> > >> before writing the latest data to a new file.
> > >>
> > >> You could then create a perl script, which would be run by cron every
> > >> 5 minutes to look in /home/httpd/perl/tmp/debug/.  The script would
> > >> compare the contents of $coursenum.old and $coursenum (if both exist)
> > >> and record what had changed for course: $coursenum for the current
> > >> timestamp, in a log file.  The script would then unlink $coursenum.old.
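
The comparison step of such a script could look roughly like the sketch below.  It is written in Python purely for illustration (the same logic ports directly to Perl) and is not LON-CAPA code.  It assumes, hypothetically, that each snapshot file holds one key=value pair per line; the on-disk format actually written from get_courseresdata() would be whatever the local modification chooses, and the function names are made up for this example.

```python
def load_snapshot(text):
    """Parse one 'key=value' pair per line (assumed format) into a dict."""
    data = {}
    for line in text.splitlines():
        if "=" in line:
            # Split on the first '=' only, so values may contain '='.
            key, value = line.split("=", 1)
            data[key] = value
    return data


def diff_snapshots(old, new):
    """List (kind, key, old_value, new_value) for every difference."""
    changes = []
    for key in sorted(set(old) | set(new)):
        if key not in new:
            changes.append(("removed", key, old[key], None))
        elif key not in old:
            changes.append(("added", key, None, new[key]))
        elif old[key] != new[key]:
            changes.append(("changed", key, old[key], new[key]))
    return changes
```

A cron-driven wrapper would read $coursenum.old and $coursenum, pass their contents through these two functions, and append any resulting changes to the log along with the current timestamp.
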
> > >>
> > >>
> > >> Stuart Raeburn
> > >> LON-CAPA Academic Consortium
> > >>
> > >>
> > >>> Hello Admins,
> > >>>
> > >>> I am diving back into an issue we've had in which resources are
> > >>> temporarily "Not open to be viewed."  The resources should be
> > >>> available now and do become available after 10-20 minutes or a change
> > >>> of access server.  The leading theory is that the parameter file is
> > >>> being read incorrectly, resulting in the access server thinking the
> > >>> resource is unavailable until the bad parameter is updated.
> > >>>
> > >>> Is there a log anywhere of access server parameter file changes?
> > >>> Because things seem to go back to normal after a while I have yet to
> > >>> have the problem reported to me in time to look at the potentially
> > >>> problematic files in real time.
> > >>>
> > >>> Sorry for rehashing old troubles, but I'm hoping this time will do
> > >>> the trick.
> > >>>
> > >>> Lee
> > >>
> > >> _______________________________________________
> > >> LON-CAPA-admin mailing list
> > >> LON-CAPA-admin at mail.lon-capa.org
> _______________________________________________
> LON-CAPA-admin mailing list
> LON-CAPA-admin at mail.lon-capa.org
> http://mail.lon-capa.org/mailman/listinfo/lon-capa-admin

