[LON-CAPA-admin] lonmemcached_errors
Abdel Messeh, Maged
mmesseh at illinois.edu
Wed Feb 11 12:58:39 EST 2015
Hi Stuart,
I used your script and managed to dump a lot of information from memcached, but I am not sure what to look for.
I also used memcached-tool to monitor some connection statistics, because I am wondering whether the system is imposing an upper limit on the number of open sockets. My ulimit is set to 1024.
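The value printed by ulimit in an interactive shell is not necessarily the limit that applies to the running daemon, and memcached also has its own connection cap (the -c option, which defaults to 1024). A minimal sketch for checking the per-process limit, assuming a Linux /proc filesystem and that pgrep is installed; the function name max_open_files is only illustrative:

```shell
#!/bin/sh
# Print the soft "Max open files" limit for a given PID (Linux only).
# Columns in /proc/<pid>/limits: name (3 words), soft, hard, units.
max_open_files() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Limit of the current shell, for comparison with `ulimit -n`.
max_open_files "$$"

# Limit of the running memcached process, if one is found.
# pgrep -o selects the oldest matching process.
if pid=$(pgrep -o -x memcached 2>/dev/null); then
    max_open_files "$pid"
fi
```

If the two numbers differ, the second one is the limit that actually applies to memcached.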
I am currently collecting:
Memcached Current Connections
Memcached Evictions
Memcached Get_misses
Memcached Get_hits
Memcached Total Connections
So I can see how they behave when/if the problem happens again.
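As a rough sketch of how those counters could be pulled out of the statistics output and turned into a hit ratio: the filter below reads "name value" lines (the layout memcached-tool prints) or raw "STAT name value" protocol lines on stdin. The function name summarize_stats and the hit_ratio label are my own, not part of memcached-tool.

```shell
#!/bin/sh
# Print the monitored memcached counters and the get hit ratio.
# Input: statistics lines on stdin, either "name value" or
# "STAT name value" (optionally ending in a carriage return).
summarize_stats() {
    awk '
    {
        sub(/\r$/, "")    # drop a trailing carriage return, if any
        if ($1 == "STAT") { name = $2; value = $3 }
        else              { name = $1; value = $2 }
        if (name ~ /^(curr_connections|total_connections|evictions|get_hits|get_misses)$/) {
            stat[name] = value
            printf "%s=%s\n", name, value
        }
    }
    END {
        total = stat["get_hits"] + stat["get_misses"]
        if (total > 0)
            printf "hit_ratio=%.3f\n", stat["get_hits"] / total
    }'
}

# Example use against a live instance (not run here):
#   memcached-tool 127.0.0.1:11211 stats | summarize_stats
```

Run from cron with a timestamp prepended, this would give a log to compare against the times the errors appear.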
Thanks much,
Maged
-----Original Message-----
From: lon-capa-admin-bounces at mail.lon-capa.org [mailto:lon-capa-admin-bounces at mail.lon-capa.org] On Behalf Of Stuart Raeburn
Sent: Monday, February 9, 2015 8:38 AM
To: lon-capa-admin at mail.lon-capa.org
Subject: Re: [LON-CAPA-admin] lonmemcached_errors
Hi Maged,
> Failed to write, and not due to blocking: Broken pipe
I have not encountered that error myself.
Looking at the source code for memcached from:
https://github.com/memcached/memcached
it appears that the error you saw originates in the function:
enum transmit_result transmit()
in memcached.c, and occurs when sendmsg(), used to send a message on a
socket, returns 0 or -1 for a reason other than the socket blocking
(EAGAIN/EWOULDBLOCK), i.e., no characters were successfully sent.
In this instance the result is "TRANSMIT_HARD_ERROR" and the
connection state is set to "conn_closing".
From the command line:
ps aux |grep memcached |grep -v grep
will display information about the memcached process.
netstat |grep localhost:memcache
will show information about connections to memcache.
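To get a quick summary rather than a long list, the connections can be counted by TCP state. A small sketch, assuming the numeric output layout of netstat -tan (local address in column 4, foreign address in column 5, state in column 6) and the default port 11211; the function name count_states is only illustrative:

```shell
#!/bin/sh
# Count TCP connections involving port 11211, grouped by state.
# Input: output of `netstat -tan` on stdin.
count_states() {
    awk -v port=":11211" '
        $1 ~ /^tcp/ && ($4 ~ port "$" || $5 ~ port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Example use (not run here):
#   netstat -tan | count_states
```

Because both ends of a localhost connection appear in netstat, each client connection to memcached is counted twice in this summary.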
If memcached is not running (but other LON-CAPA daemons are running)
it can be started using /etc/init.d/loncontrol start
The global memcache configuration is in:
/etc/sysconfig/memcached
The command-line script memcached-tool can be used to display
statistics from a running memcached instance. See: man memcached-tool.
e.g.,
memcached-tool 127.0.0.1:11211 display
memcached-tool 127.0.0.1:11211 stats
There is also: memcached-tool 127.0.0.1:11211 dump
which makes a partial dump of the cache. However, a custom script is
probably a better way to dump the current contents of memcached (see
below).
LON-CAPA itself uses the Cache::Memcached Perl module, see:
http://search.cpan.org/~dormando/Cache-Memcached-1.30/lib/Cache/Memcached.pm
According to the Cache::Memcached documentation, connect_timeout
defaults to 0.25 seconds and select_timeout defaults to 1 second.
The constructor used by LON-CAPA in /home/httpd/lib/perl/Apache/lonnet.pm is:
$memcache=new Cache::Memcached({'servers' => ['127.0.0.1:11211'],
'compress_threshold'=> 20_000,
});
Ubuntu 12.04 LTS includes libcache-memcached-perl 1.29-1, and as you
noted, version 1.4.13-0ubuntu2.1 of memcached itself.
The Perl script below uses Cache::Memcached and Data::Dumper to dump
memcached's currently stored keys and values. However, if your
server/VM is experiencing load problems, running this script
will likely exacerbate them.
I recommend sending the output from this script to a file, e.g.,
perl memcached_dump.pl > memcache_snapshot.txt
******
#!/usr/bin/perl
# memcached_dump.pl
# February 9, 2015
#
# Dump the keys and values currently stored in memcached.

use strict;
use warnings;

use Cache::Memcached;
use Data::Dumper;

my $instance = "127.0.0.1:11211";
my $memd = Cache::Memcached->new({
    'servers' => [ $instance ],
    'debug'   => 0,
});

# Collect the slab class ids from "stats items"
# (lines of the form: STAT items:<id>:<stat> <value>).
my %containers;
my $stats = $memd->stats('items');
my $items = $stats->{hosts}->{$instance}->{items};
$items = '' unless (defined($items));
foreach my $line (split(/\r\n/,$items)) {
    next unless ($line =~ /^.*:(.*):.*$/);
    $containers{$1} = 1;
}

# For each slab class, list its items and fetch each value.
foreach my $container (sort(keys(%containers))) {
    my $result = $memd->stats("cachedump $container 0");
    my $contents = $result->{hosts}->{$instance}->{"cachedump $container 0"};
    next unless (defined($contents));
    foreach my $item (split(/\r\n/,$contents)) {
        # Lines have the form: ITEM <key> [<size> b; <expiry> s]
        my ($name,$size) = ($item =~ /^ITEM\s+(\S+)\s+\[([^;]+)/);
        next unless (defined($name));
        my $val = $memd->get($name);
        print "$name $size ".Dumper($val)."\n";
    }
}
$memd->disconnect_all;
*****
Stuart Raeburn
LON-CAPA Academic Consortium
Quoting "Abdel Messeh, Maged" <mmesseh at illinois.edu>:
> Hi All,
>
> Recently I have seen an error in lonmemcached_errors; the same
> message repeats itself:
>
> Failed to write, and not due to blocking: Broken pipe
>
> At the same time I see very high disk I/O and an increased number of
> apache processes. This happened on 2 of my access nodes.
> It was also associated with many instances of Code Ran Too Long.
>
> I am currently running Ubuntu 12 with memcached 1.4.13. My access
> nodes have 6G of memory; during this time the free amount was around
> 0.5G-1G.
>
> Any thoughts on why I am seeing this? And what I can do about it?
>
> Many thanks,
>
> Maged
> _______________________________________________
> LON-CAPA-admin mailing list
> LON-CAPA-admin at mail.lon-capa.org
> http://mail.lon-capa.org/mailman/listinfo/lon-capa-admin