[LON-CAPA-dev] Occasional strangeness

Martin Siegert lon-capa-dev@mail.lon-capa.org
Fri, 4 Oct 2002 11:55:25 -0700


Hello,

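This sounds like the system is running out of shared memory segments
and/or semaphores. If a program crashes, it can leave behind the System V
shared memory segments and semaphores that it used, and they stay
allocated until they are removed explicitly or the machine is rebooted.
The following two scripts will get rid of them. Note that they try to
remove every entry that ipcs lists, not only the orphaned ones, so it is
best to stop the affected daemons before running them: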

===<delete_semaphores.pl>=====================================================
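#!/usr/bin/perl
#
# delete_semaphores.pl: delete leftover semaphores from crashed MPI programs
#
use strict;
use warnings;

my $IPCRM = "/usr/bin/ipcrm";
my $IPCS  = "/usr/bin/ipcs";

for my $entry (`$IPCS -s`) {
  # data lines look like: key semid owner perms nsems
  my @line = split(' ', $entry);
  # skip blank lines and headers: only data lines have a numeric semid
  next unless @line >= 2 && $line[1] =~ /^\d+$/;
  my $semid = $line[1];
# print "$semid\n";
  system("$IPCRM sem $semid");
}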
==============================================================================

===<delete_shmemsegs.pl>======================================================
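#!/usr/bin/perl
#
# delete_shmemsegs.pl: delete leftover shared memory segments from crashed
# MPI programs
#
use strict;
use warnings;

my $IPCRM = "/usr/bin/ipcrm";
my $IPCS  = "/usr/bin/ipcs";

for my $entry (`$IPCS -m`) {
  # data lines look like: key shmid owner perms bytes nattch status
  my @line = split(' ', $entry);
  # skip blank lines and headers: only data lines have a numeric shmid
  next unless @line >= 2 && $line[1] =~ /^\d+$/;
  my $shmid = $line[1];
# print "$shmid\n";
  # on Linux a segment that is still attached is only marked for removal
  # and disappears once the last process detaches from it
  system("$IPCRM shm $shmid");
}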
==============================================================================
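
If you would rather not touch IPC objects that belong to other users, the
same idea can be restricted to a single owner. What follows is only a
sketch and not part of the two scripts above: it assumes the Linux ipcs
layout (key, id and owner in the first three columns), and the sub name
ids_owned_by, the script name delete_owned.pl and the --delete switch are
made up for illustration. Without --delete it only prints what it would
remove.

```perl
#!/usr/bin/perl
# Sketch: remove only the semaphores or shared memory segments of one owner.
# Assumes Linux "ipcs" output: key, id and owner are the first three columns.
use strict;
use warnings;

my $IPCRM = "/usr/bin/ipcrm";
my $IPCS  = "/usr/bin/ipcs";
my %FLAG  = (sem => '-s', shm => '-m');   # ipcs option for each kind

# Return the ids of all entries in @lines that belong to $owner.
# Headers and blank lines are skipped: they have no hexadecimal key
# in the first column and no numeric id in the second.
sub ids_owned_by {
  my ($owner, @lines) = @_;
  my @ids;
  for my $entry (@lines) {
    my @line = split(' ', $entry);
    next unless @line >= 3;
    next unless $line[0] =~ /^0x[0-9a-fA-F]+$/ && $line[1] =~ /^\d+$/;
    push @ids, $line[1] if $line[2] eq $owner;
  }
  return @ids;
}

# Hypothetical usage: delete_owned.pl sem|shm OWNER [--delete]
if (@ARGV >= 2 && exists $FLAG{$ARGV[0]}) {
  my ($kind, $owner, $mode) = @ARGV;
  for my $id (ids_owned_by($owner, `$IPCS $FLAG{$kind}`)) {
    if (defined $mode && $mode eq '--delete') {
      system($IPCRM, $kind, $id);          # same form as "ipcrm sem ID" above
    } else {
      print "would run: $IPCRM $kind $id\n";
    }
  }
}
```

For example, "delete_owned.pl sem apache" would list the semaphores owned
by a user called apache (the user name is just an example; use whatever
account your httpd runs as), and adding --delete would actually remove them.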

I hope this helps.

Cheers,
Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert@sfu.ca
Canada  V5A 1S6
========================================================================

On Fri, Oct 04, 2002 at 02:44:16PM -0400, Matthew Brian Hall wrote:
> Hello all.
> 
> Both Jeremey & Alex have had trouble starting their http daemons over the last
> few days.  I saw the error on Alex's system - something about binding a
> semaphore or memory management issues.  At any rate, if anyone sees anything
> like this again we should track it down.
> 
> Of course, the exact error message would be a starting point :)
> 
> Matthew
> 
> --
> ------------------------------------------------------------------
> Matthew Hall           LON-CAPA developer         hallmat3@msu.edu
> 123 North Kedzie Hall                    Michigan State University
> ------------------------------------------------------------------