[Condor-users] Leftovers of checkpointed jobs accumulate in SPOOL
- Date: Wed, 7 Mar 2012 08:57:38 +0100
- From: Michael Hanke <michael.hanke@xxxxxxxxx>
- Subject: [Condor-users] Leftovers of checkpointed jobs accumulate in SPOOL
Hi,
I'm testing DMTCP-based checkpointing of vanilla jobs in our Condor pool
(all running version 7.7.5). I noticed that the files of jobs that were
evicted remain in SPOOL even after the jobs have been restarted on an exec
node. Checkpoint files, executable, restart script and various other files
remain -- I assume that nothing at all gets cleaned up.
Eventually condor_preen removes most of it, e.g.
/var/spool/condor/2030/0/cluster2030.proc0.subproc0 - Removed
/var/spool/condor/2027/0/cluster2027.proc0.subproc0 - Removed
However, even after the preen run
/var/spool/condor/2030/0
/var/spool/condor/2027/0
remain as empty directories.
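For anyone who wants to check their own SPOOL for the same leftovers, here
is a minimal sketch in Python that lists empty directories below a spool
root. The function name find_empty_dirs is made up for illustration, and the
default path /var/spool/condor is simply the one from the preen output above;
adjust it to whatever SPOOL points to on your machine. It only reports
directories, it does not delete anything.

```python
import os
import sys


def find_empty_dirs(spool_root):
    """Return a sorted list of directories below spool_root with no entries.

    Hypothetical helper for illustration; it is not part of Condor.
    The spool root itself is never reported.
    """
    empty = []
    for dirpath, dirnames, filenames in os.walk(spool_root):
        # A directory is a leftover candidate if it holds neither
        # files nor subdirectories.
        if dirpath != spool_root and not dirnames and not filenames:
            empty.append(dirpath)
    return sorted(empty)


if __name__ == "__main__":
    # Assumed default; pass the real SPOOL path as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/spool/condor"
    for path in find_empty_dirs(root):
        print(path)
```

Note that a parent such as /var/spool/condor/2030 is not reported as long as
it still contains the empty 0 subdirectory; only the innermost empty
directories show up.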
Could this be a configuration issue? Does DMTCP-based checkpointing need
additional setup? I'm using the latest available Condor-DMTCP
integration.
Thanks in advance,
Michael
--
Michael Hanke
http://mih.voxindeserto.de