Re: [Condor-users] properly removing/stopping a dag and all its nodes?
- Date: Tue, 5 Jul 2011 12:53:04 -0400
- From: "Rowe, Thomas" <rowet@xxxxxxxxxx>
- Subject: Re: [Condor-users] properly removing/stopping a dag and all its nodes?
> I had assumed that issuing "condor_rm 267", where 267 is the cluster
> of a condor_dagman.exe job, would cleanly terminate all outstanding
> nodes of the DAG. Instead there are a bunch of jobs left according to
> condor_q, and I have to use -forcex to remove them. Also, condor_status
> indicates many "State: Claimed; Activity: Idle" slots. I have to run
> "condor_restart -all" to clean them up.

OK, setting "UWCS_CLAIM_WORKLIFE = 0" makes the cancelled nodes release
their slots right away. But I still get loads of nodes stuck in the 'X'
state, and the corresponding condor_shadow processes never exit. I have
to kill the condor_shadow processes manually.
What am I doing wrong?