Re: [HTCondor-users] How to hold/Release all dag jobs when hold/release dagman job?
- Date: Wed, 21 Aug 2013 11:45:17 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] How to hold/Release all dag jobs when hold/release dagman job?
On Wed, 21 Aug 2013, 钱晓明 wrote:
> I know all DAG jobs are removed when I condor_rm the DAGMan job, but
> that is not the case for hold/release.
> How can I make all jobs held/released according to the DAGMan job's status? I
> think I should add something to my job submit file.
It's not too hard (assuming you don't have nested DAGs). You do two
condor_hold commands -- one to hold the DAGMan job itself, and one to hold
the node jobs.
Here's an example:
manta(222)% condor_q
-- Submitter: wenger@xxxxxxxxxxxxxxxxx : <128.105.14.228:51653> : manta.cs.wisc.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
318.0 wenger 8/21 11:42 0+00:00:29 R 0 1.7 condor_dagman
320.0 wenger 8/21 11:43 0+00:00:03 R 10 0.0 job_dagman_node_pr
2 jobs; 0 completed, 0 removed, 0 idle, 2 running, 0 held, 0 suspended
manta(223)% condor_hold 318
All jobs in cluster 318 have been held
manta(224)% condor_hold -constraint "DAGManJobId==318"
All jobs matching constraint (DAGManJobId==318) have been held
manta(225)% condor_q
-- Submitter: wenger@xxxxxxxxxxxxxxxxx : <128.105.14.228:51653> : manta.cs.wisc.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
318.0 wenger 8/21 11:42 0+00:00:42 H 0 1.7 condor_dagman
320.0 wenger 8/21 11:43 0+00:00:30 H 10 0.0 job_dagman_node_pr
2 jobs; 0 completed, 0 removed, 0 idle, 0 running, 2 held, 0 suspended
manta(226)%
If you have sub-DAGs, you'll have to do the condor_hold with the
constraint for each sub-DAG.
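The two-command pattern above could be wrapped in a small shell function. This is only a rough sketch, not an existing HTCondor command: the name dag_hold_release and the DRY_RUN variable are made up for illustration. It assumes you already know the cluster ID of the top-level DAGMan job and of any sub-DAGs and pass them in yourself, and it assumes that using condor_release with the same constraint, in the same order, is the right way to undo the hold.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTCondor): hold or release one or more
# DAGMan jobs together with the node jobs each of them submitted.
# Usage: dag_hold_release hold|release DAGMAN_CLUSTER_ID [SUBDAG_CLUSTER_ID ...]
# Set DRY_RUN=1 to print the commands instead of running them.
dag_hold_release() {
    if [ "$#" -lt 2 ]; then
        echo "usage: dag_hold_release hold|release ID [ID ...]" >&2
        return 2
    fi
    action=$1
    shift
    case "$action" in
        hold)    cmd=condor_hold ;;
        release) cmd=condor_release ;;
        *)
            echo "usage: dag_hold_release hold|release ID [ID ...]" >&2
            return 2
            ;;
    esac
    for id in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: show what would be executed.
            echo "$cmd $id"
            echo "$cmd -constraint DAGManJobId==$id"
        else
            # First the DAGMan job itself, then its node jobs.
            "$cmd" "$id"
            "$cmd" -constraint "DAGManJobId==$id"
        fi
    done
}
```

For the example above, "dag_hold_release hold 318" would run the same two condor_hold commands, and "dag_hold_release release 318" would run the matching condor_release commands. It does not discover sub-DAGs on its own; their cluster IDs have to be listed as extra arguments.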
I'm thinking that we should create a command that does this automatically,
including handling sub-DAGs...
Kent Wenger
CHTC Team