Mailing List Archives
Re: [Condor-users] Marking child as DONE
Hi Nick,
Rather than setting 'executable = /bin/true', you could add
'hold = True' to the submit file. The child jobs will then be
submitted in the held state and will not run unless you explicitly
call condor_release on them.
Similarly, you could set 'noop_job = True' for the child jobs.
They will then not be executed at all and will simply be marked as
completed with an exit code of 0.
Scott
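
The 'noop_job = True' suggestion above can be applied to an existing submit file with a small helper. This is only a sketch: the function name add_noop and the file names are hypothetical, and it assumes the submit file has a plain 'queue' statement that the new line has to precede.

```sh
#!/bin/sh
# Hypothetical helper: copy a submit file, inserting 'noop_job = True'
# just before the first 'queue' statement.
add_noop() {
    in=$1
    out=$2
    awk '
        # Emit the extra command once, right before the first queue line.
        !inserted && tolower($1) == "queue" {
            print "noop_job = True"
            inserted = 1
        }
        { print }
    ' "$in" > "$out"
}
```

Calling add_noop child.sub child.noop.sub leaves the original file untouched, so the JOB lines of the rescue DAG can be pointed at the copy and switched back later.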
> Dear all,
>
> After a DAG has run partway through, I've decided that the bottom-most
> post-processing jobs (several thousand of them) should not (or cannot)
> be run. When my rescue DAG comes, as it inevitably does, I would like
> not to execute these. So far, no problem; a one-line bash/sed
> invocation takes care of that:
>
> sed 's/.*mysubfile.*/& DONE/' "$f" > "${f}.sires_done"
>
> The problem is that not all of the parents have completed
> successfully. I'd like to resubmit the parents, but not these
> children. When I naively mark them as DONE, as above, I get the
> following error while dagman parses the DAG.
>
> 3/13 20:25:13 ERROR: AddParent( ea0bca7d3503cccca43dff66a99c1516 ) failed for node a5bf08f49f3323fdd5f838f6d89918f7: STATUS_DONE child may not be given a new STATUS_READY parent
>
> Removing the JOB lines produces an error that the parent-child
> relationships refer to a non-existent job. (I don't have the exact
> message handy.)
>
> I see a few solutions, none of which I like:
> * resubmit without modification and let the children fail (wastes
> resources)
> * change the submit files to point to /bin/true and run in the local
> universe (a lot of scheduling overhead, I'd think, but maybe this is
> negligible)
> * identify all nodes of a class and remove all references to each of
> them (more code than I want to write at the moment)
>
> Can I get some gut reactions to these options or perhaps new, cleverer
> options?
>
> Thanks,
> Nick
>
> ===================================
> Nickolas Fotopoulos
> nvf@xxxxxxxxxxxxxxxxxxxx
>
> Office: (414) 229-6438
> Fax: (414) 229-5589
> University of Wisconsin - Milwaukee
> Physics Bldg, Rm 471
> ===================================
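
The third option in the quoted message (removing every reference to a class of nodes) may be less code than feared. The following is a rough sketch, not a tested DAGMan tool: prune_dag is a hypothetical name, it matches the submit file name exactly as written on the JOB line, and it assumes the removed nodes are leaves, since any children they had would silently lose that dependency.

```sh
#!/bin/sh
# Hypothetical helper: print a copy of a DAG file without the nodes
# whose JOB line uses the given submit file.
prune_dag() {
    dag=$1
    subfile=$2
    awk -v subfile="$subfile" '
        # Pass 1: remember every node whose JOB line names the submit file.
        NR == FNR {
            if (toupper($1) == "JOB" && $3 == subfile) drop[$2] = 1
            next
        }
        # Pass 2: rebuild PARENT ... CHILD ... lines without dropped nodes.
        toupper($1) == "PARENT" {
            parents = ""; children = ""; side = "p"
            for (i = 2; i <= NF; i++) {
                if (toupper($i) == "CHILD") { side = "c"; continue }
                if ($i in drop) continue
                if (side == "p") parents = parents " " $i
                else children = children " " $i
            }
            # Keep the line only if both sides still have a node.
            if (parents != "" && children != "")
                print "PARENT" parents " CHILD" children
            next
        }
        # Skip JOB, RETRY, VARS, ... lines (node name in field 2) and
        # SCRIPT PRE/POST lines (node name in field 3) for dropped nodes.
        ($2 in drop) { next }
        toupper($1) == "SCRIPT" && ($3 in drop) { next }
        { print }
    ' "$dag" "$dag"
}
```

For example, prune_dag my.dag.rescue mysubfile.sub > my.dag.pruned writes a DAG in which the remaining parents can be resubmitted without the post-processing children; both file names here are placeholders.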