Mailing List Archives
[Condor-users] Disabling the restarting of jobs
- Date: Thu, 28 Feb 2008 20:37:42 -0500
- From: "Robert E. Parrott" <parrott@xxxxxxxxxxxxxxxx>
- Subject: [Condor-users] Disabling the restarting of jobs
Hi Folks,
We have a certain set of users whose parallel code is fairly strict
in the way it names and manages files. They are looking for an option
to disable the restarting of jobs after the jobs have been damaged
(for example by an OOM condition on a node), because a restarted job
will otherwise overwrite the partial, and still usable, data files.
Is there a submit file expression, or a config expression with a
boolean to be added to the config file, that would put all jobs that
would otherwise restart into a "hold" state?
As a secondary question, would there be a way to set the
"HoldReason" ClassAd attribute to something relevant?
thanks,
rob
==========================
Robert E. Parrott, Ph.D. (Phys. '06)
Associate Director, Grid and
Supercomputing Platforms
Project Manager, CrimsonGrid Initiative
Harvard University Sch. of Eng. and App. Sci.
Maxwell-Dworkin 211,
33 Oxford St.
Cambridge, MA 02138
(617)-495-5045