Robert E. Parrott wrote:
A couple of quick one-offs on configs:
1) How does a user specify a max runtime on a job from their submit file?
What do you want to achieve: putting the job on hold if it runs for too
long? Or simply specifying the maximum amount of time the job should be
given to finish before being preempted by higher-priority jobs?
Here, I'd like to have users be able to specify the max total run
time for a parallel job before it's ended.
Is this different from putting the job on hold if it runs too long? I'm
not aware of any other option specific to the parallel job universe.
But I would be very interested in the answers to the other cases you
pose as well. I assume for the first you want to use a PERIODIC_HOLD
expression, but the second would be useful as well.
Yes, periodic_hold in the job submit file can be used to put a job on
hold if it runs too long. An alternative would be to have users insert
a custom attribute that specifies maximum runtime and then you would use
SYSTEM_PERIODIC_HOLD in the config file to put jobs on hold that run
longer than expected. Example:
in submit file:
+MaxRunTime = 3600
in config file:
SYSTEM_PERIODIC_HOLD = JobStatus == 2 && MaxRunTime =!= UNDEFINED && \
  (RemoteWallClockTime - CumulativeSuspensionTime) > MaxRunTime
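For the simpler per-job case mentioned first, here is a minimal sketch of
a periodic_hold line in the submit file. It reuses the same run-time test
as the config expression above. The 3600-second limit is only an
illustrative value, and it is worth checking on your Condor version that
these attributes are updated while the job is still running:
in submit file:
# illustrative limit: hold the job once it has run for more than one hour
periodic_hold = JobStatus == 2 && (RemoteWallClockTime - CumulativeSuspensionTime) > 3600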
The other thing I alluded to was a way to specify the amount of time a
job should be allowed to run without interruption. This doesn't really
apply to the parallel universe, because parallel universe jobs should
always run without preemption.
in submit file:
# this should finish in less than one hour
# if it does not, it is ok for it to be preempted
MaxJobRetirementTime = 3600
in execute machine config file:
# allow up to 2 days max of uninterrupted time for jobs
MaxJobRetirementTime = 3600*24*2
I hope that helps you.
--Dan