
Re: [Condor-users] Condor configuration question



Thanks for the reply Steve. Here is another question:

I want to preempt a job. Here is the configuration I have on both the master/submit machine and the execute machine:

[bala@node2 condor-7.1.4]$ condor_config_val WANT_VACATE
(CurrentTime - JobStart) > 10

[bala@node2 condor-7.1.4]$ condor_config_val PREEMPTION_REQUIREMENTS
(CurrentTime - JobStart) > 10

[bala@node2 condor-7.1.4]$ condor_config_val PREEMPT
(CurrentTime - JobStart) > 10

[bala@node2 condor-7.1.4]$ condor_config_val MaxJobRetirementTime
10

I ran condor_reconfig on all the machines. With this configuration in place, I expected every job to be preempted 10 seconds after it starts and then to have 10 seconds to clean up before being killed. Instead, the jobs I submit, which usually run for 3 minutes, take around 30 minutes (I often see the job in the Idle state) and then complete, which is not what I expected.

Any idea what's wrong with the configuration? I would like Condor to kill my jobs after 20 seconds.

Thanks.
.Bala.

Steven Timm wrote:
There are two ways to do it.

a) Here is a PREEMPTION_REQUIREMENTS expression much like
the UWCS default one:

[root@fcdf2x1 ~]# condor_config_val PREEMPTION_REQUIREMENTS
(((CurrentTime - EnteredCurrentState) > (1 * (10 * 60)) && RemoteUserPrio
> SubmittorPrio * 1.2) && RemoteUser =!= "cdf@xxxxxxxx" && RemoteUser =!=
"cdffgrid@xxxxxxxx" && RemoteUser =!= "cdfnam@xxxxxxxx" && RemoteUser =!= "cdfdev@xxxxxxxx"

All you have to do is raise the time threshold from the 600 seconds
shown above to however many seconds you want.

b) The second thing you can do is set a nonzero MaxJobRetirementTime,
so that jobs can still be preempted but each job still has
MaxJobRetirementTime seconds to finish.

In both of the scenarios above, machine RANK should be set to zero.
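
As a rough sketch of how the two options above might look in a configuration file for the 11-hour case from the original question: the 11-hour values and the split between execute machines and the central manager are assumptions made for illustration, not settings taken from a tested pool. Only one of the two options would normally be needed.

```
# Sketch only: illustrative values, adjust for your own pool.

# Execute machines (condor_startd): express no preference between jobs,
# so machine-RANK preemption never triggers.
RANK = 0

# Option (a), central manager (condor_negotiator): allow priority-based
# preemption only after the claim has been in its current state for
# 11 hours (value in seconds).
PREEMPTION_REQUIREMENTS = (CurrentTime - EnteredCurrentState) > (11 * 60 * 60) && RemoteUserPrio > SubmittorPrio * 1.2

# Option (b), execute machines (condor_startd): a job selected for
# preemption may keep running until its retirement time is used up
# (value in seconds).
MAXJOBRETIREMENTTIME = 11 * 60 * 60
```

After editing the files, condor_config_val can be used to confirm the value each machine actually sees, and condor_reconfig to make the daemons reread their configuration.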

Steve Timm



On Tue, 6 Jan 2009, Balamurali Ananthan wrote:

Greetings!

I am wondering whether it is possible to configure Condor so that a remote
user's job is not preempted before a certain amount of time has elapsed.

For example, userx submits a job that runs for more or less 10 hours. I want
to configure Condor so that the job, once started on an execute
machine, is not disturbed for 11 hours.

If this is possible, could someone please point me to the right
documentation.

Thanks much!
.Bala.

--
Balamurali Ananthan (bala@xxxxxxxxxx) (720.974.1843)	
Tech-X Corp, 5621 Arapahoe Ave, Suite A, Boulder, CO 80303