Thanks Ian and Bradley, I understand what you have suggested. That is really great.
I still have some doubts. I don't want to use the MaxRunHours parameter in the submit file, because that gives control to the users. I want this control to stay with Condor.
So the other option suits me better: specifying SYSTEM_PERIODIC_REMOVE in the configuration file on the submit machine. But as I understand it, if I set the limit to 1 hour, then any user whose jobs take more than 1 hour will always have them killed. That is why I want to
have different queues if possible, so that control lies fully with Condor.
In my case, what sometimes happens is that a job remains in Running status but is not actually running; I discussed this in some of my earlier posts.
To Condor the job is running, so the resources used by these jobs stay blocked, and hence the idle jobs remain idle.
Is there any way to resolve this issue? I don't know why a
job continues in the running state for several days when it should actually complete in 8-9 hours.
One more thing I would like to know: is there any way to specify a time parameter in the submit file, perhaps MaxRunHours, after which the
particular job gets resubmitted automatically? I hope there is something for this in Condor that I am missing, or some helper scripts that check the job status against MaxRunHours
and then resubmit the job. That would be very helpful to me.
Thanks and regards,
On Fri, Jan 27, 2012 at 10:56 PM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
Hi Raman,
I'll expand a little on what Ian said.
The Condor way of doing this sort of thing is to add a ClassAd
attribute to the job. For example, in the submit file, the user
could put the following:
+MaxRunHours = 8
If you want to automatically remove jobs that run for longer than
the specified max runtime, you can put the following in the Condor
configuration on the submit machine:
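The expression itself is not shown at this point in the message. A minimal sketch of what such a policy could look like, assuming the standard job attributes JobStatus (2 means running) and JobCurrentStartDate, and the ClassAd time() function; this is an illustration, not necessarily the exact expression Dan posted:

```
# Sketch only: remove running jobs once they exceed MaxRunHours.
# Assumes every job defines MaxRunHours (in hours).
SYSTEM_PERIODIC_REMOVE = (JobStatus == 2) && \
    ((time() - JobCurrentStartDate) > (MaxRunHours * 3600))
```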
The above expression assumes that MaxRunHours is always defined. A
slightly more complicated expression could supply a default value of
1 hour if MaxRunHours is undefined by the user:
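This expression is also not shown in the message. One way to sketch the default, assuming the ClassAd functions ifThenElse() and isUndefined() and the same job attributes as before; treat it as an illustration rather than Dan's exact wording:

```
# Sketch only: fall back to a 1-hour limit when the job
# does not define MaxRunHours.
SYSTEM_PERIODIC_REMOVE = (JobStatus == 2) && \
    ((time() - JobCurrentStartDate) > \
     (3600 * ifThenElse(isUndefined(MaxRunHours), 1, MaxRunHours)))
```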
There are more things you might want to configure based on
MaxRunHours (e.g. preemption policy), but the above should implement
the basic policy of a strict upper bound on job runtime.
--Dan
On 1/27/12 7:11 AM, Ian Chesal wrote:
On Friday, 27 January, 2012 at 3:33 AM, Raman Sehgal wrote:
Hello all,
I was wondering if it is possible to have multiple job
queues on the same machine,
so that users can submit a job to a queue depending on
their requirements.
For example,
some users have long jobs that run for, say, 8 hours,
while other users have short jobs
that run for only 1 hour.
So, if possible, can I have two job queues, namely "one
hour" and "10 hour",
so that users with short jobs submit to the "one hour"
queue and
users with long jobs submit to the "10 hour" queue?
If a job's execution time exceeds the time allocated to
its queue, then the job should
either be resubmitted or killed.
This sort of thing is generally unnecessary with Condor. You
can set a per-submission rule that can be used to terminate the
jobs in the submission if they violate some policy you want to
put in place.
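As a rough illustration of such a per-submission rule, a submit file could carry a periodic_remove expression. This is a sketch, not part of Ian's message: the 8-hour limit is an assumed value, and the expression relies on the standard job attributes JobStatus (2 means running) and JobCurrentStartDate:

```
# Sketch only: in the submit file, remove this submission's jobs
# if they have been running for more than 8 hours (assumed limit).
periodic_remove = (JobStatus == 2) && ((time() - JobCurrentStartDate) > (8 * 3600))
```

Because the rule lives in the submit file, each submission can choose its own limit, which is what makes separate queues unnecessary.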