Date: | Fri, 04 Feb 2005 12:03:50 -0800 |
---|---|
From: | "Michael S. Root" <mike@xxxxxxxxxxxxxx> |
Subject: | [Condor-users] Dagman & Job Priorities |
Hi everyone. I've looked through the Condor documentation and the mailing list archive for this, but haven't found what I'm looking for yet.

The problem is this: a single user of ours is frequently running multiple DAGMan jobs. The behavior we get is that jobs from all of that user's DAGs run concurrently (within the user's resource limits), so the DAGs all finish at roughly the same time. Sometimes a DAG with just one job left will sit in the queue for hours waiting for the same user's other DAGs (with more unfinished jobs) to 'catch up'.

What we would like is for all the jobs from the first DAG submitted to finish first, then the second, and so on. Since the DAGs are not necessarily related in terms of what they're processing, and usually aren't submitted at the same time, it doesn't make sense to have one DAG depend on another.

I thought about setting the machine RANK expression to "( -1 * DAGManJobId )", so that the lowest-numbered DAGMan jobs would be preferred. I haven't tried it yet, though, because I'm not sure whether this expression would apply before or after the machine has been matched to a user. Would a job with a low user priority and a low-numbered DAGManJobId get priority over another job with a higher user priority but a higher-numbered DAGManJobId?

Even better would be a way to look at the job priority of DAGMan itself and have sub-jobs get chosen based on that. I have noticed that changing a DAGMan job's priority doesn't have any effect on its children. It wouldn't be hard to write a script to change the priority of all of a DAG's children, but it would have to be run repeatedly, each time DAGMan submits more jobs into the queue (we often run with a -maxjobs limit).

Does anyone have any clever suggestions on how to implement something like this?

-Mike
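For reference, the priority-propagation script mentioned above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: it assumes `condor_q` accepts `-constraint` and `-format` as described in its manual page, that `condor_prio -p <priority> <cluster.proc>` sets a job's priority, and that child jobs carry a `DAGManJobId` attribute equal to the DAGMan job's cluster id. The function names are hypothetical.

```python
import subprocess


def parse_job_ids(text):
    """Turn condor_q output (one 'cluster.proc' per line) into a list of ids."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def prio_commands(job_ids, priority):
    """Build one condor_prio command line per child job (nothing is run here)."""
    return [["condor_prio", "-p", str(priority), job_id] for job_id in job_ids]


def child_job_ids(dagman_cluster_id):
    """Ask condor_q for the queued jobs submitted by the given DAGMan job.

    Assumption: children are identified by DAGManJobId == <DAGMan cluster id>.
    """
    result = subprocess.run(
        [
            "condor_q",
            "-constraint", f"DAGManJobId == {dagman_cluster_id}",
            "-format", "%d.", "ClusterId",
            "-format", "%d\n", "ProcId",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_job_ids(result.stdout)


def propagate_priority(dagman_cluster_id, priority):
    """Apply the chosen priority to every child currently in the queue."""
    for command in prio_commands(child_job_ids(dagman_cluster_id), priority):
        subprocess.run(command, check=True)
```

Calling `propagate_priority(1234, 10)` (with `1234` standing in for a DAGMan job's cluster id) would touch only the children that are in the queue at that moment, so it would still need to be rerun, for example from cron, whenever DAGMan submits more jobs.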