Re: [Condor-users] Priority issue
- Date: Tue, 12 Dec 2006 11:31:48 +0100
- From: Nicolas GUIOT <nicolas.guiot@xxxxxxx>
- Subject: Re: [Condor-users] Priority issue
Details (hope this can help):
For the next job of cluster 74 in the queue, condor_q -better-analyze gives the following (a job from cluster 72 gives approximately the same output):
root@rhea:~# condor_q -better-analyze 74.25
-- Submitter: rhea.my.domain : <172.XX.XX.XX:32772> : rhea.my.domain
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
---
074.025: Run analysis summary. Of 29 machines,
2 are rejected by your job's requirements
8 reject your job because of their own requirements
19 match but are serving users with a better priority in the pool
0 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
No successful match recorded.
Last failed match: Tue Dec 12 11:16:42 2006
Reason for last match failure: no match found
The Requirements expression for your job is:
( target.Arch == "INTEL" ) && ( target.OpSys == "LINUX" ) &&
( target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize ) &&
( target.HasFileTransfer )
    Condition                             Machines Matched  Suggestion
    ---------                             ----------------  ----------
1   ( target.Arch == "INTEL" )            27
2   ( target.OpSys == "LINUX" )           29
3   ( target.Disk >= 686 )                29
4   ( ( 1024 * target.Memory ) >= 571 )   29
5   ( target.HasFileTransfer )            29
The following attributes are missing from the job ClassAd:
CheckpointPlatform
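
As a cross-check on the summary above, the short Python sketch below parses the run analysis block and confirms that the per-reason counts add up to the 29 machines reported. The summary text is copied from the output above; the function name parse_summary and the script itself are illustrative assumptions, not part of Condor.

```python
import re

# Run analysis summary copied verbatim from the condor_q -better-analyze output.
SUMMARY = """\
074.025: Run analysis summary. Of 29 machines,
2 are rejected by your job's requirements
8 reject your job because of their own requirements
19 match but are serving users with a better priority in the pool
0 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
"""


def parse_summary(text):
    """Return (total_machines, [(count, reason), ...]) for one summary block.

    Hypothetical helper: it assumes the layout shown above, with the total
    on the first line and one "<count> <reason>" pair per following line.
    """
    lines = text.strip().splitlines()
    total = int(re.search(r"Of (\d+) machines", lines[0]).group(1))
    rows = []
    for line in lines[1:]:
        match = re.match(r"\s*(\d+)\s+(.*)", line)
        if match:
            rows.append((int(match.group(1)), match.group(2).strip()))
    return total, rows


total, rows = parse_summary(SUMMARY)
accounted = sum(count for count, _ in rows)
top_count, top_reason = max(rows, key=lambda row: row[0])

print(f"machines in pool: {total}, accounted for: {accounted}")
print(f"largest category: {top_count} machines '{top_reason}'")
```

Every machine is accounted for, and the largest category is the 19 machines that match the job but are reported as serving users with a better priority in the pool.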
----------------
On Tue, 12 Dec 2006 11:07:57 +0100
Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:
> Hi,
>
> I submitted a first cluster (72), which consists of about 150 queued jobs.
> I later submitted a second one (74), whose results I need first.
> So, once it was submitted, I changed the priority of cluster 74 with:
> condor_prio -p 500 74
> I also changed the priority of cluster 72 to -15.
>
> My problem is that only one job from cluster 74 is running, while the other CPUs are used by cluster 72. Even when a 72 job finishes, no new 74 job is started as long as one 74 job is already running.
>
> Here is the submit file (both clusters use a similar one):
>
> Universe = vanilla
>
> Executable = /nfs/rhea/attract
> arguments = T27_R_M-mutate.pdb T27_L.red $(Process)
> output = /nfs/MC2/output.$(Process).txt
> error = /nfs/MC2/ERROR.$(Process)
> Log = /nfs/MC2/LOG.$(Process)
>
>
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
> transfer_input_files = T27_R_M-mutate.pdb, T27_L.red, translat.dat, attract.inp, aminon.par, rotation.dat, standard.pdb
> notify_user = user@xxxxxxxxx
> notification = error
>
> queue 147
>
>
> Here is the (truncated) condor_q output:
>
> -- Submitter: rhea.my.domain : <172.XX.XX.XX:32772> : rhea.my.domain
> ID      OWNER    SUBMITTED   RUN_TIME   ST PRI  SIZE CMD
> 72.37   saladin  12/9  17:49 0+01:29:47 R  -15 701.1 attract T27_R_M-mu
> 72.38   saladin  12/9  17:49 0+00:13:23 R  -15   0.6 attract T27_R_M-mu
> 72.41   saladin  12/9  17:49 0+00:12:25 R  -15   0.6 attract T27_R_M-mu
> 72.42   saladin  12/9  17:49 0+00:00:00 I  -15   0.6 attract T27_R_M-mu
> 72.72   saladin  12/9  17:49 0+00:00:00 I  -15   0.6 attract T27_R_M-mu
> 72.73   saladin  12/9  17:49 0+00:00:00 I  -15   0.6 attract T27_R_M-mu
> 72.74   saladin  12/9  17:49 0+00:00:00 I  -15   0.6 attract T27_R_M-mu
> 72.145  saladin  12/9  17:49 0+00:00:00 I  -15   0.6 attract T27_R_M-mu
> 72.146  saladin  12/9  17:49 0+00:00:00 I  -15   0.6 attract T27_R_M-mu
> 74.24   saladin  12/11 12:43 0+00:06:35 R  500   0.6 attract T27_R_M-mu
> 74.25   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.26   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.27   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.28   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.29   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.30   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.31   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.32   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.33   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
> 74.34   saladin  12/11 12:43 0+00:00:00 I  500   0.6 attract T27_R_M-mu
>
> 190 jobs; 171 idle, 19 running, 0 held
> root@rhea:~#
>
> Thanks for any help.
> Nicolas
>
> ----------------------------------------------------
> CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
> Institut de Biologie Physico-Chimique
> 13 rue Pierre et Marie Curie
> 75005 PARIS - FRANCE
>
> Tel : +33 158 41 51 70
> Fax : +33 158 41 50 26
> ----------------------------------------------------
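
The truncated condor_q listing quoted above can be tallied per cluster and state with a few lines of Python. This is only a sketch: the rows are copied from the listing, the column positions are assumed from its header line, and the helper name tally is hypothetical. Because the listing is truncated, the tallies cover only the rows shown, not all 190 jobs.

```python
from collections import Counter

# Rows copied from the (truncated) condor_q listing in the quoted message.
LISTING = """\
72.37 saladin 12/9 17:49 0+01:29:47 R -15 701.1 attract T27_R_M-mu
72.38 saladin 12/9 17:49 0+00:13:23 R -15 0.6 attract T27_R_M-mu
72.41 saladin 12/9 17:49 0+00:12:25 R -15 0.6 attract T27_R_M-mu
72.42 saladin 12/9 17:49 0+00:00:00 I -15 0.6 attract T27_R_M-mu
72.72 saladin 12/9 17:49 0+00:00:00 I -15 0.6 attract T27_R_M-mu
72.73 saladin 12/9 17:49 0+00:00:00 I -15 0.6 attract T27_R_M-mu
72.74 saladin 12/9 17:49 0+00:00:00 I -15 0.6 attract T27_R_M-mu
72.145 saladin 12/9 17:49 0+00:00:00 I -15 0.6 attract T27_R_M-mu
72.146 saladin 12/9 17:49 0+00:00:00 I -15 0.6 attract T27_R_M-mu
74.24 saladin 12/11 12:43 0+00:06:35 R 500 0.6 attract T27_R_M-mu
74.25 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.26 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.27 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.28 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.29 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.30 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.31 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.32 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.33 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
74.34 saladin 12/11 12:43 0+00:00:00 I 500 0.6 attract T27_R_M-mu
"""


def tally(listing):
    """Count listed jobs per (cluster, state).

    Assumes whitespace-separated columns in the order of the header line:
    ID, OWNER, SUBMITTED (date and time), RUN_TIME, ST, PRI, SIZE, CMD.
    """
    counts = Counter()
    for line in listing.strip().splitlines():
        fields = line.split()
        cluster = fields[0].split(".")[0]  # "74.25" -> "74"
        state = fields[5]                  # "R" (running) or "I" (idle)
        counts[(cluster, state)] += 1
    return counts


counts = tally(LISTING)
for (cluster, state), n in sorted(counts.items()):
    print(f"cluster {cluster}, state {state}: {n} job(s) listed")
```

Among the rows shown, cluster 74 has a single running job (74.24) while three jobs of cluster 72 are running, which matches the behaviour described in the message.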