[Condor-users] Baby steps toward Parallel Universe
- Date: Mon, 11 May 2009 15:50:45 -0400
- From: "Jonathan D. Proulx" <jon@xxxxxxxxxxxxx>
- Subject: [Condor-users] Baby steps toward Parallel Universe
Hi All,
I'm trying to get parallel universe jobs to run under 7.2. For starters
I'm just trying to sleep:
Universe = parallel
# only send email if there's an error
Notification = Error
Executable = /bin/sleep
Arguments = 30
machine_count = 4
queue
I've configured a number of execute nodes with:
condor_status -constraint 'DedicatedScheduler == "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxxxx"' -total
             Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX    87     0      31        54       2          0        0
       Total    87     0      31        54       2          0        0
On these systems START is set to TRUE, and all the preempt- and
suspend-related macros have '&& ( MY.NiceUser == True)' at the end so
only NiceUser jobs get interrupted.
The jobs just sit Idle in the queue despite the unclaimed systems
shown above, and -analyze shows:
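To illustrate, the pattern looks roughly like the fragment below. This is only a sketch: the $(SUSPEND_BASE) and $(PREEMPT_BASE) macros are hypothetical stand-ins for whatever expressions the policy already had, not the actual values from this pool, and only two of the affected macros are shown.

```
# Sketch of the policy pattern described above.
# SUSPEND_BASE and PREEMPT_BASE are hypothetical placeholder macros,
# assumed to hold the pre-existing policy expressions.
START   = TRUE
SUSPEND = ( $(SUSPEND_BASE) ) && ( MY.NiceUser == True)
PREEMPT = ( $(PREEMPT_BASE) ) && ( MY.NiceUser == True)
```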
-- Submitter: borg-login-1.csail.mit.edu : <128.30.112.26:58458> : borg-login-1.csail.mit.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
---
32234.000:  Run analysis summary.  Of 263 machines,
     26 are rejected by your job's requirements
     17 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
    159 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
     61 are available to run your job
Any clues where I'm going off into the weeds on this one?
Thanks,
-Jon