> You can get an example here:
> http://spinningmatt.wordpress.com/2011/09/16/submitting-a-dag-via-aviary-using-python/
That's pointing me in the right direction. However, if I submit a
dagman job using the python API it fails to start, and I am now
stuck.
Here's my code:
---- 8< ----
#!/usr/bin/env python
from __future__ import print_function
import htcondor, classad
import os, sys

DAGMAN = "/usr/bin/condor_dagman"
dag = sys.argv[1]
os.stat(dag)  # test for existence
schedd = htcondor.Schedd()
ad = classad.ClassAd({
    "JobUniverse": 7,
    "Cmd": DAGMAN,
    "Arguments": "-f -l . -Lockfile %s.lock -AutoRescue 1 -DoRescueFrom 0 "
                 "-Dag %s -Suppress_notification -Force -Dagman %s" %
                 (dag, dag, DAGMAN),
    "Env": "_CONDOR_MAX_DAGMAN_LOG=0;_CONDOR_DAGMAN_LOG=%s.dagman.out;"
           "_CONDOR_SCHEDD_DAEMON_AD_FILE=%s;_CONDOR_SCHEDD_ADDRESS_FILE=%s" %
           (dag, htcondor.param["SCHEDD_DAEMON_AD_FILE"],
            htcondor.param["SCHEDD_ADDRESS_FILE"]),
    "EnvDelim": ";",
    "Out": "%s.lib.out" % dag,
    "Err": "%s.lib.err" % dag,
    "ShouldTransferFiles": "IF_NEEDED",
    "UserLog": os.path.abspath("%s.dagman.log" % dag),
    "KillSig": "SIGTERM",
    "RemoveKillSig": "SIGUSR1",
    #"OtherJobRemoveRequirements": classad.ExprTree('eval(strcat("DAGManJobId == ", ClusterId))'),
    "OnExitRemove": classad.ExprTree(
        '( ExitSignal =?= 11 || '
        '( ExitCode =!= undefined && ExitCode >= 0 && ExitCode <= 2 ) )'),
    "FileSystemDomain": htcondor.param['FILESYSTEM_DOMAIN'],
    #"TransferIn": classad.ExprTree('false'),
    #"TransferInputSizeMB": 0,
})
cluster = schedd.submit(ad)
print("Submitted as cluster %d" % cluster)
---- 8< ----
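
The long Arguments and Env values are easy to break when a mail client re-wraps the string literals. As a minimal sketch, they could be assembled from lists instead. The helper names dagman_arguments and dagman_env are hypothetical (they are not part of htcondor), and the code only reproduces the strings used in the script above; it does not talk to a schedd.

```python
def dagman_arguments(dag, dagman="/usr/bin/condor_dagman"):
    """Build the same Arguments string as the job ad in the script above."""
    parts = [
        "-f", "-l", ".",
        "-Lockfile", "%s.lock" % dag,
        "-AutoRescue", "1",
        "-DoRescueFrom", "0",
        "-Dag", dag,
        "-Suppress_notification",
        "-Force",
        "-Dagman", dagman,
    ]
    return " ".join(parts)


def dagman_env(dag, daemon_ad_file, address_file, delim=";"):
    """Build the same Env string, joined with the EnvDelim character."""
    pairs = [
        ("_CONDOR_MAX_DAGMAN_LOG", "0"),
        ("_CONDOR_DAGMAN_LOG", "%s.dagman.out" % dag),
        ("_CONDOR_SCHEDD_DAEMON_AD_FILE", daemon_ad_file),
        ("_CONDOR_SCHEDD_ADDRESS_FILE", address_file),
    ]
    return delim.join("%s=%s" % pair for pair in pairs)
```

In the script, the two file paths would come from htcondor.param["SCHEDD_DAEMON_AD_FILE"] and htcondor.param["SCHEDD_ADDRESS_FILE"], as before.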
This happily submits a job, but it sits in the queue in Idle (I)
state indefinitely.
/var/log/condor/SchedLog shows:
12/17/13 16:16:56 (pid:20910) The Requirements attribute for job
528436.0 did not evaluate. Unable to start job
condor_q -analyze shows:
---- 8< ----
528436.000: Request has not yet been considered by the matchmaker.
User priority for brian@xxxxxxxxxxx is not available, attempting to analyze without it.
---
528436.000: Run analysis summary. Of 12 machines,
0 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match and are already running your jobs
0 match but are serving other users
12 are available to run your job
WARNING: Analysis is meaningless for Scheduler universe jobs.
---- 8< ----
condor_q -long shows:
Requirements = true && TARGET.OPSYS == "LINUX" &&
TARGET.ARCH == "X86_64" && ( TARGET.HasFileTransfer || (
TARGET.FileSystemDomain == MY.FileSystemDomain ) ) &&
TARGET.Disk >= RequestDisk && TARGET.Memory >=
RequestMemory
and related attributes:
RequestDisk = DiskUsage
DiskUsage = 1
RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,(
ImageSize + 1023 ) / 1024)
ImageSize = 100
This requirements expression is slightly different to what I get if
I submit the job using condor_submit: then I get
Requirements = ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys
== "LINUX" ) && ( TARGET.Disk >= RequestDisk ) &&
( TARGET.Memory >= RequestMemory )
If I try setting, for example, "Requirements":
classad.ExprTree("wombat"), then it becomes
Requirements = wombat && TARGET.OPSYS == "LINUX" &&
TARGET.ARCH == "X86_64" && ( TARGET.HasFileTransfer || (
TARGET.FileSystemDomain == MY.FileSystemDomain ) ) &&
TARGET.Disk >= RequestDisk && TARGET.Memory >=
RequestMemory
so it looks like the remainder of this expression is being set by
condor at submission time. But I don't know why a job submitted via
the python API should have a different requirements expression - and
in any case, I can't tell if this expression is failing, or there's
some other reason.
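
One way to reason about the "did not evaluate" message is through three-valued logic: in the ClassAd language a comparison against a missing attribute yields UNDEFINED, and && and || propagate UNDEFINED unless the other operand already decides the result. The sketch below is a plain-Python model of that behaviour, not the classad library. The helper names (and3, or3, eq3, ge3) and the sample ads are hypothetical, and it is only an assumption that whatever ad the schedd evaluates against lacks HasFileTransfer and FileSystemDomain.

```python
UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value


def and3(*vals):
    """Three-valued AND: False wins, then UNDEFINED, else True."""
    if any(v is False for v in vals):
        return False
    if any(v is UNDEFINED for v in vals):
        return UNDEFINED
    return True


def or3(*vals):
    """Three-valued OR: True wins, then UNDEFINED, else False."""
    if any(v is True for v in vals):
        return True
    if any(v is UNDEFINED for v in vals):
        return UNDEFINED
    return False


def eq3(a, b):
    """Equality that is UNDEFINED when either side is missing."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a == b


def ge3(a, b):
    """>= that is UNDEFINED when either side is missing."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a >= b


def requirements(target, my):
    """Model of the Requirements expression shown by condor_q -long."""
    return and3(
        True,
        eq3(target.get("OpSys"), "LINUX"),
        eq3(target.get("Arch"), "X86_64"),
        or3(
            target.get("HasFileTransfer"),
            eq3(target.get("FileSystemDomain"), my.get("FileSystemDomain")),
        ),
        ge3(target.get("Disk"), my.get("RequestDisk")),
        ge3(target.get("Memory"), my.get("RequestMemory")),
    )


# Hypothetical ads: the target has no HasFileTransfer or FileSystemDomain.
my_ad = {"FileSystemDomain": "example.org", "RequestDisk": 1, "RequestMemory": 1}
target_ad = {"OpSys": "LINUX", "Arch": "X86_64", "Disk": 1000, "Memory": 2048}
print(requirements(target_ad, my_ad))
```

Under those assumptions the extra file-transfer clause is the only term that comes out UNDEFINED, which makes the whole expression UNDEFINED rather than true or false. Whether that is what actually happens inside the schedd for a scheduler universe job is not something this model can confirm.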
I have also tried:
"Requirements": classad.ExprTree('( TARGET.Arch == "X86_64" )
&& ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk
>= RequestDisk ) && ( TARGET.Memory >= RequestMemory
)'),
but in this case it still gets
&& ( TARGET.HasFileTransfer || ( TARGET.FileSystemDomain ==
MY.FileSystemDomain ) )
appended when I look at condor_q -long.
Clues gratefully received. I am using condor 8.0.4-189770 under
Ubuntu 12.04 x86_64.
Thanks,
Brian.