> You can get an example here:
> http://spinningmatt.wordpress.com/2011/09/16/submitting-a-dag-via-aviary-using-python/

That's pointing me in the right direction. However, if I submit a dagman job
using the python API it fails to start, and I am now stuck. Here's my code:

---- 8< ----
#!/usr/bin/env python
from __future__ import print_function
import htcondor, classad
import os, sys

DAGMAN = "/usr/bin/condor_dagman"

dag = sys.argv[1]
os.stat(dag)  # test for existence

schedd = htcondor.Schedd()
ad = classad.ClassAd({
    "JobUniverse": 7,
    "Cmd": DAGMAN,
    "Arguments": "-f -l . -Lockfile %s.lock -AutoRescue 1 -DoRescueFrom 0 " \
        "-Dag %s -Suppress_notification -Force -Dagman %s" % (dag, dag, DAGMAN),
    "Env": "_CONDOR_MAX_DAGMAN_LOG=0;_CONDOR_DAGMAN_LOG=%s.dagman.out;" \
        "_CONDOR_SCHEDD_DAEMON_AD_FILE=%s;_CONDOR_SCHEDD_ADDRESS_FILE=%s" % \
        (dag, htcondor.param["SCHEDD_DAEMON_AD_FILE"],
         htcondor.param["SCHEDD_ADDRESS_FILE"]),
    "EnvDelim": ";",
    "Out": "%s.lib.out" % dag,
    "Err": "%s.lib.err" % dag,
    "ShouldTransferFiles": "IF_NEEDED",
    "UserLog": os.path.abspath("%s.dagman.log" % dag),
    "KillSig": "SIGTERM",
    "RemoveKillSig": "SIGUSR1",
    #"OtherJobRemoveRequirements": classad.ExprTree('eval(strcat("DAGManJobId == ", ClusterId))'),
    "OnExitRemove": classad.ExprTree('( ExitSignal =?= 11 || ( ExitCode =!= undefined && ExitCode >= 0 && ExitCode <= 2 ) )'),
    "FileSystemDomain": htcondor.param['FILESYSTEM_DOMAIN'],
    #"TransferIn": classad.ExprTree('false'),
    #"TransferInputSizeMB": 0,
})
cluster = schedd.submit(ad)
print("Submitted as cluster %d" % cluster)
---- 8< ----

This happily submits a job, but it sits in the queue in Idle (I) state
indefinitely. /var/log/condor/SchedLog shows:

12/17/13 16:16:56 (pid:20910) The Requirements attribute for job 528436.0 did not evaluate.  Unable to start job

condor_q -analyze shows:

---- 8< ----
528436.000:  Request has not yet been considered by the matchmaker.

User priority for brian@xxxxxxxxxxx is not available, attempting to analyze without it.
---
528436.000:  Run analysis summary.  Of 12 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
     12 are available to run your job
WARNING:  Analysis is meaningless for Scheduler universe jobs.
---- 8< ----

condor_q -long shows:

Requirements = true && TARGET.OPSYS == "LINUX" && TARGET.ARCH == "X86_64" &&
    ( TARGET.HasFileTransfer || ( TARGET.FileSystemDomain == MY.FileSystemDomain ) ) &&
    TARGET.Disk >= RequestDisk && TARGET.Memory >= RequestMemory

and related attributes:

RequestDisk = DiskUsage
DiskUsage = 1
RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,( ImageSize + 1023 ) / 1024)
ImageSize = 100

This requirements expression is slightly different from what I get if I
submit the job using condor_submit: then I get

Requirements = ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
    ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )

If I try setting, for example,

    "Requirements": classad.ExprTree("wombat"),

then it becomes

Requirements = wombat && TARGET.OPSYS == "LINUX" && TARGET.ARCH == "X86_64" &&
    ( TARGET.HasFileTransfer || ( TARGET.FileSystemDomain == MY.FileSystemDomain ) ) &&
    TARGET.Disk >= RequestDisk && TARGET.Memory >= RequestMemory

so it looks like the remainder of this expression is being appended by condor
at submission time. But I don't know why a job submitted via the python API
should end up with a different requirements expression - and in any case, I
can't tell whether it is this expression that fails to evaluate, or whether
there's some other reason the job won't start.

I have also tried:

    "Requirements": classad.ExprTree('( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )'),

but in this case it still gets

    && ( TARGET.HasFileTransfer || ( TARGET.FileSystemDomain == MY.FileSystemDomain ) )

appended when I look at condor_q -long.

Clues gratefully received.
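For reference, the ClassAd above is my attempt to reproduce through the python API what a scheduler-universe submit file for condor_dagman would say. This is a from-memory sketch of that submit description, not a verbatim copy of what condor_submit_dag generates, with a hypothetical dag file name "my.dag" standing in for the real one:

```
# sketch of a scheduler-universe submit file for the same dagman job
# ("my.dag" is a placeholder dag file name)
universe        = scheduler
executable      = /usr/bin/condor_dagman
output          = my.dag.lib.out
error           = my.dag.lib.err
log             = my.dag.dagman.log
remove_kill_sig = SIGUSR1
arguments       = -f -l . -Lockfile my.dag.lock -AutoRescue 1 -DoRescueFrom 0 -Dag my.dag -Suppress_notification -Force -Dagman /usr/bin/condor_dagman
environment     = _CONDOR_MAX_DAGMAN_LOG=0;_CONDOR_DAGMAN_LOG=my.dag.dagman.out
on_exit_remove  = ( ExitSignal =?= 11 || ( ExitCode =!= UNDEFINED && ExitCode >= 0 && ExitCode <= 2 ) )
queue
```

Submitting that file with condor_submit is what gives me the shorter Requirements expression quoted above.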
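In case anyone wants to reproduce the comparison, this is roughly how I've been pulling the stored Requirements back out through the bindings rather than via condor_q (a sketch only: the cluster id 528436 is just the example from above, and the snippet falls back to None when the bindings or a running schedd aren't available):

```python
# Sketch: read back the Requirements expression the schedd actually stored
# for the job, for comparison with what condor_submit produces.
# Assumes the htcondor bindings are importable and a schedd is reachable;
# otherwise we fall back to None so the snippet is safe to run anywhere.
try:
    import htcondor
    schedd = htcondor.Schedd()
    ads = schedd.query('ClusterId == 528436', ['Requirements'])
    requirements = [str(ad['Requirements']) for ad in ads]
except Exception:
    # bindings missing, or no schedd to talk to
    requirements = None

print(requirements)
```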
I am using condor 8.0.4-189770 under Ubuntu 12.04 x86_64.

Thanks,

Brian.