Hi Wendy! Go blue!

Have you tried running it in the “local” universe instead of the scheduler universe? This is in the docs:

However, are you using condor_submit_dag? DAGMan is what is intended to run in the scheduler universe, and that’s handled internally, but the job nodes within the DAG would typically run in the vanilla universe. Where in the submission or DAG are you specifying the scheduler universe?

Michael V. Pelletier

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx>
On Behalf Of wuwj@xxxxxxxxx

Hi All,

Our HTCondor version is 9.0.17. I have a DAG job with the universe set to scheduler. After submission, the job gets zero matched slots and sits idle forever. (I would think it would be executed on the submission node.)

Here [1] is the condor_q output. I wonder whether we are missing some configuration needed to support the scheduler universe? [2] is some configuration which might be relevant.

Cheers!

[1]
The Requirements expression for job 1283.000 is

((TARGET.TotalDisk >= 21000000 && TARGET.IsSL7WN is true && TARGET.AGLT2_SITE == "UM")) && (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory)

Job 1283.000 defines the following attributes:

DiskUsage = 1000

[2]
ALWAYS_VM_UNIV_USE_NOBODY = false
DEFAULT_UNIVERSE =
ENABLE_KERNEL_TUNING = true
IsMPI = (TARGET.JobUniverse == $(MPI))
IsStandard = (TARGET.JobUniverse == $(STANDARD))
IsVanilla = (TARGET.JobUniverse == $(VANILLA))
IsVM = (TARGET.JobUniverse == $(VM))
KERNEL_TUNING_LOG = $(LOG)/KernelTuning.log
LINUX_KERNEL_TUNING_SCRIPT = $(LIBEXEC)/linux_kernel_tuning
LOCAL_UNIV_EXECUTE = $(SPOOL)/local_univ_execute
LOCAL_UNIVERSE_JOB_CLEANUP_RETRY_DELAY = 30
LOCAL_UNIVERSE_MAX_JOB_CLEANUP_RETRIES = 5
SCHED_UNIV_RENICE_INCREMENT = 0
START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 200
START_SCHEDULER_UNIVERSE = 500
SYSTEM_STARTD_JOB_ATTRS = ImageSize, ExecutableSize, JobUniverse, NiceUser, CPUsUsage, ResidentSetSize, ProportionalSetSizeKb, MemoryUsage, DiskUsage, ScratchDirFileCount
SYSTEM_VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history, Accountant.log, Accountantnew.log, local_univ_execute, .pgpass, .schedd_address, .schedd_address.super, .schedd_classad, OfflineLog
UNICORE_GAHP = $(SBIN)/unicore_gahp
VM_UNIV_NOBODY_USER =

Wendy/Wenjing
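
For readers following the thread, here is a minimal sketch of the two setups raised in the reply at the top: a standalone job submitted in the local universe, and a DAG whose node jobs use the vanilla universe while condor_submit_dag takes care of running DAGMan itself. All file names (local_job.sub, node.sub, workflow.dag, run_on_submit_node.sh, do_work.sh) are hypothetical and are not taken from the original post; this is an illustration under those assumptions, not the poster's actual configuration.

```
# --- local_job.sub (hypothetical) ---
# A single job that runs on the submit node via the local universe.
universe   = local
executable = run_on_submit_node.sh
output     = local_job.out
error      = local_job.err
log        = local_job.log
queue

# --- node.sub (hypothetical) ---
# An ordinary DAG node job; it is matched to an execute slot as usual.
universe   = vanilla
executable = do_work.sh
output     = node.out
error      = node.err
log        = node.log
queue

# --- workflow.dag (hypothetical) ---
# No universe is set in the DAG file itself.
JOB A node.sub
```

Under these assumptions, the first file would be submitted with "condor_submit local_job.sub" and the DAG with "condor_submit_dag workflow.dag"; neither the node submit file nor the DAG file needs to mention the scheduler universe.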