
Re: [HTCondor-users] [External] schduler universe job won't start on the submisson machine



Hi Wendy! Go blue! Have you tried running it in the “local” universe instead of the scheduler universe? This is in the docs:

However, unlike the local universe, the scheduler universe does not use a condor_starter daemon to manage the job, and thus offers limited features and policy support. The local universe is a better choice for most jobs which must run on the submit host, as it offers a richer set of job management features, and is more consistent with other universes such as the vanilla universe. The scheduler universe may be retired in the future, in favor of the newer local universe.

 

However, are you using condor_submit_dag? DAGMan itself is what is intended to run in the scheduler universe, and that’s handled internally, but the node jobs within the DAG would typically run in the vanilla universe. Where in the submit file or DAG are you specifying the scheduler universe?
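
For the local-universe suggestion above, a minimal sketch of a submit description might look like the following. The executable and file names are hypothetical placeholders, not taken from Wendy's setup, and a real submit file would likely carry additional site-specific settings.

```
# Sketch only: my_script.sh and the output/error/log names are
# hypothetical placeholders chosen for illustration.
universe   = local
executable = my_script.sh
output     = my_script.out
error      = my_script.err
log        = my_script.log
queue
```

Submitting a file like this with condor_submit should start the job on the submit host under a condor_starter, rather than directly under the schedd as in the scheduler universe.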

 

Michael V. Pelletier
Digital Technology
HPC Support Team
Raytheon Missiles and Defense

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of wuwj@xxxxxxxxx
Sent: Thursday, January 12, 2023 12:33 PM
To: htcondor-users <htcondor-users@xxxxxxxxxxx>
Subject: [External] [HTCondor-users] schduler universe job won't start on the submisson machine

 

Hi All, 

We are running HTCondor 9.0.17. I have a DAG job with the universe set to scheduler. After submission, the job gets zero matched slots and sits idle forever. (I would have expected it to be executed on the submit node.)

Here [1] is the condor_q output. I wonder whether we are missing some configuration needed to support the scheduler universe? [2] shows some configuration that might be relevant.

Cheers!

[1]

The Requirements expression for job 1283.000 is

    ((TARGET.TotalDisk >= 21000000 && TARGET.IsSL7WN is true && TARGET.AGLT2_SITE == "UM")) && (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory)

Job 1283.000 defines the following attributes:

    DiskUsage = 1000
    RequestDisk = DiskUsage
    RequestMemory = 3200

The Requirements expression for job 1283.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]       12044  TARGET.TotalDisk >= 21000000
[1]       11993  TARGET.IsSL7WN is true
[2]       11981  [0] && [1]
[3]        6427  TARGET.AGLT2_SITE == "UM"
[4]        6354  [2] && [3]

WARNING: Analysis is meaningless for Scheduler universe jobs.

1283.000:  This schedd's StartSchedulerUniverse evalutes to true for this job.

1283.000:  Run analysis summary ignoring user priority.  Of 1 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      0 are able to run your job



[2]

ALWAYS_VM_UNIV_USE_NOBODY = false

DEFAULT_UNIVERSE = 

ENABLE_KERNEL_TUNING = true

IsMPI = (TARGET.JobUniverse == $(MPI))

IsStandard = (TARGET.JobUniverse == $(STANDARD))

IsVanilla = (TARGET.JobUniverse == $(VANILLA))

IsVM = (TARGET.JobUniverse == $(VM))

KERNEL_TUNING_LOG = $(LOG)/KernelTuning.log

LINUX_KERNEL_TUNING_SCRIPT = $(LIBEXEC)/linux_kernel_tuning

LOCAL_UNIV_EXECUTE = $(SPOOL)/local_univ_execute

LOCAL_UNIVERSE_JOB_CLEANUP_RETRY_DELAY = 30

LOCAL_UNIVERSE_MAX_JOB_CLEANUP_RETRIES = 5

SCHED_UNIV_RENICE_INCREMENT = 0

START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 200

START_SCHEDULER_UNIVERSE = 500

SYSTEM_STARTD_JOB_ATTRS = ImageSize, ExecutableSize, JobUniverse, NiceUser, CPUsUsage, ResidentSetSize, ProportionalSetSizeKb, MemoryUsage, DiskUsage, ScratchDirFileCount

SYSTEM_VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history, Accountant.log, Accountantnew.log, local_univ_execute, .pgpass, .schedd_address, .schedd_address.super, .schedd_classad, OfflineLog

UNICORE_GAHP = $(SBIN)/unicore_gahp

VM_UNIV_NOBODY_USER = 

 


Wendy/Wenjing