
[HTCondor-users] condor_q -analyze and -better-analyze : what am I missing?



Hi,

The analysis of a long-waiting job:

-- Schedd: taai-007.nikhef.nl : <145.107.7.246:9618?...
The Requirements expression for job 214860.000 is

    (Machine != "wn-pijl-002.nikhef.nl") && (Machine != "wn-lot-001.nikhef.nl")

Job 214860.000 defines the following attributes:


The Requirements expression for job 214860.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]         867  Machine != "wn-pijl-002.nikhef.nl"
[1]         857  Machine != "wn-lot-001.nikhef.nl"
[2]         851  [0] && [1]

No successful match recorded.
Last failed match: Thu Jul 11 13:12:52 2024

Reason for last match failure: no match found

214860.000:  Run analysis summary ignoring user priority.  Of 86 machines,
      2 are rejected by your job's requirements
     31 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
     53 are able to run your job

The job is asking for 64 cores. There are 57 machines with 64 cores; two of them are rejected by [0] and [1], and two more are draining, so 53 are "theoretically" able to run my job, if it were not for all the jobs from other users already running on those nodes. There ARE, however, single-core slots available on all 53 of those nodes. It's as if the analysis compares RequestCpus against TotalCpus but then does not compare RequestCpus against Cpus for each actual slot (we have partitionable slots). It is also strange that Cpus is not mentioned anywhere in the analysis.
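
To make the suspicion concrete, here is a toy Python model of the two comparisons. It is only a sketch of the arithmetic above, not HTCondor's actual analysis code. The Slot class, the node-NNN.example host names, and the assumption that every node has exactly one free core are made up for illustration; only the counts (57, 2 excluded, 2 draining) and the two excluded host names come from our pool.

```python
from dataclasses import dataclass

@dataclass
class Slot:
    machine: str
    total_cpus: int         # stands in for TotalCpus: cores on the whole machine
    cpus: int               # stands in for Cpus: cores still free in the partitionable slot
    draining: bool = False

REQUEST_CPUS = 64
EXCLUDED = ["wn-pijl-002.nikhef.nl", "wn-lot-001.nikhef.nl"]

def build_pool():
    # 57 machines with 64 cores: the 2 excluded ones plus 55 hypothetical hosts,
    # of which the first 2 are draining. One free core each is an assumption.
    slots = [Slot(name, 64, 1) for name in EXCLUDED]
    slots += [Slot(f"node-{i:03d}.example", 64, 1, draining=(i < 2))
              for i in range(55)]
    return slots

def job_requirements(slot):
    # The job's Requirements expression: Machine != either excluded host.
    return slot.machine not in EXCLUDED

pool = build_pool()
candidates = [s for s in pool if job_requirements(s) and not s.draining]

# Comparison the analysis seems to make: RequestCpus vs TotalCpus.
by_total = sum(REQUEST_CPUS <= s.total_cpus for s in candidates)
# Comparison I expected as well: RequestCpus vs Cpus currently free.
by_free = sum(REQUEST_CPUS <= s.cpus for s in candidates)

print(f"match on TotalCpus: {by_total}, match on free Cpus: {by_free}")
```

Under these assumptions the first count reproduces the 53 from the summary, while the second is zero, which is what the negotiator's "no match found" would suggest.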

Is this a feature, a bug, or a misconfiguration on our part?

JT