
Re: [HTCondor-users] Problems with cpu accounting



Hi Jeff,

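It appears that our documentation for the CPU_UTIL print format is misleading. Digging into the code, CPU_UTIL is a percentage: the code computes (RemoteUserCpu / CommittedTime) * 100 and clamps the result to a minimum of 0 and a maximum of 100. For a multi-core job whose ratio is above 1, such as your 3.827, the printed value is therefore pinned at 100.0.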
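
As a rough illustration of that clamping (a minimal Python sketch of the arithmetic described above, not the actual HTCondor source; the function name cpu_util_percent and the zero-time guard are assumptions made for this example), here are the numbers from the job in question:

```python
def cpu_util_percent(remote_user_cpu: float, committed_time: float) -> float:
    """Clamped percentage as described above; a sketch, not HTCondor's code."""
    if committed_time <= 0:
        # Assumption for this sketch: report 0 rather than divide by zero.
        return 0.0
    percent = (remote_user_cpu / committed_time) * 100.0
    # Clamp to the range [0, 100].
    return max(0.0, min(100.0, percent))


# Values taken from the condor_history output quoted below.
remote_user_cpu = 2 * 3600 + 12 * 60 + 51  # RemUsCpu 2:12:51 -> 7971 seconds
committed_time = 34 * 60 + 43              # CommTime 34:43   -> 2083 seconds

print(round(remote_user_cpu / committed_time, 3))         # unclamped ratio
print(cpu_util_percent(remote_user_cpu, committed_time))  # clamped percentage
```

The unclamped ratio comes out at about 3.827, while the clamped percentage is 100.0, which matches what the CPutil column shows.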

Cheers,
Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jeff Templon <templon@xxxxxxxxx>
Sent: Friday, January 9, 2026 7:52 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Problems with cpu accounting
 
Hi,

We have three measures of CPU usage, and each gives a different result:

CPU_UTIL from the print format -> 100%
CpusUsage as a job attribute -> 3.169
RemoteUserCpu / CommittedTime -> 3.827 

See the command output under the quoted message excerpt below.

RemoteUserCpu / CommittedTime is what the documentation claims will be printed by CPU_UTIL.  It’s not.  3.827 is what I’d qualitatively expect from this workload.

What is going on????

JT


On 9 Jan 2026, at 13:47, Emily Kooistra <a66@xxxxxxxxx> wrote:

So that CPutil is,

RemoteUserCpu /  CommittedTime

So I guess print both and see?


Here:

└> condor_history  -completedsince $(date -d "64 minutes ago" +"%s") -print-format /user/templon/yafu_htcondor/cputests.cpf -wide:164 -constraint 'Owner=="templon"'
JOB_ID      Username Class    CMD               Finished     Started CPUS CPuse   RemUsCpu  CommTime CPutil    MEMREQ    MEM     ST   WALL_TIME NStrt WorkerNode
3823143.0   templon   long yafu-b637.condor b  1/9  14:27  1/9  13:52   4 3.169    2:12:51 34:43      100.0 128.0 GB    366.2 MB  C       34:43     1 wn-pijl-005

So CPutil lies, because RemUsCpu / CommTime = 3.827 while CPutil is 100% and CPuse (CpusUsage) is 3.169.
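
For reference, the 3.827 figure can be reproduced from the two interval strings in the output row. This is a minimal Python sketch; the helper interval_to_seconds is hypothetical, and it assumes the interval() output has the form [days+][hh:]mm:ss.

```python
def interval_to_seconds(text: str) -> int:
    """Convert an interval string such as '2:12:51' or '1+03:00:00' to seconds.

    Assumes the form [days+][hh:]mm:ss; this is a sketch, not an HTCondor API.
    """
    days = 0
    if "+" in text:
        day_part, text = text.split("+", 1)
        days = int(day_part)
    seconds = 0
    for field in text.split(":"):
        # Each field to the right is worth 1/60 of the one before it.
        seconds = seconds * 60 + int(field)
    return days * 86400 + seconds


rem_us_cpu = interval_to_seconds("2:12:51")  # RemUsCpu column
comm_time = interval_to_seconds("34:43")     # CommTime column

print(rem_us_cpu, comm_time)
print(round(rem_us_cpu / comm_time, 3))
```

This gives 7971 and 2083 seconds, and a ratio of roughly 3.827.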

JT

Note: the cpf file is

$ cat cputests.cpf
SELECT NOSUMMARY
   ClusterId                      AS JOB_ID PRINTAS JOB_ID WIDTH -11
   Owner                          AS 'Username'
   JobCategory                    AS Class WIDTH 5
   join(" ",split(Cmd,"/")[size(split(Cmd,"/"))-1], Args)  AS '   CMD' WIDTH -18
   CompletionDate                 AS '  Finished ' PRINTAS DATE
   JobCurrentStartDate            AS '   Started' PRINTAS QDATE
   CpusProvisioned                AS 'CPUS'  WIDTH 3
   CpusUsage                      AS 'CPuse' WIDTH 5
   interval(RemoteUserCpu)        AS  "  RemUsCpu"    WIDTH 10
   interval(CommittedTime)        AS  " CommTime"    WIDTH 10
   Dummy                          AS 'CPutil' PRINTAS CPU_UTIL
   MemoryProvisioned              AS '   MEMREQ' PRINTAS READABLE_MB
#   ResidentSetSize                AS '   RAM' PRINTAS READABLE_KB  WIDTH 8
   ImageSize                      AS '   MEM' PRINTAS READABLE_KB  WIDTH 10
   JobStatus                      AS "ST"        PRINTAS JOB_STATUS WIDTH 3
    interval(RemoteWallClockTime)  AS  " WALL_TIME"   WIDTH 10
   JobRunCount                    AS 'NStrt' WIDTH 5
   split(splitSlotName(LastRemoteHost)[1], ".")[0]  AS  "WorkerNode"