
Re: [HTCondor-users] Problems with cpu accounting



Hi Cole,

It's a bug, I just can't figure out which kind :-)

Bug 1) It is literally doing what the doc says, so if an 8-core job keeps two cores busy and leaves the other six idle, 100% is reported, because the actual value is 200% (of a possible 800%) and it is capped at 100%.
Bug 2) You left a CPU normalisation out of your explanation, so that same 8-core job with two busy cores would be reported as 25%. In that case the bug is that my 4-core job should be reported as 95.7% (3.827 / 4), but 100% is reported.
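
To make the two readings concrete, here is a minimal Python sketch (not HTCondor's actual code). The function names and the guard against a zero CommittedTime are my own assumptions. The clamped formula follows Cole's description, and the per-CPU normalisation is the hypothetical variant from Bug 2. The inputs are the values of job 3823143.0 from the quoted output below.

```python
def cpu_util_clamped(remote_user_cpu_s, committed_time_s):
    """(RemoteUserCpu / CommittedTime) * 100, clamped to [0, 100] (Bug 1 reading)."""
    if committed_time_s <= 0:  # assumed guard, not taken from HTCondor
        return 0.0
    pct = remote_user_cpu_s / committed_time_s * 100.0
    return max(0.0, min(100.0, pct))


def cpu_util_normalised(remote_user_cpu_s, committed_time_s, cpus):
    """Same ratio, additionally divided by the core count (Bug 2 reading)."""
    if committed_time_s <= 0 or cpus <= 0:  # assumed guard
        return 0.0
    pct = remote_user_cpu_s / (committed_time_s * cpus) * 100.0
    return max(0.0, min(100.0, pct))


# Values from the job in the thread.
remote_user_cpu = 2 * 3600 + 12 * 60 + 51   # RemUsCpu 2:12:51 -> 7971 s
committed_time = 34 * 60 + 43               # CommTime 34:43   -> 2083 s
cpus = 4                                    # CpusProvisioned

ratio = remote_user_cpu / committed_time
print(f"{ratio:.3f}")                                                      # 3.827
print(f"{cpu_util_clamped(remote_user_cpu, committed_time):.1f}")          # 100.0
print(f"{cpu_util_normalised(remote_user_cpu, committed_time, cpus):.1f}") # 95.7
```

For the 8-core example with two busy cores (ratio 2.0), the first function gives 100.0 and the second gives 25.0.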

Which of the two bugs is it, and what is the perspective for a fix?

JT


> On 9 Jan 2026, at 17:34, Cole Bollig via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
> 
> Hi Jeff,
> 
> It appears that our documentation for the CPU_UTIL print format is misleading. Digging into the code, CPU_UTIL is a percentage, so the code is doing (RemoteUserCpu / CommittedTime) * 100 with a max of 100 and a min of 0.
> 
> Cheers,
> Cole Bollig
> 
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jeff Templon <templon@xxxxxxxxx>
> Sent: Friday, January 9, 2026 7:52 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Problems with cpu accounting
> 
> Hi,
> 
> We have three measures of cpu usage, all three giving a different result:
> 
> CPU_UTIL from the print format -> 100%
> CpusUsage as a job attribute -> 3.169
> RemoteUserCpu / CommittedTime -> 3.827 
> 
> See the command output under the quoted message excerpt below.
> 
> RemoteUserCpu / CommittedTime is what the documentation claims will be printed by CPU_UTIL.  It's not.  3.827 is what I'd qualitatively expect from this workload.
> 
> What is going on????
> 
> JT
> 
> 
>> On 9 Jan 2026, at 13:47, Emily Kooistra <a66@xxxxxxxxx> wrote:
>> 
>> So that CPutil is,
>> 
>> RemoteUserCpu /  CommittedTime
>> 
>> So I guess print both and see?
>> 
> 
> Here:
> 
> $ condor_history  -completedsince $(date -d "64 minutes ago" +"%s") -print-format /user/templon/yafu_htcondor/cputests.cpf -wide:164 -constraint 'Owner=="templon"'
> JOB_ID      Username Class    CMD               Finished     Started CPUS CPuse   RemUsCpu  CommTime CPutil    MEMREQ    MEM     ST   WALL_TIME NStrt WorkerNode
> 3823143.0   templon   long yafu-b637.condor b  1/9  14:27  1/9  13:52   4 3.169    2:12:51 34:43      100.0 128.0 GB    366.2 MB  C       34:43     1 wn-pijl-005
> 
> So CPutil lies, because RemUsCpu / CommTime = 3.827 while CPutil is 100% and CPuse (CpusUsage) is 3.169
> 
> JT
> 
> Note: the cpf file is
> 
> $ cat cputests.cpf
> SELECT NOSUMMARY
>    ClusterId                      AS JOB_ID PRINTAS JOB_ID WIDTH -11
>    Owner                          AS 'Username'
>    JobCategory                    AS Class WIDTH 5
>    join(" ",split(Cmd,"/")[size(split(Cmd,"/"))-1], Args)  AS '   CMD' WIDTH -18
>    CompletionDate                 AS '  Finished ' PRINTAS DATE
>    JobCurrentStartDate            AS '   Started' PRINTAS QDATE
>    CpusProvisioned                AS 'CPUS'  WIDTH 3
>    CpusUsage                      AS 'CPuse' WIDTH 5
>    interval(RemoteUserCpu)        AS  "  RemUsCpu"    WIDTH 10
>    interval(CommittedTime)        AS  " CommTime"    WIDTH 10
>    Dummy                          AS 'CPutil' PRINTAS CPU_UTIL
>    MemoryProvisioned              AS '   MEMREQ' PRINTAS READABLE_MB
> #   ResidentSetSize                AS '   RAM' PRINTAS READABLE_KB  WIDTH 8
>    ImageSize                      AS '   MEM' PRINTAS READABLE_KB  WIDTH 10
>    JobStatus                      AS "ST"        PRINTAS JOB_STATUS WIDTH 3
>     interval(RemoteWallClockTime)  AS  " WALL_TIME"   WIDTH 10
>    JobRunCount                    AS 'NStrt' WIDTH 5
>    split(splitSlotName(LastRemoteHost)[1], ".")[0]  AS  "WorkerNode"
> 
> 