Hi Jeff,
I believe one could argue that this is bug 1. The internals of CPU_UTIL cap the value at 100, so an 8-core job that effectively uses 2 cores reports 100% rather than 200% (of the desired 800%). I am not sure why there is a cap at 100%, but I assume there is some historical reasoning behind it. You could manually write the CPU_UTIL ClassAd expression without the max in your custom print format file as follows:
(RemoteUserCpu/CommittedTime) * 100.0 AS 'CPutil' PRINTF '%3.2f%%'
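
As a rough cross-check of the numbers in this thread, here is a minimal Python sketch of the three candidate values for the job shown further down (RemUsCpu 2:12:51, CommTime 34:43, 4 CPUs). The function names are hypothetical, and the capped variant only mirrors the "max of 100 and min of 0" behaviour described in this thread; it is not taken from the HTCondor source.

```python
def interval_to_seconds(text):
    """Convert an [H:]MM:SS string (as shown in the RemUsCpu/CommTime columns) to seconds.

    Assumption: no day prefix such as "1+02:03:04"; that form is not handled here.
    """
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def cpu_util_raw(remote_user_cpu, committed_time):
    """Uncapped percentage: (RemoteUserCpu / CommittedTime) * 100."""
    if committed_time <= 0:
        return 0.0
    return remote_user_cpu / committed_time * 100.0


def cpu_util_capped(remote_user_cpu, committed_time):
    """Same value clamped to [0, 100], as the thread describes for CPU_UTIL."""
    return max(0.0, min(100.0, cpu_util_raw(remote_user_cpu, committed_time)))


def cpu_util_per_core(remote_user_cpu, committed_time, cpus):
    """Uncapped percentage normalised by the number of provisioned CPUs."""
    return cpu_util_raw(remote_user_cpu, committed_time) / cpus


if __name__ == "__main__":
    cpu = interval_to_seconds("2:12:51")   # RemoteUserCpu from the job output
    wall = interval_to_seconds("34:43")    # CommittedTime from the job output
    print("raw      %.2f%%" % cpu_util_raw(cpu, wall))
    print("capped   %.2f%%" % cpu_util_capped(cpu, wall))
    print("per core %.2f%%" % cpu_util_per_core(cpu, wall, 4))
```

The uncapped value corresponds to the 3.827 ratio quoted below, and dividing by the 4 provisioned CPUs gives the roughly 95.7% figure Jeff mentions.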
Cheers,
Cole Bollig
From: Jeff Templon <templon@xxxxxxxxx>
Sent: Saturday, January 10, 2026 4:59 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Cole Bollig <cabollig@xxxxxxxx>
Subject: Re: [HTCondor-users] Problems with cpu accounting

Hi Cole,

It’s a bug, I just can’t figure out which kind :-)

Bug 1) it is literally doing what the doc says, meaning that if an 8-core job uses two cores effectively and the other 6 empty, 100% is reported since the actual answer is 200% (of the desired 800%)

Bug 2) is that you left out a cpu-normalisation in your explanation, meaning that if an 8-core job uses two cores effectively and the other 6 empty, 25% will be reported; the bug is that 95.7% should be reported, but 100% is reported.

Which of the two bugs is it, and what is the perspective for a fix?

JT

> On 9 Jan 2026, at 17:34, Cole Bollig via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
>
> Hi Jeff,
>
> It appears that our documentation for CPU_UTIL print format is misleading. Digging into the code CPU_UTIL is a percentage, so the code is doing (RemoteUserCpu / CommittedTime) * 100 with a max of 100 and min of 0.
>
> Cheers,
> Cole Bollig
>
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jeff Templon <templon@xxxxxxxxx>
> Sent: Friday, January 9, 2026 7:52 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Problems with cpu accounting
>
> Hi,
>
> We have three measures of cpu usage, all three giving a different result:
>
> CPU_UTIL from the print format -> 100%
> CpusUsage as a job attribute -> 3.169
> RemoteUserCpu / CommittedTime -> 3.827
>
> See the command output under the quoted message excerpt below.
>
> RemoteUserCpu / CommittedTime is what the documentation claims will be printed by CPU_UTIL. It’s not. 3.827 is what I’d qualitatively expect from this workload.
>
> What is going on????
>
> JT
>
>
>> On 9 Jan 2026, at 13:47, Emily Kooistra <a66@xxxxxxxxx> wrote:
>>
>> So that CPutil is,
>>
>> RemoteUserCpu / CommittedTime
>>
>> So i guess print both and see?
>>
>
> Here:
>
> └> condor_history -completedsince $(date -d "64 minutes ago" +"%s") -print-format /user/templon/yafu_htcondor/cputests.cpf -wide:164 -constraint 'Owner=="templon"'
> JOB_ID     Username  Class  CMD                 Finished   Started    CPUS  CPuse  RemUsCpu  CommTime  CPutil  MEMREQ    MEM       ST  WALL_TIME  NStrt  WorkerNode
> 3823143.0  templon   long   yafu-b637.condor b  1/9 14:27  1/9 13:52  4     3.169  2:12:51   34:43     100.0   128.0 GB  366.2 MB  C   34:43      1      wn-pijl-005
>
> So CPutil lies, because RemUsCpu / CommTime = 3.827 while CPutil is 100% and CPuse (CpusUsage) is 3.169
>
> JT
>
> Note: the cpf file is
>
> $ cat cputests.cpf
> SELECT NOSUMMARY
> ClusterId AS JOB_ID PRINTAS JOB_ID WIDTH -11
> Owner AS 'Username'
> JobCategory AS Class WIDTH 5
> join(" ",split(Cmd,"/")[size(split(Cmd,"/"))-1], Args) AS ' CMD' WIDTH -18
> CompletionDate AS ' Finished ' PRINTAS DATE
> JobCurrentStartDate AS ' Started' PRINTAS QDATE
> CpusProvisioned AS 'CPUS' WIDTH 3
> CpusUsage AS 'CPuse' WIDTH 5
> interval(RemoteUserCpu) AS " RemUsCpu" WIDTH 10
> interval(CommittedTime) AS " CommTime" WIDHT 10
> Dummy AS 'CPutil' PRINTAS CPU_UTIL
> MemoryProvisioned AS ' MEMREQ' PRINTAS READABLE_MB
> # ResidentSetSize AS ' RAM' PRINTAS READABLE_KB WIDTH 8
> ImageSize AS ' MEM' PRINTAS READABLE_KB WIDTH 10
> JobStatus AS "ST" PRINTAS JOB_STATUS WIDTH 3
> interval(RemoteWallClockTime) AS " WALL_TIME" WIDTH 10
> JobRunCount AS 'NStrt' WIDTH 5
> split(splitSlotName(LastRemoteHost)[1], ".")[0] AS "WorkerNode"
>
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
>
> The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/