Re: [HTCondor-users] Weirdness with cpususage

Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

Use of docker and cgroups changes how the cpu usage information is collected.

- Jaime

On Oct 30, 2024, at 4:17 AM, Jeff Templon <templon@xxxxxxxxx> wrote:

Thanks Jaime,

Your explanation is helpful, but it doesn't explain the effect AFAICT. Do you agree? See here:

> condor_history -completedsince $(date -d "4 days ago" +"%s") -print-format ~templon/done-cpususage.cpf -wide:148 | head

JOB_ID    Username  CMD                 Finished     CPUS  CPuse  MEMREQ    RAM     MEM       ST  CPU_TIME    LWALL_TIME  WALL_TIME  WorkerNode
715700.0  roystege  pineko theory -c /  10/30 10:00  32    3.612  264.0 GB  7.2 GB  7.2 GB    C   1:33:59     4:25        4:25       wn-knek-014
715686.0  roystege  pineko theory -c /  10/30 09:59  32    31.38  264.0 GB  9.8 MB  9.8 MB    C   0           1:44:41     1:44:41    wn-lot-064
715557.0  roystege  pineko theory -c /  10/30 09:59  32    0.280  264.0 GB  9.5 GB  9.5 GB    C   9+02:42:51  11:13:46    11:13:46   wn-knek-016
715699.0  roystege  pineko theory -c /  10/30 09:55  32    0.0    264.0 GB  3.4 MB  195.3 MB  C   6           4           4          wn-knek-014

LWALL_TIME and WALL_TIME: the "L" is for "Last", as you suggest. They are identical in these cases. See the second line above: "CPuse" is 31.38, which according to you is "computed from the cpu usage and wall clock time of just the last execution of the job (no file transfer time included)". The CPU used is "0" according to the output. In the next line, the CPU time is about 19 times the wall time, but CPuse is only 0.280. Do you see how both of these are possible without something being wrong?
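
For anyone who wants to check the arithmetic, the ratio of CPU_TIME to LWALL_TIME can be recomputed from the four rows above. This is only a rough sketch: it assumes the interval() strings read as [days+][[hours:]minutes:]seconds with leading zero fields dropped, the helper names are made up for illustration, and CPU_TIME here is RemoteUserCpu only, so the ratio is not necessarily the same quantity HTCondor reports as CpusUsage.

```python
def interval_to_seconds(text):
    """Convert an interval string such as '9+02:42:51', '4:25' or '6' to seconds.

    Assumes the layout [days+][[hours:]minutes:]seconds.
    """
    days = 0
    if "+" in text:
        day_part, text = text.split("+", 1)
        days = int(day_part)
    seconds = 0
    for field in text.split(":"):
        # Each further field is a finer unit, so shift by 60 and add.
        seconds = seconds * 60 + int(field)
    return days * 86400 + seconds


def cpu_to_wall_ratio(cpu_text, wall_text):
    """Return CPU seconds divided by wall seconds (NaN if wall time is zero)."""
    wall = interval_to_seconds(wall_text)
    return interval_to_seconds(cpu_text) / wall if wall else float("nan")


# (JOB_ID, CPuse, CPU_TIME, LWALL_TIME) copied from the condor_history output.
rows = [
    ("715700.0", 3.612, "1:33:59", "4:25"),
    ("715686.0", 31.38, "0", "1:44:41"),
    ("715557.0", 0.280, "9+02:42:51", "11:13:46"),
    ("715699.0", 0.0, "6", "4"),
]

for job_id, cpuse, cpu_text, wall_text in rows:
    ratio = cpu_to_wall_ratio(cpu_text, wall_text)
    print(f"{job_id}  CPuse={cpuse:<6}  CPU_TIME/LWALL_TIME={ratio:.2f}")
```

Under those assumptions none of the four recomputed ratios lands close to the reported CPuse value, which is the mismatch being asked about.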

Thanks,

JT

PS: see the CPF file below; it's slightly different from the one in the previous mail.

SELECT NOSUMMARY

ClusterId AS JOB_ID PRINTAS JOB_ID

Owner AS ' Username'

join(" ",split(Cmd,"/")[size(split(Cmd,"/"))-1], Args) AS ' CMD' WIDTH -18

CompletionDate AS ' Finished ' PRINTAS DATE

CpusProvisioned AS ' CPUS' WIDTH 3

CpusUsage AS ' CPuse' WIDTH 5

MemoryProvisioned AS ' MEMREQ' PRINTAS READABLE_MB

ResidentSetSize AS ' RAM' PRINTAS READABLE_KB WIDTH 8

ImageSize AS ' MEM' PRINTAS READABLE_KB WIDTH 10

JobStatus AS "ST" PRINTAS JOB_STATUS

interval(RemoteUserCpu) AS " CPU_TIME" WIDTH 12

interval(LastRemoteWallClockTime) AS " LWALL_TIME" WIDTH 12

interval(RemoteWallClockTime) AS " WALL_TIME" WIDTH 12

split(splitSlotName(LastRemoteHost)[1], ".")[0] AS "WorkerNode" WIDTH -12

On 25 Oct 2024, at 18:08, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

There are a few confounding factors with using these job attributes. RemoteWallClockTime records the time across all execution attempts for the job, whereas RemoteUserCpu records the time from just the last execution attempt. To get the wall clock time for just the last execution, you should use LastRemoteWallClockTime. Also, RemoteWallClockTime and LastRemoteWallClockTime include time spent doing file transfer.
CpusUsage is computed from the cpu usage and wall clock time of just the last execution of the job (no file transfer time included).
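
As a rough illustration of why the choice of wall-clock attribute matters, the sketch below divides the same CPU time by each of the two wall-clock attributes. The job ad values are invented for the example (a job that ran more than once), and the helper name is hypothetical. Neither ratio is claimed to reproduce CpusUsage, since both wall-clock attributes include file transfer time.

```python
# Hypothetical job ad: all values are made up for illustration, in seconds.
job_ad = {
    "RemoteUserCpu": 3000,            # CPU time from the last execution attempt only
    "RemoteWallClockTime": 5000,      # wall time summed over all execution attempts
    "LastRemoteWallClockTime": 1000,  # wall time of the last execution attempt only
}


def cpu_per_wall(ad, wall_attr):
    """Divide RemoteUserCpu by the chosen wall-clock attribute (NaN if zero)."""
    wall = ad[wall_attr]
    return ad["RemoteUserCpu"] / wall if wall else float("nan")


# Mixing a last-attempt numerator with an all-attempts denominator
# understates the usage of a job that was restarted.
mixed = cpu_per_wall(job_ad, "RemoteWallClockTime")
matched = cpu_per_wall(job_ad, "LastRemoteWallClockTime")
print(f"mixed scopes: {mixed:.2f}, matched scopes: {matched:.2f}")
```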

- Jaime
