
Re: [HTCondor-users] Weirdness with cpususage



Hi Jeff, Jaime,

While it doesn't really answer anything, I can confirm that we too are seeing that CpusUsage *in the history* is often completely bogus. There's an old htcondor-users thread [0] about that, but without any conclusive resolution.

Would be interesting if someone can figure out what's going on.

Cheers
Max

[0] https://www-auth.cs.wisc.edu/lists/htcondor-users/2024-March/msg00034.shtml

On 30. Oct 2024, at 10:17, Jeff Templon <templon@xxxxxxxxx> wrote:

Thanks Jaime,

Your explanation is helpful, but it doesn't explain the effect AFAICT, do you agree?  See here:

> condor_history  -completedsince $(date -d "4 days ago" +"%s") -print-format ~templon/done-cpususage.cpf -wide:148 | head
JOB_ID    Username    CMD               Finished   CPUS  CPuse  MEMREQ    RAM      MEM     ST   CPU_TIME    LWALL_TIME    WALL_TIME  WorkerNode
715700.0 roystege    pineko theory -c / 10/30 10:00  32 3.612 264.0 GB   7.2 GB     7.2 GB C       1:33:59         4:25         4:25 wn-knek-014
715686.0 roystege    pineko theory -c / 10/30 09:59  32 31.38 264.0 GB   9.8 MB     9.8 MB C             0      1:44:41      1:44:41 wn-lot-064
715557.0 roystege    pineko theory -c / 10/30 09:59  32 0.280 264.0 GB   9.5 GB     9.5 GB C    9+02:42:51     11:13:46     11:13:46 wn-knek-016
715699.0 roystege    pineko theory -c / 10/30 09:55  32   0.0 264.0 GB   3.4 MB   195.3 MB C             6            4            4 wn-knek-014

LWALL_TIME and WALL_TIME - the "L" is for "Last", as you suggest.  They are identical in these cases.  See the second line above: "CPuse" is 31.38, and according to you it's "computed from the cpu usage and wall clock time of just the last execution of the job (no file transfer time included)".  The CPU used is "0" according to the output.  On the next line, the CPU_TIME is about 18 times the wall time, but CPuse is only 0.280.  Do you see how both of these are possible without something being wrong?
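
The comparison above can be redone with a few lines of Python. This is only a sketch: `parse_interval` and `implied_usage` are hypothetical helpers, the interval strings are read as `days+hh:mm:ss` with leading zero fields omitted, and dividing CPU_TIME by LWALL_TIME is an assumed reading of Jaime's description, not a confirmed CpusUsage formula.

```python
def parse_interval(text):
    """Convert '9+02:42:51', '1:33:59', '4:25' or '4' to seconds.

    Assumes the days+hh:mm:ss layout with leading zero fields dropped.
    """
    days = 0
    if "+" in text:
        day_part, text = text.split("+", 1)
        days = int(day_part)
    seconds = 0
    for field in text.split(":"):
        seconds = seconds * 60 + int(field)
    return days * 86400 + seconds


def implied_usage(cpu_time, wall_time):
    """CPU seconds per wall clock second (assumed definition)."""
    return parse_interval(cpu_time) / parse_interval(wall_time)


# (JOB_ID, CPuse, CPU_TIME, LWALL_TIME) copied from the listing above
rows = [
    ("715700.0", 3.612, "1:33:59", "4:25"),
    ("715686.0", 31.38, "0", "1:44:41"),
    ("715557.0", 0.280, "9+02:42:51", "11:13:46"),
]

for job_id, reported, cpu, wall in rows:
    print(f"{job_id}: reported {reported}, implied {implied_usage(cpu, wall):.2f}")
```

Under these assumptions the three rows come out at roughly 21.3, 0 and 19.5, none of which is close to the reported CPuse. Note that CPU_TIME here is RemoteUserCpu only, so system CPU time is not part of the numerator.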

Thanks,

JT

PS: see the CPF file below; it's slightly different from the one in the previous mail.

SELECT NOSUMMARY
   ClusterId                      AS  JOB_ID      PRINTAS JOB_ID
   Owner                          AS '   Username'
   join(" ",split(Cmd,"/")[size(split(Cmd,"/"))-1], Args)  AS '   CMD' WIDTH -18
   CompletionDate                 AS '  Finished ' PRINTAS DATE
   CpusProvisioned                AS ' CPUS'  WIDTH 3
   CpusUsage                      AS ' CPuse' WIDTH 5
   MemoryProvisioned              AS ' MEMREQ' PRINTAS READABLE_MB  
   ResidentSetSize                AS '   RAM' PRINTAS READABLE_KB  WIDTH 8
   ImageSize                      AS '   MEM' PRINTAS READABLE_KB  WIDTH 10
   JobStatus                      AS "ST"        PRINTAS JOB_STATUS
   interval(RemoteUserCpu)        AS  "  CPU_TIME"    WIDTH 12
   interval(LastRemoteWallClockTime)  AS  " LWALL_TIME"   WIDTH 12
   interval(RemoteWallClockTime)  AS  "  WALL_TIME"   WIDTH 12
   split(splitSlotName(LastRemoteHost)[1], ".")[0]  AS  "WorkerNode" WIDTH -12


On 25 Oct 2024, at 18:08, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

There are a few confounding factors with using these job attributes. RemoteWallClockTime records the time across all execution attempts for the job, whereas RemoteUserCpu records the time from just the last execution attempt. To get the wall clock time for just the last execution, you should use LastRemoteWallClockTime. Also, RemoteWallClockTime and LastRemoteWallClockTime include time spent doing file transfer.
CpusUsage is computed from the cpu usage and wall clock time of just the last execution of the job (no file transfer time included).

 - Jaime
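
To make the distinction between these attributes concrete, here is a minimal Python sketch for a job with two execution attempts. The `Attempt` class, the `summarize` function and all the numbers are made up for illustration; only the attribute names and which attempts they cover come from the explanation above, and the three ratios are assumed ways of estimating usage, not HTCondor's actual computation.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    wall_seconds: int      # wall clock time of this attempt, file transfer included
    transfer_seconds: int  # part of wall_seconds spent on file transfer
    cpu_seconds: int       # user CPU time consumed during this attempt


def summarize(attempts):
    """Derive the job-level values described above from per-attempt data."""
    last = attempts[-1]
    remote_wall = sum(a.wall_seconds for a in attempts)  # all attempts
    last_wall = last.wall_seconds                        # last attempt only
    cpu = last.cpu_seconds                               # last attempt only
    exec_wall = last.wall_seconds - last.transfer_seconds
    return {
        "RemoteWallClockTime": remote_wall,
        "LastRemoteWallClockTime": last_wall,
        "RemoteUserCpu": cpu,
        "cpu_over_total_wall": cpu / remote_wall,
        "cpu_over_last_wall": cpu / last_wall,
        "cpu_over_last_exec": cpu / exec_wall,
    }


# Hypothetical job: first attempt evicted after an hour, second attempt completes.
attempts = [Attempt(3600, 60, 7000), Attempt(1800, 120, 6720)]
summary = summarize(attempts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these made-up numbers, dividing by RemoteWallClockTime gives about 1.24 cores, dividing by LastRemoteWallClockTime gives about 3.73, and excluding file transfer gives 4.0, so the choice of denominator alone can change the apparent usage by a factor of three.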

