Mailing List Archives
Re: [HTCondor-users] Massive overinflation of RemoteUserCpu & RemoteSysCpu
- Date: Tue, 17 Sep 2024 14:51:38 +0200
- From: Petr Vokac <petr.vokac@xxxxxxx>
- Subject: Re: [HTCondor-users] Massive overinflation of RemoteUserCpu & RemoteSysCpu
On 9/17/24 14:29, Thomas Hartmann wrote:
> I had observed some PIDs from previous jobs in uninterruptible sleep
> staying in a job cgroup that got "re-used" for a new follow-up job.
> I.e., a slot's cgroup PID list contains PID(s) from a previous
> job that had run on the same local slot/cgroup.
> In the end, I assume that these PIDs are no issue, as they are
> practically dead and will never wake from their sleep...
Yes, old PIDs left over from previous jobs don't use any resources,
because they are stuck in uninterruptible sleep, but "cpu.stat" in an
existing cgroup v2 is not cleared for a new job, so systime and usertime
for jobs running in an existing subgroup include the time from all
previous jobs... so the CPU accounting is completely wrong.
Petr
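
To illustrate the effect described above, here is a minimal sketch in Python. It is not HTCondor's implementation. It only assumes that "cpu.stat" in cgroup v2 reports cumulative user_usec and system_usec counters, and that a snapshot of the file could be taken at job start. The helper names and the sample numbers are hypothetical.

```python
def parse_cpu_stat(text):
    """Parse the text of a cgroup v2 cpu.stat file into integer counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cpu_delta(baseline, current):
    """CPU seconds accrued between two snapshots of the same cgroup."""
    return {
        key: (current[key] - baseline.get(key, 0)) / 1_000_000
        for key in ("user_usec", "system_usec")
    }


# Hypothetical snapshots of a reused cgroup: the counters already hold
# time from earlier jobs when the new job starts.
at_start = parse_cpu_stat(
    "usage_usec 9000000\nuser_usec 7000000\nsystem_usec 2000000\n"
)
at_exit = parse_cpu_stat(
    "usage_usec 9500000\nuser_usec 7400000\nsystem_usec 2100000\n"
)

# Reading the counters only at job exit charges the new job for old time.
naive = {k: at_exit[k] / 1_000_000 for k in ("user_usec", "system_usec")}

# Subtracting the start-of-job snapshot leaves only the new job's share.
corrected = cpu_delta(at_start, at_exit)

print("naive:", naive)
print("corrected:", corrected)
```

With these made-up numbers the naive reading charges the job 7.4 s user and 2.1 s system time, while the baseline-subtracted reading gives 0.4 s and 0.1 s. Whether a baseline snapshot or a fresh cgroup per job is the better fix is a separate question.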