Re: [HTCondor-users] Negative kflops benchmarks (-2147483648) in HTCondor 25.0 LTS
- Date: Mon, 12 Jan 2026 09:41:53 +0100
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
On Fri, 2026-01-09 at 19:48:53 +0100, HTCondor Users Mailinglist wrote:
> Hello,
>
> I would like to report an issue with our institute's HTCondor pool.
>
> Some nodes are not receiving any jobs and report unexpectedly negative benchmark values, with kflops = -2147483648.
Hi Stefan,
although I can't find the exact reference, I'm quite sure that this has
indeed been discussed, and labelled a bug, a couple of weeks ago.
What I could find is HTCONDOR-3288 in Jira (dating back to Sep/Oct '25), which is
supposed to replace a "-1" with "MAX_INT". Your output seems to confirm that, but
did they get the sign wrong?
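For reference, -2147483648 is the smallest value of a 32-bit signed integer, not the largest one (2147483647). A minimal Python sketch of that arithmetic, assuming a 32-bit two's-complement int; the helper name wrap_int32 is made up for illustration, and this says nothing about what the HTCondor code actually does internally:

```python
import ctypes

INT32_MAX = 2**31 - 1   # largest 32-bit signed value
INT32_MIN = -2**31      # smallest 32-bit signed value

def wrap_int32(value: int) -> int:
    """Reduce an integer to the range of a 32-bit two's-complement signed int."""
    # ctypes performs no overflow checking, so out-of-range values wrap around.
    return ctypes.c_int32(value).value

print(INT32_MAX)                  # 2147483647
print(INT32_MIN)                  # -2147483648, the value reported as kflops
print(wrap_int32(INT32_MAX + 1))  # -2147483648: one past the maximum wraps to the minimum
```

So a value that ends up just beyond the 32-bit maximum would show up as exactly the number in your output, which would at least be consistent with a sign or overflow problem.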
> benchmarks_joblist = mips kflops
> benchmarks_kflops_executable = $(LIBEXEC)/condor_kflops
> benchmarks_mips_executable = $(LIBEXEC)/condor_mips
>
> When running condor_kflops directly as root, it produces a reasonable value and doesn't show negative results:
>
> root@cymothex:~# /usr/libexec/condor/condor_kflops
> KFlops = 2608096
>
> We are currently running HTCondor version 25.0.3. Upgrading to 25.0.5 or downgrading to 25.0.2 does not resolve the issue. However, with version 25.0.2 the kflops value changes to -1 instead.
>
> This behavior reminds me of the issue reported by John Veitch in September:
> [HTCondor-users] condor_kflops returning -1
> https://www-auth.cs.wisc.edu/lists/htcondor-users/2025-September/msg00063.shtml
>
> Below is some condor output:
>
> root@cymothex:~# condor_status -constraint "kflops < 0" -af:h name kflops mips condorversion
> name kflops mips condorversion
> slot1@cordycep -1 54597 $CondorVersion: 25.0.2 2025-10-08 BuildID: 840620 PackageID: 25.0.2-1+deb12 GitSHA: 24fb2387 $
> slot1@bancroft -2147483648 54153 $CondorVersion: 25.0.3 2025-10-31 BuildID: 847298 PackageID: 25.0.3-1+deb12 GitSHA: dc94bfbb $
> slot1@cymothex -2147483648 54565 $CondorVersion: 25.0.5 2025-12-12 BuildID: 856732 PackageID: 25.0.5-1+deb12 GitSHA: 5493979a $
Both values are nonsensical, obviously. I think there was a discussion (likely
around Sep 22, which is the date the issue was created) on how to work around
the issue, and the workaround would still apply...
> I'm not so sure why this problem now pops up. We have similar nodes that don't show this problem. Some of the affected nodes may be better cooled due to the low temperatures outside. Do you have any hints or advice regarding this issue with condor_kflops?
Maybe throttling, or the lack thereof, has an effect on benchmark results?
Best,
S
PS.: It turns out I was looking for the "new" kflops value, not the old one...