Re: [HTCondor-users] Negative kflops benchmarks (-2147483648) in HTCondor 25.0 LTS
- Date: Tue, 13 Jan 2026 16:58:04 +0100
- From: Stefan Mohn <mohn@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Negative kflops benchmarks (-2147483648) in HTCondor 25.0 LTS
Hi Greg,
On Mon, Jan 12, 2026 at 05:08:24PM -0600, Greg Thain via HTCondor-users wrote:
> On 1/9/26 12:48, Stefan Mohn via HTCondor-users wrote:
> > Hello,
> >
> > I would like to report an issue with our institute's HTCondor pool.
> >
> > Some nodes are not receiving any jobs and report unexpectedly negative benchmark values, with kflops = -2147483648. Multiple restarts of the condor service on the ten affected nodes only resolved the issue on 20 % of them. As a workaround, I generated artificial load on the nodes during the benchmarking process, which temporarily resolved the issue. However, the problem reappears whenever the condor service is restarted without artificial load present.
>
> This looks like a different (though perhaps related?) problem than the
> earlier one. When the startd computes the kflops, it just runs the
> condor_kflops program. Does the problem ever happen on a standalone run of
> "condor_kflops" ?
Yes, condor_kflops does sometimes report -2147483648 when run
standalone.
I had not seen this before, but running condor_kflops in a loop gave me
the results below.
I hope it's not too much to post them here:
root@boa:~# i=1; while true; do echo $i: `/usr/libexec/condor/condor_kflops | grep KFlops` `date` ; i=$(($i+1)); done
1: KFlops = -2147483648 Tue Jan 13 16:18:40 CET 2026
2: KFlops = 2492250 Tue Jan 13 16:19:06 CET 2026
3: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
4: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
5: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
6: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
7: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
8: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
9: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
10: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
11: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
12: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
13: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
14: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
15: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
16: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
17: KFlops = -2147483648 Tue Jan 13 16:19:06 CET 2026
18: KFlops = -2147483648 Tue Jan 13 16:19:07 CET 2026
19: KFlops = -2147483648 Tue Jan 13 16:19:07 CET 2026
20: KFlops = -2147483648 Tue Jan 13 16:19:07 CET 2026
21: KFlops = 2641825 Tue Jan 13 16:19:31 CET 2026
22: KFlops = -2147483648 Tue Jan 13 16:19:31 CET 2026
23: KFlops = 2623610 Tue Jan 13 16:19:56 CET 2026
24: KFlops = 2636171 Tue Jan 13 16:20:21 CET 2026
25: KFlops = 2632379 Tue Jan 13 16:20:47 CET 2026
26: KFlops = 2638033 Tue Jan 13 16:21:12 CET 2026
27: KFlops = 2632940 Tue Jan 13 16:21:37 CET 2026
28: KFlops = 2631667 Tue Jan 13 16:22:02 CET 2026
29: KFlops = 2635359 Tue Jan 13 16:22:27 CET 2026
30: KFlops = 2633526 Tue Jan 13 16:22:52 CET 2026
31: KFlops = 2633656 Tue Jan 13 16:23:18 CET 2026
32: KFlops = 2631147 Tue Jan 13 16:23:43 CET 2026
33: KFlops = 2634787 Tue Jan 13 16:24:08 CET 2026
34: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
35: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
36: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
37: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
38: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
39: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
40: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
41: KFlops = -2147483648 Tue Jan 13 16:24:08 CET 2026
42: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
43: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
44: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
45: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
46: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
47: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
48: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
49: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
50: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
51: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
52: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
53: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
54: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
55: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
56: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
57: KFlops = -2147483648 Tue Jan 13 16:24:09 CET 2026
58: KFlops = 2636001 Tue Jan 13 16:24:34 CET 2026
59: KFlops = -2147483648 Tue Jan 13 16:24:34 CET 2026
60: KFlops = -2147483648 Tue Jan 13 16:24:34 CET 2026
61: KFlops = 2628210 Tue Jan 13 16:24:59 CET 2026
62: KFlops = 2633669 Tue Jan 13 16:25:24 CET 2026
63: KFlops = 2633985 Tue Jan 13 16:25:49 CET 2026
64: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
65: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
66: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
67: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
68: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
69: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
70: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
71: KFlops = -2147483648 Tue Jan 13 16:25:49 CET 2026
72: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
73: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
74: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
75: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
76: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
77: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
78: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
79: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
80: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
81: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
82: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
83: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
84: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
85: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
86: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
87: KFlops = -2147483648 Tue Jan 13 16:25:50 CET 2026
88: KFlops = 2427306 Tue Jan 13 16:26:16 CET 2026
89: KFlops = 2633159 Tue Jan 13 16:26:41 CET 2026
90: KFlops = -2147483648 Tue Jan 13 16:26:41 CET 2026
91: KFlops = -2147483648 Tue Jan 13 16:26:41 CET 2026
92: KFlops = 2636277 Tue Jan 13 16:27:06 CET 2026
93: KFlops = 2635542 Tue Jan 13 16:27:31 CET 2026
94: KFlops = 2638784 Tue Jan 13 16:27:56 CET 2026
95: KFlops = 2638253 Tue Jan 13 16:28:21 CET 2026
96: KFlops = 2636385 Tue Jan 13 16:28:46 CET 2026
97: KFlops = 2635581 Tue Jan 13 16:29:11 CET 2026
98: KFlops = 2640928 Tue Jan 13 16:29:36 CET 2026
99: KFlops = 2636307 Tue Jan 13 16:30:01 CET 2026
100: KFlops = 2636285 Tue Jan 13 16:30:26 CET 2026
101: KFlops = -2147483648 Tue Jan 13 16:30:26 CET 2026
As you can see, condor_kflops gives me mixed results. A normal run with
a reasonable result takes around 25 seconds. However, when it shows
-2147483648, the result comes back immediately.
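To make the link between run time and result explicit, the loop above can be extended to record how long each run takes. The following is only a sketch: the helper names (classify_kflops, run_once) and the KFLOPS_BIN variable are made up for illustration and are not part of HTCondor, and it assumes the value is the last field on the "KFlops" line, as in the output above. Note that -2147483648 is the smallest signed 32-bit integer, which suggests an overflow or invalid conversion rather than a real measurement, although that is a guess and not something confirmed here.

```shell
#!/usr/bin/env bash
# Sketch: time each condor_kflops run and flag suspicious results.
# KFLOPS_BIN and the function names are illustrative assumptions.

KFLOPS_BIN="${KFLOPS_BIN:-/usr/libexec/condor/condor_kflops}"
INT32_MIN=-2147483648   # smallest signed 32-bit integer

# classify_kflops VALUE -> prints one of: ok, int32-min, non-positive, unparsed
classify_kflops() {
    local value="$1"
    if [[ ! "$value" =~ ^-?[0-9]+$ ]]; then
        echo "unparsed"
    elif (( value == INT32_MIN )); then
        echo "int32-min"
    elif (( value <= 0 )); then
        echo "non-positive"
    else
        echo "ok"
    fi
}

# run_once N -> runs the benchmark once and prints value, wall time, and status
run_once() {
    local n="$1" start end value
    start=$(date +%s)
    # Assumes the output contains a line of the form "KFlops = <number>".
    value=$("$KFLOPS_BIN" | awk '/KFlops/ {v = $NF} END {print v}')
    end=$(date +%s)
    printf '%d: KFlops = %s elapsed=%ds status=%s\n' \
        "$n" "$value" "$((end - start))" "$(classify_kflops "$value")"
}

# Run the loop only when executed directly with a count, e.g. ./kflops_probe.sh 20
if [[ "${BASH_SOURCE[0]}" == "$0" && $# -gt 0 ]]; then
    for ((i = 1; i <= $1; i++)); do
        run_once "$i"
    done
fi
```

Saved as, say, kflops_probe.sh (a hypothetical name), running it with a count of 20 would print one line per run. If the pattern above holds, every "int32-min" line should show an elapsed time of 0 or 1 seconds and every "ok" line around 25 seconds.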
> For an immediate workaround, you can hardcode the kflops
> value in the startd config file.
Thank you very much, I will consider it.
Please let me know if I can provide additional information about this
issue.
All the best
Stefan
--
Stefan Mohn
Technische Universität Berlin
Institut für Physik und Astronomie (IFPA)
Fachgruppe Theoretische Physik, EW 7-1
Hardenbergstr. 36
10623 Berlin
Germany
Raum: EW 706
Tel.: +49 30 314 77337
GPG: 0x4124228FF9450400