
[HTCondor-users] jobs stay in queue forever with 'termination pending'



Hi,

our schedds are getting clogged up with jobs that reach job state 4 (Completed) but never leave the queue; they stay at 'TerminationPending == true':

[root@mysched21 ~]# condor_q -constraint 'TerminationPending == true'

-- Schedd: mysched21.desy.de : <123.456.789:23521?... @ 10/11/24 08:52:52
OWNER    BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
myuser ID: 696613    7/23 15:52      _      _      _      1 696613.0
myuser ID: 696614    7/23 15:52      _      _      _      1 696614.0
<snip>

Total for query: 2874 jobs; 2874 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended 

As you can see these are aging quite well :( 
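
To put a number on the age, something like the following could be used. This is only a rough sketch: it assumes the jobs carry the standard EnteredCurrentStatus attribute (epoch seconds), and job_age_days is just a hypothetical helper name, not anything shipped with HTCondor.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of "<ClusterId> <ProcId> <EnteredCurrentStatus>"
# on stdin and prints "<ClusterId>.<ProcId> <whole days in current state>".
# $1 is the reference time in epoch seconds (defaults to now).
# Lines whose timestamp is not numeric (e.g. "undefined") are skipped.
job_age_days() {
    now="${1:-$(date +%s)}"
    awk -v now="$now" '
        NF == 3 && $3 ~ /^[0-9]+$/ {
            printf "%s.%s %d\n", $1, $2, int((now - $3) / 86400)
        }'
}

# Only query the schedd if condor_q is actually available on this host.
if command -v condor_q >/dev/null 2>&1; then
    condor_q -constraint 'TerminationPending == true' \
        -af ClusterId ProcId EnteredCurrentStatus | job_age_days
fi
```

Piping the result through sort -k2 -n would show the oldest stuck jobs last.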

What could prevent these jobs from being finalized, and why does SYSTEM_PERIODIC_REMOVE not at least remove them?

Maybe it is related to cgroups v2, and some cleanup there did not work as expected?

best
christoph

-- 
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx