
Re: [HTCondor-users] Jobs Stuck Due to RequestDisk = undefined



Dear Geonmo,

Thanks for your reply, but all my WNs are X86_64:

[root@ce04 ~]# condor_status -format "%s\n" Arch|more
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64
X86_64


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Geonmo <geonmo@xxxxxxxxxxx>
Date: Tuesday, 15 April 2025 at 01:15
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Jobs Stuck Due to RequestDisk = undefined

Hello, Eraldo.


I think we already have an answer to your question.

```[0]           0  TARGET.Arch == "X86_64"```


I see that you don't have any servers with x86_64 architecture, so I'm wondering if your WNs are using ARM CPUs.


You might need to check that.


Regards,


-- Geonmo


------ Original Message ------

From: Eraldo Jr <eusoueraldo@xxxxxxxxx>

To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>

Date: 2025-04-15 (Tue) 10:59:21

Subject: [HTCondor-users] Jobs Stuck Due to RequestDisk = undefined





Dear all,

First of all:
$CondorVersion: 24.4.0 2025-02-02 BuildID: 784192 PackageID: 24.4.0-1 GitSHA: 6f17b75e $
$CondorPlatform: x86_64_AlmaLinux9 $

I am encountering an issue where several jobs are failing to match available slots, despite sufficient disk space being reported. The problem seems to be related to how RequestDisk is being evaluated in conjunction with dynamically allocated slots.
Issue Summary:
=======================================
The Requirements expression for job 9867.000 is

    (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) &&
    (TARGET.Memory >= RequestMemory) && (TARGET.HasFileTransfer)

    [0]    : TARGET.Arch == "X86_64"
    [1]    : TARGET.OpSys == "LINUX"
    [2]    : [0] && [1]
    [3]    : TARGET.Disk >= RequestDisk
    [4]    : [2] && [3]
    [5]    : TARGET.Memory >= RequestMemory
    [6]    : [4] && [5]
    [7]    : TARGET.HasFileTransfer
    [8]    : [6] && [7]

Job 9867.000 defines the following attributes:

    DiskUsage = 40
    ImageSize = 40
    RequestDisk = undefined (kb)
    RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,(ImageSize + 1023) / 1024) (mb)

The Requirements expression for job 9867.000 reduces to these conditions:

        Slots
Step   Matched  Condition
----- --------- ---------
[0]           0  TARGET.Arch == "X86_64"
[1]           0  TARGET.OpSys == "LINUX"
[3]           0  TARGET.Disk >= RequestDisk
[5]           0  TARGET.Memory >= RequestMemory
[7]           0  TARGET.HasFileTransfer
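
To illustrate why an undefined RequestDisk matters for clause [3], here is a minimal Python sketch of ClassAd-style three-valued logic. It is not HTCondor code: `UNDEFINED`, `ge`, `and3`, and `matches` are hypothetical names, and the slot values are made up. It assumes the usual ClassAd rules, where a comparison with an undefined operand is undefined, `&&` is false if either side is false and otherwise undefined if either side is undefined, and a Requirements expression only counts as a match when it evaluates to true.

```python
# Sentinel standing in for the ClassAd UNDEFINED value (hypothetical name).
UNDEFINED = object()

def ge(left, right):
    """ClassAd-style >=: undefined if either operand is undefined."""
    if left is UNDEFINED or right is UNDEFINED:
        return UNDEFINED
    return left >= right

def and3(a, b):
    """ClassAd-style &&: False wins, then UNDEFINED, otherwise True."""
    if a is False or b is False:
        return False
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return True

def matches(slot, job):
    """A slot matches only if the Requirements expression is exactly True."""
    result = and3(slot["Arch"] == "X86_64", slot["OpSys"] == "LINUX")
    result = and3(result, ge(slot["Disk"], job["RequestDisk"]))
    result = and3(result, ge(slot["Memory"], job["RequestMemory"]))
    result = and3(result, slot["HasFileTransfer"])
    return result is True

# Made-up slot values, for illustration only.
slot = {"Arch": "X86_64", "OpSys": "LINUX", "Disk": 500000,
        "Memory": 2048, "HasFileTransfer": True}

print(matches(slot, {"RequestDisk": UNDEFINED, "RequestMemory": 1}))  # False
print(matches(slot, {"RequestDisk": 40, "RequestMemory": 1}))         # True
```

Note that this only models the effect of an undefined RequestDisk. It does not explain why clauses [0] and [1] also report 0 matched slots in the analysis above.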
=======================================
[root@ce04 ~]# condor_ce_q -l 9867 |grep Disk|more
DiskUsage = 40
DiskUsage_RAW = 39
RequestDisk = DiskUsage
Requirements = (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.HasFileTransfer)
=======================================
[root@ce04 ~]# condor_ce_config_val -dump | grep DiskUsage
JOB_DEFAULT_REQUESTDISK = DiskUsage
SCHEDD_ROUND_ATTR_DiskUsage = 25%
SYSTEM_STARTD_JOB_ATTRS = ImageSize, ExecutableSize, JobUniverse, NiceUser, CPUsUsage, ResidentSetSize, ProportionalSetSizeKb, MemoryUsage, DiskUsage, ScratchDirFileCount
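
The step from DiskUsage_RAW = 39 to DiskUsage = 40 seen earlier is consistent with SCHEDD_ROUND_ATTR_DiskUsage = 25%. Below is a rough Python sketch of that rounding, assuming the documented behaviour that a percentage setting rounds a value up to the next multiple of that percentage of its order of magnitude. `round_up_attr` is a hypothetical helper, not an HTCondor function, and edge cases may differ from what the schedd actually does.

```python
import math

def round_up_attr(value, percent=25):
    """Round value up to the next multiple of `percent` of its order of magnitude."""
    if value <= 0:
        return value
    magnitude = 10 ** math.floor(math.log10(value))  # 39 -> 10
    step = magnitude * percent / 100                  # 25% of 10 -> 2.5
    return math.ceil(value / step) * step

print(round_up_attr(39))  # 40.0, matching DiskUsage = 40 above
```

Since JOB_DEFAULT_REQUESTDISK = DiskUsage, the job's RequestDisk would then be expected to evaluate to 40 (KB) whenever DiskUsage is defined in the job ad.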
=======================================
[root@ce04 ~]#  condor_status -long | grep DiskUsage|more
DiskUsage = 236135
DiskUsage = 275046
DiskUsage = 249114
DiskUsage = 480483
DiskUsage = 377061
DiskUsage = 272785
DiskUsage = 306690
DiskUsage = 321216
DiskUsage = 294344
DiskUsage = 156775
DiskUsage = 256135
DiskUsage = 300316
DiskUsage = 306545
DiskUsage = 221228
DiskUsage = 311412

=======================================
I appreciate any guidance you can provide!
Best regards,
Eraldo