Re: [HTCondor-users] HTcondor disk resource related queries


Hello Experts,

I am testing the following configuration to put jobs on hold when they exceed their disk limit.

STARTD_JOB_ATTRS = $(STARTD_JOB_ATTRS) RequestDisk
DISK_USAGE_EXCEEDED = (JobUniverse =!=13 && DiskUsage =!= UNDEFINED && DiskUsage > RequestDisk)
WANT_HOLD = $(DISK_USAGE_EXCEEDED)
WANT_HOLD_REASON = "Job exceeded disk usage limits"

I can clearly see that some jobs are using more disk than their RequestDisk, yet they are not being held.

# condor_who -af:h globaljobid disk DiskUsage TotalDisk TotalSlotDisk RequestDisk

globaljobid                        disk      DiskUsage  TotalDisk   TotalSlotDisk  RequestDisk
test.example.com#412.0#1685567906  21356484  8192026    4271296648  21356484.0     16777216
test.example.com#413.0#1685567923  12813890  8192026    4271296648  12813890.0     8388608
test.example.com#414.0#1685567952  8542594   8192026    4271296648  8542594.0      3250000
test.example.com#415.0#1685568493  8542594   8192025    4271296648  8542594.0      3250000
test.example.com#416.0#1685568803  12813890  8192026    4271296648  12813890.0     10000000
test.example.com#417.0#1685568954  4271297   8192025    4271296648  4271297.0      1
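
As a quick cross-check of the numbers above, here is a minimal Python sketch that applies only the `DiskUsage > RequestDisk` comparison from DISK_USAGE_EXCEEDED to the six rows. The job labels and the `exceeds` helper are illustrative names, not anything from HTCondor. The sketch ignores the JobUniverse clause and says nothing about how or when the startd evaluates WANT_HOLD.

```python
# (job, DiskUsage, RequestDisk) copied from the condor_who output above.
rows = [
    ("412.0", 8192026, 16777216),
    ("413.0", 8192026, 8388608),
    ("414.0", 8192026, 3250000),
    ("415.0", 8192025, 3250000),
    ("416.0", 8192026, 10000000),
    ("417.0", 4271297 and 8192025, 1),
]


def exceeds(disk_usage, request_disk):
    """Hypothetical helper: the numeric comparison in DISK_USAGE_EXCEEDED."""
    return disk_usage > request_disk


# Jobs whose reported DiskUsage is above their RequestDisk.
over_limit = [job for job, used, requested in rows if exceeds(used, requested)]
print(over_limit)
```

On these figures only jobs 414.0, 415.0 and 417.0 are over their request; 412.0, 413.0 and 416.0 are still below it, so only the first three would be expected to match the expression.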

I am using HTCondor version 9.0.17.

Thanks & Regards,

Vikrant Aggarwal

On Tue, May 30, 2023 at 1:09 PM Vikrant Aggarwal <ervikrant06@xxxxxxxxx> wrote:

Hello Experts,

A couple of queries:

- Why is a negative Disk value shown for the primary partitionable slot?

# condor_status `hostname` -server
Name OpSys Arch LoadAv Memory Disk Mips KFlops

slot1@xxxxxxxxxxxxxxxxxxxxxxxxxx LINUX X86_64 0.000 211398 -25210961 25601 1764976
slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxxx LINUX X86_64 0.000 19218 4278313 25601 1764976
slot1_2@xxxxxxxxxxxxxxxxxxxxxxxxxx LINUX X86_64 0.000 19218 4278313 25601 1764976

Machines Avail Memory Disk MIPS KFLOPS

X86_64/LINUX 3 3 249834 18446744073692897281 76803 5294928

Total 3 3 249834 18446744073692897281 76803 5294928
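
The very large Disk total is numerically consistent with the three per-slot values being summed and the negative result then being shown as an unsigned 64-bit number. That the total is held in an unsigned 64-bit integer is an assumption here, not something the output confirms; the short Python sketch below only checks the arithmetic.

```python
# Per-slot Disk values from the condor_status output above.
slot_disk = [-25210961, 4278313, 4278313]

# Plain signed sum of the three slots.
total = sum(slot_disk)

# Assumption: reinterpret the negative sum as an unsigned 64-bit value.
wrapped = total % 2**64

print(total)
print(wrapped)
```

The wrapped value equals the 18446744073692897281 printed in the totals rows, so the odd total looks like a side effect of the negative slot1 value rather than a separate problem.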

# condor_status -compact `hostname` -af Disk
4269756335

- I have the following in the worker node configuration to modify the job's requested disk to the value shown, but it has never worked. We use similar expressions for CPU and memory, and those work fine.

# condor_config_val MODIFY_REQUEST_EXPR_REQUESTDISK
80000

I am not sure where it is picking up this value from.

# grep -r 'Disk =' /spare/condor/dir_14*/.machine.ad
/spare/condor/dir_1417831/.machine.ad:Disk = 4278313
/spare/condor/dir_1417831/.machine.ad:TotalDisk = 4278312960
/spare/condor/dir_1417831/.machine.ad:TotalSlotDisk = 4278313.0
/spare/condor/dir_1425169/.machine.ad:Disk = 4278313
/spare/condor/dir_1425169/.machine.ad:TotalDisk = 4278312960
/spare/condor/dir_1425169/.machine.ad:TotalSlotDisk = 4278313.0

# du -sh /spare/condor/dir_1425169
3.0G /spare/condor/dir_1425169

Thanks & Regards,

Vikrant Aggarwal
