Good morning,
I recently enabled preemption and am seeing jobs remain idle in the queue after being preempted, while other jobs from the same submission have already completed. I noticed this happened to one user's jobs in particular after they were preempted by another user's jobs; they stayed idle even though there were nodes available to run them. Running 'condor_q -better-analyze' gives me the following:
The Requirements expression for your job is:
( ( Memory > 512 ) ) && ( TARGET.Arch == "X86_64" ) &&
( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) &&
( TARGET.Memory >= RequestMemory ) && ( ( TARGET.HasFileTransfer ) ||
( TARGET.FileSystemDomain == MY.FileSystemDomain ) )
Your job defines the following attributes:
FileSystemDomain = "subdomain.domain.blah"
DiskUsage = 1
ImageSize = 1750000
MemoryUsage = 1221
RequestDisk = 1
RequestMemory = 1221
ResidentSetSize = 1250000
The Requirements expression for your job reduces to these conditions:
Slots
Step Matched Condition
----- -------- ---------
[0] 96 Memory > 512
[1] 96 TARGET.Arch == "X86_64"
[3] 96 TARGET.OpSys == "LINUX"
[5] 96 TARGET.Disk >= RequestDisk
[7] 0 TARGET.Memory >= RequestMemory
[9] 96 TARGET.HasFileTransfer
Suggestions:
Condition Machines Matched Suggestion
--------- ---------------- ----------
1 ( ( Memory > 512 ) ) 0 REMOVE
2 ( TARGET.Memory >= 1221 ) 0 MODIFY TO 977
3 ( TARGET.Arch == "X86_64" ) 96
4 ( TARGET.OpSys == "LINUX" ) 96
5 ( TARGET.Disk >= 1 ) 96
6 ( ( TARGET.HasFileTransfer ) || ( TARGET.FileSystemDomain == "subdomain.domain.blah" ) ) 96
I found this page and verified that there is a memory requirement in the submit file: 'Requirements = (Memory > 512)'. I do not know how to keep HTCondor from adding this memory requirement to the job. Does anyone have suggestions? I can provide my condor_config file if needed; I left it pretty close to the default that ships with HTCondor.
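For reference, here is a minimal sketch of what the relevant part of the submit file looks like, with the memory clause in question. The executable name and the alternative request_memory value are assumptions on my part (977 is just the value the analyzer's MODIFY suggestion mentions); the actual submit file may differ:

```
# Hypothetical submit file sketch -- names are placeholders, not the real file.
universe       = vanilla
executable     = my_job

# The explicit clause currently in the submit file. HTCondor ANDs this with
# the automatically generated ( TARGET.Memory >= RequestMemory ) clause,
# so both must match for the job to run:
requirements   = (Memory > 512)

# One possible alternative would be to drop the requirements line above and
# set request_memory directly, so only the auto-generated clause applies:
# request_memory = 977

queue
```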
Thanks,
Matt