Re: [Condor-users] condor 6.8.2 + RHEL 4 - jobs stay idle, never run
- Date: Tue, 21 Nov 2006 09:20:23 -0800
- From: Lee Damon <nomad@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] condor 6.8.2 + RHEL 4 - jobs stay idle, never run
I've found a box that does have better-analyze available:
( target.NikolaHost == "noddy" ) &&
( ( MY.RESOURCE_GROUP == TARGET.JOB_GROUP ) ) && ( target.Arch == "INTEL" ) &&
( target.OpSys == "LINUX" ) && ( target.Disk >= DiskUsage ) &&
( ( target.Memory * 1024 ) >= ImageSize ) &&
( TARGET.FileSystemDomain == MY.FileSystemDomain )
    Condition                                            Machines Matched  Suggestion
    ---------                                            ----------------  ----------
1   ( ( MY.RESOURCE_GROUP == TARGET.JOB_GROUP ) )        0                 REMOVE
2   ( target.NikolaHost == "noddy" )                     1
3   ( target.Arch == "INTEL" )                           364
4   ( target.OpSys == "LINUX" )                          377
5   ( target.Disk >= 10000 )                             385
6   ( ( 1024 * target.Memory ) >= 10000 )                385
7   ( TARGET.FileSystemDomain == "ee.washington.edu" )   385
This is exactly the same setup as the (working) 6.6.10 implementation.
The following four lines are in /etc/condor/condor_config:
RESOURCE_GROUP = "ssli"
JOB_GROUP = "ssli"
SUBMIT_EXPRS = JOB_GROUP
STARTD_EXPRS = RESOURCE_GROUP
The requirement part of the condor_config is:
IS_ALLOWED = ( \
MY.RESOURCE_GROUP == TARGET.JOB_GROUP || \
MY.RESOURCE_GROUP == TARGET.USER_GROUP || \
MY.RESOURCE_GROUP == "ssli" \
)
IS_LOCAL = ( \
MY.RESOURCE_GROUP == TARGET.JOB_GROUP || \
MY.RESOURCE_GROUP == TARGET.USER_GROUP \
)
START = $(UWCS_START) && $(IS_ALLOWED)
RANK = $(IS_LOCAL)
The group name ("ssli", "vlsi", "mtml", etc.) is filled in by the script
that installs the condor_config on the host.
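
To make the intent of IS_ALLOWED and IS_LOCAL concrete, here is a rough Python sketch of how the two expressions evaluate from the machine's side. It is only a simplified model, not Condor's evaluator: it assumes the usual ClassAd behaviour that comparing against a missing attribute gives UNDEFINED and that a TRUE operand wins in an OR. The helper names (eq, or3, is_allowed, is_local) and the dict-based ads are made up for illustration.

```python
# Simplified model of ClassAd three-valued logic (assumption: a missing
# attribute is UNDEFINED, represented here as None).
UNDEFINED = None


def eq(a, b):
    """ClassAd-style ==: UNDEFINED if either side is missing."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a == b


def or3(*values):
    """ClassAd-style ||: TRUE wins, otherwise UNDEFINED wins over FALSE."""
    if any(v is True for v in values):
        return True
    if any(v is UNDEFINED for v in values):
        return UNDEFINED
    return False


def is_allowed(machine, job, installed_group):
    """Mirror of IS_ALLOWED; installed_group is the literal the install script fills in."""
    rg = machine.get("RESOURCE_GROUP")
    return or3(
        eq(rg, job.get("JOB_GROUP")),
        eq(rg, job.get("USER_GROUP")),
        eq(rg, installed_group),
    )


def is_local(machine, job):
    """Mirror of IS_LOCAL, which feeds RANK."""
    rg = machine.get("RESOURCE_GROUP")
    return or3(eq(rg, job.get("JOB_GROUP")), eq(rg, job.get("USER_GROUP")))


# Hypothetical ads: an "ssli" machine and two jobs.
machine = {"RESOURCE_GROUP": "ssli"}
ssli_job = {"JOB_GROUP": "ssli"}
vlsi_job = {"JOB_GROUP": "vlsi"}

print(is_allowed(machine, ssli_job, "ssli"), is_local(machine, ssli_job))
print(is_allowed(machine, vlsi_job, "ssli"), is_local(machine, vlsi_job))
```

In this model an "ssli" machine allows both jobs (the literal third clause is always true there) but only ranks the "ssli" job as local. If RESOURCE_GROUP is missing from the ad on the MY side, every comparison comes out UNDEFINED rather than TRUE.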
When I remove the NikolaHost requirement this particular box actually
sends jobs to the 6.6.10 pool just fine. Noddy is a 32-bit system
running RHEL 4 with Condor 6.8.2. The boxes that are not sending jobs
out at all are 64-bit boxes so I can understand why they would
not be sending jobs to the 32-bit 6.6.10 systems.
What I don't understand is why this requirement works in 6.6.10 but not
in 6.8.2.
nomad
>> 6.8.2:
>>
>> Requirements = (START) && (IsValidCheckpointPlatform)
>
>IsValidCheckpointPlatform is automatically inserted by the startd, but
>it should evaluate to true for any vanilla job. What does condor_q
>-better-analyze say?
>
>-Greg