Re: [Condor-users] condor 6.8.2 + RHEL 4 - jobs stay idle, never run
- Date: Tue, 21 Nov 2006 09:28:29 -0800
- From: Lee Damon <nomad@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] condor 6.8.2 + RHEL 4 - jobs stay idle, never run
Of course, I left an expression out of the condor_config excerpt when I sent this:
APPEND_REQUIREMENTS = ( \
MY.RESOURCE_GROUP == TARGET.JOB_GROUP \
)
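
For anyone following along, here is a toy sketch of how the MY./TARGET.
scoping could play out for the expression above. It is not Condor's
evaluator. It assumes the prefixes are resolved strictly against the named
ad, that APPEND_REQUIREMENTS lands in the job's Requirements (where MY is the
job and TARGET is the machine), and that SUBMIT_EXPRS and STARTD_EXPRS publish
only the attributes listed in the config quoted below. The helper scoped_eq
and the two dictionaries are made up for illustration.

```python
# Toy model of two-sided ClassAd attribute scoping; NOT Condor's real evaluator.
# Attribute values mirror the condor_config quoted in this thread.

UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value


def scoped_eq(my_ad, my_attr, target_ad, target_attr):
    """Evaluate MY.<my_attr> == TARGET.<target_attr> with strict scoping.

    Returns UNDEFINED if either attribute is missing from its ad, which
    this sketch treats as "no match".
    """
    left = my_ad.get(my_attr, UNDEFINED)
    right = target_ad.get(target_attr, UNDEFINED)
    if left is UNDEFINED or right is UNDEFINED:
        return UNDEFINED
    return left == right


# Assumption: SUBMIT_EXPRS = JOB_GROUP puts JOB_GROUP in the job ad only.
job_ad = {"JOB_GROUP": "ssli"}
# Assumption: STARTD_EXPRS = RESOURCE_GROUP puts RESOURCE_GROUP in the machine ad only.
machine_ad = {"RESOURCE_GROUP": "ssli"}

# In the machine's START expression, MY is the machine and TARGET is the job.
start_side = scoped_eq(machine_ad, "RESOURCE_GROUP", job_ad, "JOB_GROUP")

# In the job's Requirements, MY is the job and TARGET is the machine.
job_side = scoped_eq(job_ad, "RESOURCE_GROUP", machine_ad, "JOB_GROUP")

print("START side:", start_side)
print("job Requirements side:", job_side)
```

Under those assumptions the same text matches when it sits in START but comes
out UNDEFINED when it sits in the job's Requirements, which would be
consistent with the "0 machines matched" line in the analysis below. Whether
6.6.10 resolved these references differently is not something this sketch
establishes.
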
>I've found a box that does have better-analyze available:
>
>( target.NikolaHost == "noddy" ) &&
>( ( MY.RESOURCE_GROUP == TARGET.JOB_GROUP ) ) && ( target.Arch == "INTEL" ) &&
>( target.OpSys == "LINUX" ) && ( target.Disk >= DiskUsage ) &&
>( ( target.Memory * 1024 ) >= ImageSize ) &&
>( TARGET.FileSystemDomain == MY.FileSystemDomain )
>
>    Condition                                            Machines Matched  Suggestion
>    ---------                                            ----------------  ----------
>1   ( ( MY.RESOURCE_GROUP == TARGET.JOB_GROUP ) )        0                 REMOVE
>2   ( target.NikolaHost == "noddy" )                     1
>3   ( target.Arch == "INTEL" )                           364
>4   ( target.OpSys == "LINUX" )                          377
>5   ( target.Disk >= 10000 )                             385
>6   ( ( 1024 * target.Memory ) >= 10000 )                385
>7   ( TARGET.FileSystemDomain == "ee.washington.edu" )   385
>
>
>This is exactly the same setup as the (working) 6.6.10 implementation.
>The following four lines are in /etc/condor/condor_config:
>
> RESOURCE_GROUP = "ssli"
> JOB_GROUP = "ssli"
> SUBMIT_EXPRS = JOB_GROUP
> STARTD_EXPRS = RESOURCE_GROUP
>
>
>The requirement part of the condor_config is:
>
> IS_ALLOWED = ( \
> MY.RESOURCE_GROUP == TARGET.JOB_GROUP || \
> MY.RESOURCE_GROUP == TARGET.USER_GROUP || \
> MY.RESOURCE_GROUP == "ssli" \
> )
>
> IS_LOCAL = ( \
> MY.RESOURCE_GROUP == TARGET.JOB_GROUP || \
> MY.RESOURCE_GROUP == TARGET.USER_GROUP \
> )
>
> START = $(UWCS_START) && $(IS_ALLOWED)
> RANK = $(IS_LOCAL)
>
>
>
>The group name ("ssli", "vlsi", "mtml", etc.) is filled in by the script
>that installs the condor_config on the host.
>
>When I remove the NikolaHost requirement this particular box actually
>sends jobs to the 6.6.10 pool just fine. Noddy is a 32-bit system
>running RHEL 4 with Condor 6.8.2. The boxes that are not sending jobs
>out at all are 64-bit boxes so I can understand why they would
>not be sending jobs to the 32-bit 6.6.10 systems.
>
>What I don't understand is why this requirement works in 6.6.10 but not
>in 6.8.2.
>
>nomad
>
>>> 6.8.2:
>>>
>>> Requirements = (START) && (IsValidCheckpointPlatform)
>>
>>IsValidCheckpointPlatform is automatically inserted by the startd, but
>>it should evaluate to true for any vanilla job. What does condor_q
>>-better-analyze say?
>>
>>-Greg
nomad