Re: [HTCondor-users] new to HTCONDOR & some dumb questions :(
- Date: Wed, 22 Apr 2015 13:02:52 -0500
- From: Brian Bockelman <bbockelm@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] new to HTCONDOR & some dumb questions :(
Hi Christoph,
Is it possible that a particular machine is not running any jobs (but others are)? There are various security things that could have gone wrong in the config.
Other things to check:
1) Look in the NegotiatorLog.
2) Try "condor_q -better -reverse <ID>" to see whether a given machine matches your job (as opposed to jobs matching a machine). Recall that matching is bidirectional: machines must like the job and the job must like the machine.
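
To illustrate the bidirectional matching in point 2, here is a minimal Python sketch. It is a toy model, not HTCondor's ClassAd evaluator: the dict-based ads, the symmetric_match helper, and both example slots (including their policies) are hypothetical. Only the job's attribute values and Requirements clauses are taken from the condor_q output quoted below, with the checkpoint clauses left out.

```python
# Toy model of two-way matchmaking; NOT the real ClassAd evaluator.
# Each ad is a dict of attributes plus a "Requirements" callable that
# receives (my_ad, target_ad) and returns True or False.

def symmetric_match(job, machine):
    """Return (job_accepts_machine, machine_accepts_job)."""
    job_ok = job["Requirements"](job, machine)
    machine_ok = machine["Requirements"](machine, job)
    return job_ok, machine_ok


job = {
    "Owner": "chbeyer",
    "RequestDisk": 7500,
    "RequestMemory": 8,
    # Mirrors the job's Requirements from the quoted output
    # (checkpoint clauses omitted for brevity).
    "Requirements": lambda my, target: (
        target["Arch"] == "X86_64"
        and target["OpSys"] == "LINUX"
        and target["Disk"] >= my["RequestDisk"]
        and target["Memory"] >= my["RequestMemory"]
    ),
}

# Hypothetical slots: the attribute values and policies are made up.
open_slot = {
    "Arch": "X86_64", "OpSys": "LINUX", "Disk": 100000, "Memory": 2048,
    "Requirements": lambda my, target: True,  # accepts any job
}
picky_slot = {
    "Arch": "X86_64", "OpSys": "LINUX", "Disk": 100000, "Memory": 2048,
    # Assumed policy: only run jobs owned by a different user.
    "Requirements": lambda my, target: target["Owner"] == "someone_else",
}

for name, slot in (("open_slot", open_slot), ("picky_slot", picky_slot)):
    job_ok, machine_ok = symmetric_match(job, slot)
    print(f"{name}: job accepts machine={job_ok}, machine accepts job={machine_ok}")
```

In this sketch the job accepts both slots, but picky_slot turns the job down. That is the situation behind "reject your job because of their own requirements" in the analysis, and it is the side of the match that -reverse looks at.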
Brian
> On Apr 22, 2015, at 4:01 AM, Beyer, Christoph <christoph.beyer@xxxxxxx> wrote:
>
>
> Hi,
>
> as stated above I am new to HTCondor and have some issues with a test installation that I cannot seem to resolve by myself.
>
> I run a pool with 5 nodes and one submit node. Everything is pretty much 'default'. Since I am alone on my pool, I would expect to be able to flood the whole thing when I submit a bunch of 'loop.remote' jobs, but I never get more than 32 jobs running at any given time.
>
> Using condor_q I see that some slots seem to reject my job due to their own requirements (?)
>
> Is there any place I can look for these requirements, e.g. if it's quota related (although there is actually no quota set)?
>
> [chbeyer@$HOST]~% condor_q -better 179.9494
>
>
> -- Submitter: bm-test.desy.de : <$IP:58611> : $HOST.desy.de
> ---
> 179.9494: Request has not yet been considered by the matchmaker.
>
> User priority for chbeyer@xxxxxxx is not available, attempting to analyze without it.
> ---
> 179.9494: Run analysis summary. Of 53 machines,
> 0 are rejected by your job's requirements
> 42 reject your job because of their own requirements
> 0 match and are already running your jobs
> 1 match but are serving other users
> 10 are available to run your job
>
> The Requirements expression for your job is:
>
> ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
> ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) ) &&
> ( ( CkptOpSys == TARGET.OpSys ) || ( CkptOpSys is undefined ) ) &&
> ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )
>
> Your job defines the following attributes:
>
> DiskUsage = 7500
> ImageSize = 7500
> RequestDisk = 7500
> RequestMemory = 8
>
> The Requirements expression for your job reduces to these conditions:
>
>          Slots
> Step    Matched  Condition
> -----  --------  ---------
> [0]          53  TARGET.Arch == "X86_64"
> [1]          53  TARGET.OpSys == "LINUX"
> [4]          53  CkptArch is undefined
> [8]          53  CkptOpSys is undefined
> [11]         53  TARGET.Disk >= RequestDisk
> [13]         53  TARGET.Memory >= RequestMemory
>
> Suggestions:
>
>     Condition                                   Machines Matched  Suggestion
>     ---------                                   ----------------  ----------
> 1   ( TARGET.Arch == "X86_64" )                 53
> 2   ( TARGET.OpSys == "LINUX" )                 53
> 3   ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) )
>                                                 53
> 4   ( ( CkptOpSys == TARGET.OpSys ) || ( CkptOpSys is undefined ) )
>                                                 53
> 5   ( TARGET.Disk >= 7500 )                     53
> 6   ( TARGET.Memory >= ifthenelse(MemoryUsage isnt undefined,MemoryUsage,8) )
>                                                 53
>
> The following attributes are missing from the job ClassAd:
>
> CheckpointPlatform
>
> In the logs I see a lot of errors like this:
>
> 04/22/15 10:58:15 (pid:8111) OwnerCheck(condor_pool) failed in SetAttribute for job 179.1257
> 04/22/15 10:58:15 (pid:8111) OwnerCheck(condor_pool) failed in SetAttribute for job 179.1257
>
> Any hints are much appreciated!
>
> best regards
> ~christoph
>
>
> --
> /* Christoph Beyer | Office: Building 2b / 23 *\
> * DESY | Phone: 040-8998-2317 *
> * - IT - | Fax: 040-8994-2317 *
> \* 22603 Hamburg | http://www.desy.de */