Re: [Condor-users] standard universe jobs won't start but vanilla are OK
- Date: Tue, 22 Nov 2011 10:29:32 -0500 (EST)
- From: Tim St Clair <tstclair@xxxxxxxxxx>
- Subject: Re: [Condor-users] standard universe jobs won't start but vanilla are OK
To help with debugging, wrap your Requirements expression (or part of it) in a debug() function and then examine the StartLog,
e.g.:
Requirements = debug( ( Arch == "Intel" ) && ( OpSys == "LINUX" ) ) && ( (
CkptArch == TARGET.Arch ) || ( CkptArch =?= undefined ) ) && ( (
CkptOpSys == TARGET.OpSys ) || ( CkptOpSys =?= undefined ) ) && (
TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >=
ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize )
It will narrow down how the expression is being evaluated.
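
To show the idea in isolation, here is a small sketch in plain Python. It is not HTCondor code and not how debug() is implemented. It evaluates each top-level clause of the standard universe Requirements on its own, using a rough imitation of ClassAd three-valued logic (None stands in for undefined), and reports the first clause that is not true. The helper names (eq, ge, times, is_identical, or3, first_non_true) and all attribute values are invented for illustration and are not taken from the pool in question; attribute lookups are hard-coded rather than following real MY/TARGET scoping.

```python
UNDEFINED = None  # stand-in for the ClassAd value "undefined"


def _norm(v):
    # Assumption: ClassAd == compares strings case-insensitively.
    return v.lower() if isinstance(v, str) else v


def eq(a, b):
    """Imitates ==: the result is undefined if either operand is undefined."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return _norm(a) == _norm(b)


def ge(a, b):
    """Imitates >=: the result is undefined if either operand is undefined."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a >= b


def times(a, b):
    """Imitates *: the result is undefined if either operand is undefined."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a * b


def is_identical(a, b):
    """Imitates =?=: always True or False, never undefined."""
    if a is UNDEFINED or b is UNDEFINED:
        return a is b
    return a == b


def or3(a, b):
    """Three-valued OR: True wins, then undefined, then False."""
    if a is True or b is True:
        return True
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return False


def clauses(job, machine):
    """Return (label, value) for each top-level clause, evaluated separately."""
    ckpt_arch = job.get("CkptArch")
    ckpt_opsys = job.get("CkptOpSys")
    image_size = job.get("ImageSize")
    return [
        ('Arch == "Intel"', eq(machine.get("Arch"), "Intel")),
        ('OpSys == "LINUX"', eq(machine.get("OpSys"), "LINUX")),
        ("CkptArch clause",
         or3(eq(ckpt_arch, machine.get("Arch")), is_identical(ckpt_arch, UNDEFINED))),
        ("CkptOpSys clause",
         or3(eq(ckpt_opsys, machine.get("OpSys")), is_identical(ckpt_opsys, UNDEFINED))),
        ("TARGET.Disk >= DiskUsage", ge(machine.get("Disk"), job.get("DiskUsage"))),
        ("TARGET.Memory * 1024 >= ImageSize",
         ge(times(machine.get("Memory"), 1024), image_size)),
        ("RequestMemory * 1024 >= ImageSize",
         ge(times(job.get("RequestMemory"), 1024), image_size)),
    ]


def first_non_true(job, machine):
    """Return the first clause that is False or undefined, or None if all are True."""
    for label, value in clauses(job, machine):
        if value is not True:
            return label, value
    return None


# Invented values, chosen only so that exactly one clause fails.
machine = {"Arch": "INTEL", "OpSys": "LINUX", "Disk": 5, "Memory": 512}
job = {"DiskUsage": 10, "ImageSize": 1000, "RequestMemory": 512}

print(first_non_true(job, machine))
```

With these made-up values the Disk clause is the one reported. A clause that evaluates to undefined would be reported with a value of None, which is the case that is easy to miss when reading the whole expression at once.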
Cheers,
Tim
----- Original Message -----
> From: "Ian Smith" <I.C.Smith@xxxxxxxxxxxxxxx>
> To: "Condor-Users Mail List" <condor-users@xxxxxxxxxxx>
> Sent: Tuesday, November 22, 2011 5:59:11 AM
> Subject: [Condor-users] standard universe jobs won't start but vanilla are OK
>
> I've been banging my head against the wall for a couple of days
> on the problem below so hopefully someone on the list may be able to
> help ...
>
> I'm trying to run a very simple test job on a Debian based VM (under
> colinux) - in fact it's just a shell script. If I submit it
> as a vanilla universe job then everything is fine but as a
> standard universe job it seems to match OK on the central manager
> but then the startd on the execute host seems to reject it and gives
> this message:
>
> Job Requirements check failed!
>
> Bearing that in mind, I've set STARTD_DEBUG to D_JOB to try and compare
> the two jobs. For the vanilla universe I see this:
>
> Requirements = ( ( Arch == "Intel" ) && ( OpSys == "LINUX" ) ) && (
> TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >=
> ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize ) && (
> TARGET.HasFileTransfer)
>
> and for the standard universe this:
>
> Requirements = ( ( Arch == "Intel" ) && ( OpSys == "LINUX" ) ) && ( (
> CkptArch == TARGET.Arch ) || ( CkptArch =?= undefined ) ) && ( (
> CkptOpSys == TARGET.OpSys ) || ( CkptOpSys =?= undefined ) ) && (
> TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >=
> ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize )
>
> So my guess is that it's something to do with CkptArch/CkptOpSys?
>
> The machine requirements are the same in both cases viz:
>
> IsValidCheckpointPlatform = ( ( ( TARGET.JobUniverse == 1 ) == false
> ) || ( ( MY.CheckpointPlatform =!= undefined ) && ( (
> TARGET.LastCheckpointPlatform =?= MY.CheckpointPlatform ) || (
> TARGET.NumCkpts == 0 ) ) ) )
> Requirements = ( START ) && ( IsValidCheckpointPlatform )
>
> In the job classad I've set
>
> +WantCheckpoint = false
>
> but I get the same problem regardless of whether this is explicitly
> set or not.
>
> The combined central manager / submit host is a Scientific Linux 6.1
> x86_64 system, if that's relevant.
>
> Any pointers, or even vague hints, would be most appreciated as
> I've totally run out of ideas.
>
> cheers,
>
> -ian.
>
> ---------------------------------------
> Dr Ian C. Smith,
> Advanced Research Computing,
> University of Liverpool UK.