Hi,
this is a follow-up to my last email, but it addresses a completely different
issue.
I'm using a job submit file as follows:
Executable = sleep.gpu.sh
Arguments = 10 $$(GPU_DEV) $$(GPU_NAME) $$(GPU_CAPABILITY) \
$$(GPU_GLOBALMEM_MB) $$(GPU_MULTIPROC) $$(GPU_NUMCORES) $$(GPU_CLOCK_GHZ) \
$$(GPU_CUDA_DRV) $$(GPU_CUDA_RUN)
Error = logs/err.$(Process)
Output = logs/out.$(Process)
Log = /local/user/carsten/foo.log
Requirements = GPU_CAPABILITY >= 1.9
+WantGPU=True
Universe = vanilla
Queue 1
where sleep.gpu.sh only prints its arguments and then sleeps for $1
seconds.
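A minimal sketch of such a script is below. It is not the exact file; it assumes that the first argument is the sleep duration and that the remaining arguments are the expanded $$() values from the matched machine. The helper name print_args is made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of sleep.gpu.sh: print every argument, then sleep.
# $1 is the sleep duration in seconds; the remaining arguments are assumed
# to be the $$(GPU_*) values substituted from the matched machine ad.

# print_args is an illustrative helper name, not part of Condor.
print_args() {
    i=0
    for arg in "$@"; do
        echo "arg[$i] = $arg"
        i=$((i + 1))
    done
}

print_args "$@"
# Fall back to 0 seconds if no duration was passed.
sleep "${1:-0}"
```

Printing one argument per line makes it easy to see in logs/out.$(Process) which $$() attributes were actually substituted.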
With "Requirements = GPU_CAPABILITY >= 2.0" I'm trying to steer the job to a
machine that advertises this attribute. The match is made, but when the startd
tries to start the job, it just logs "slot1: Job Requirements check failed!"
and the slot goes back to idle (full debug StartLog attached).
$ gpu010:/var/log/condor# grep -i require /tmp/StartLog
AutoClusterAttrs =
"JobUniverse,LastCheckpointPlatform,NumCkpts,Scheduler,Owner,NeedGpu,WantGPU,DiskUsage,ImageSize,RequestMemory,FileSystemDomain,Requirements,NiceUser,ConcurrencyLimits"
Requirements = (GPU_CAPABILITY >= 2.000000) && (Arch == "X86_64") && (OpSys ==
"LINUX") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) &&
((RequestMemory * 1024) >= ImageSize) && (TARGET.FileSystemDomain ==
MY.FileSystemDomain)
Requirements = (START) && (IsValidCheckpointPlatform)
AutoClusterAttrs =
"JobUniverse,LastCheckpointPlatform,NumCkpts,Scheduler,Owner,NeedGpu,WantGPU,DiskUsage,ImageSize,RequestMemory,FileSystemDomain,Requirements,NiceUser,ConcurrencyLimits"
Requirements = (GPU_CAPABILITY >= 2.000000) && (Arch == "X86_64") && (OpSys ==
"LINUX") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) &&
((RequestMemory * 1024) >= ImageSize) && (TARGET.FileSystemDomain ==
MY.FileSystemDomain)
Requirements = (START) && (IsValidCheckpointPlatform)
12/10 10:43:42 slot1: Job Requirements check failed!
I'm not quite sure what is keeping Condor from starting the job. At first I
thought it might be the floating-point comparison, but even with
Requirements = GPU_CAPABILITY >= 1.9
the job matches but does not start.
Any ideas?
Cheers
Carsten