[Condor-users] Fix for the possibly bogus "ceiling(ifThenElse(JobVMMemory ..." requirement issue
- Date: Thu, 20 Oct 2011 11:35:12 +1300
- From: Kevin.Buckley@xxxxxxxxxxxxx
- Subject: [Condor-users] Fix for the possibly bogus "ceiling(ifThenElse(JobVMMemory ..." requirement issue
I have recently looked at deploying Condor 7.6.3 to some Win7
boxes.
Whilst the Condor compute nodes seem to start up fine, removing
some problems that had been seen before on the Win7 platform,
even the simplest of jobs (pass a .BAT file over and run it)
fails to find a host because, so condor_q -better_analyze
suggests, of this:
    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( ( 1024 * ceiling(ifThenElse(JobVMMemory isnt undefined,JobVMMemory,9.765625000000000E-04)) ) >= 1 )
                                      0                   REMOVE
Looking around the interweb thing for
condor ceiling ifThenElse JobVMMemory
suggests there does not seem to be a clear answer out there as yet.
I have seen suggestions that it could be windows firewall related
(for a memory calculation?), but they make no mention as to what
might not be getting firewalled.
I saw a RedHat Cumin advisory that mentioned the issue but ends
up linking to an errata that seems to deal with a "log broker
authentication" attack vector, so, again nothing to do with memory.
Furthermore, the links I have followed all seem to suggest that
no-one actually knows how that condition comes into the requirements
equation in the first place?
The odd thing for me is this: a not-quite-working Condor 7.4.4
on a Win7 host did allow me to run the same test and, as I
had copied back the job ad as part of the test, I can see that
this expression was in there:
RequestMemory = ceiling(ifThenElse(JobVMMemory =!= UNDEFINED, JobVMMemory, ImageSize / 1024.000000))
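As a sanity check, the arithmetic in that expression can be worked through by hand. The sketch below is plain Python, not Condor's ClassAd evaluator. It assumes JobVMMemory is undefined and that ImageSize is 1, which is only a guess, though it does reproduce the 9.765625E-04 constant (1/1024) shown by condor_q -better_analyze. The names if_then_else and request_memory are made up for illustration.

```python
import math

# Stand-in for the ClassAd UNDEFINED value (an assumption of this sketch).
UNDEFINED = None


def if_then_else(condition, when_true, when_false):
    """Rough imitation of the ClassAd ifThenElse() function."""
    return when_true if condition else when_false


def request_memory(job_vm_memory, image_size):
    """Mirror of: ceiling(ifThenElse(JobVMMemory =!= UNDEFINED,
    JobVMMemory, ImageSize / 1024.0))"""
    return math.ceil(
        if_then_else(job_vm_memory is not UNDEFINED,
                     job_vm_memory,
                     image_size / 1024.0)
    )


# Assumption: the job ad carries ImageSize = 1 and no JobVMMemory.
image_size = 1
fallback = image_size / 1024.0               # the 9.765625E-04 constant
req = request_memory(UNDEFINED, image_size)  # ceiling of the fallback
clause = (1024 * req) >= 1                   # clause reported by condor_q

print(fallback, req, clause)
```

Under those assumptions the clause comes out true as plain arithmetic, so the numbers alone would not seem to explain why zero machines match it.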
Now I come to run the same submission against a 7.6.3 Win7 host and
I seemingly can't meet the requirements?
The job submission file looks like this
-----8<-------------8<-------------8<-------------8<-------------8<--------
##########
universe = vanilla
environment = path=c:\WINDOWS\SYSTEM32
executable = pokearnd-win7.bat
TransferInputFiles =
arguments =
output = pokearnd.out.$(Cluster).$(Process)
error = pokearnd.err.$(Cluster).$(Process)
log = pokearnd.log.$(Cluster).$(Process)
Requirements = (OpSys == "WINNT61") && (Machine == "somemachine.vuw.ac.nz" )
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
queue 1
-----8<-------------8<-------------8<-------------8<-------------8<--------
The master, a UNIX box, is running condor-7.4.4.
One more piece of info:
following another link I decided to read, I added a (Memory > 766) to the
Requirements in the above and, although the job is not assigned to the
targeted machine, condor_status shows three of the four slots on the
machine listed as "Matched":
slot1@somemachine WINNT61 INTEL Matched   Idle 0.000 767 0+00:00:04
slot2@somemachine WINNT61 INTEL Matched   Idle 0.000 767 0+00:00:05
slot3@somemachine WINNT61 INTEL Matched   Idle 0.000 767 0+00:00:06
slot4@somemachine WINNT61 INTEL Unclaimed Idle 0.000 767 0+00:00:07
Most interested to hear what folk who might know think is actually
going on here,
Kevin
--
Kevin M. Buckley Room: CO327
School of Engineering and Phone: +64 4 463 5971
Computer Science
Victoria University of Wellington
New Zealand