On Monday, 10 December, 2012 at 12:59 PM, Dimitri Maziuk wrote:
On 12/09/2012 11:36 PM, John Wong wrote:
That is a different story I think. I'd love to see a node-level data
placement mechanism in condor, or at least the ability to evaluate ` [
-f /var/tmp/mydatabase ] ` at job submission time, but I don't believe
you can.
Perhaps I'm misunderstanding what you're after here, but why don't you
have this now?
Job A runs on Machine A and brings along a subset of your massive
dataset into some place like /tmp/cache. Before the job exits, it leaves
a small bit of Condor configuration in the ~condor/config directory,
let's call it cache_contents.config, and the file simply says:
MyCacheContents = "subsetXYZ123"
STARTD_ATTRS = $(STARTD_ATTRS), MyCacheContents
And it advertises the cache contents to the world by running:
condor_reconfig -full
before it finally exits.
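For concreteness, the tail end of the job's wrapper script might look
roughly like this (the /tmp/cache and ~condor/config paths and the
"subsetXYZ123" label are just the placeholders from above, and it
assumes the job is allowed to write into that config directory):

  #!/bin/sh
  # ... the real work happens here, staging data into /tmp/cache ...

  # Advertise what we cached by dropping a small config file.
  cat > ~condor/config/cache_contents.config <<'EOF'
  MyCacheContents = "subsetXYZ123"
  STARTD_ATTRS = $(STARTD_ATTRS), MyCacheContents
  EOF

  # Have the startd pick up the new attribute before we exit.
  condor_reconfig -full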
Now the ClassAd for the machine contains the attribute:
MyCacheContents = "subsetXYZ123"
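You can check that it is actually being advertised with condor_status,
for example:

  condor_status -constraint 'MyCacheContents == "subsetXYZ123"'

or condor_status -long <machine> | grep MyCacheContents, where <machine>
is whatever Machine A is called in your pool.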
And jobs can steer based on this string by putting:

rank = (MyCacheContents =!= UNDEFINED && MyCacheContents == "subsetXYZ123") * 1000

in their submission files. If the machine already has that subset of the
data cached, the job will rank it higher than any other machine and
prefer to run there first.
Adjust to suit your tastes for preemption and what not.
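In full, a submit file using this might look something like the
following (the executable and file names are made up for the example):

  universe   = vanilla
  executable = analyze.sh
  # prefer machines that already hold subsetXYZ123 in their cache
  rank       = (MyCacheContents =!= UNDEFINED && MyCacheContents == "subsetXYZ123") * 1000
  output     = analyze.out
  error      = analyze.err
  log        = analyze.log
  queue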
Simple but effective. You can make the identifying string for the cache
contents encode some additional information if you want fuzzier logic
for steering jobs to machines than simple exact-string matching.
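For instance, if the label encodes a dataset family plus a version
(purely illustrative: say "subsetXYZ123" means family subsetXYZ,
version 123), a job that can use any version could rank on a prefix
match with the ClassAd regexp() function instead of exact equality:

  rank = (MyCacheContents =!= UNDEFINED && regexp("^subsetXYZ", MyCacheContents)) * 1000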
Regards,
- Ian