Date: | Fri, 10 May 2013 09:50:53 -0500 |
---|---|
From: | "Todd Tannenbaum" <tannenba@xxxxxxxxxxx> |
Subject: | Re: [HTCondor-devel] resource allocation question and proposal |
Hi Doug,

Recall that the starter does write out the claimed machine ad as ASCII text into a file - the location of this file is inserted as an environment variable to the job. Of course you can also explicitly pass cpu/memory or anything else from the machine ad via the environment by using $$() in the submit file. Finally, at least for cpu cores and memory, the latest dev release of HTCondor has nice mechanisms for enforcing the limits via Linux kernel cgroups and affinity support.

-- Sent from my HP Veer mobile phone

On May 10, 2013 8:33 AM, Douglas Thain <dthain@xxxxxx> wrote:

> (Great to see everyone again at HTCondorWeek this year.)
>
> We are encountering a growing number of situations where we need to communicate allocation of resources down a tree of processes on the same machine. Unless told otherwise, most programs simply look at the number of cores/memory/disk installed on a machine, and then attempt to use everything simultaneously. Obviously, this doesn't work with N>1.
>
> As an example, we use Condor to deploy a Work Queue as a pilot job system in order to run some multi-core jobs. The machine may have 16 cores, of which Condor gives us 8 in a slot, on which Work Queue may want to run two x 4 core jobs simultaneously. We can set this all up manually, but it would be better to simply communicate the resource allocation down the chain.
>
> So, first, a question:
>
> - Does HTCondor communicate the properties of a slot to the job running in that slot? e.g. You have been assigned 2 cores and 1GB RAM, so please behave.
>
> If not, then a modest proposal:
>
> - Could we define a simple and common way of communicating intended resource allocations from parent to child process? It might be as simple as defining a few environment variables: CORES=4; MEMORY=8; DISK=16
>
> I am not concerned about enforcement (yet) but just simply communicating the expected behavior to a child process. If we could document a common way of doing this that even a few projects could sign on to, it would help with these sorts of problems immensely.
>
> P.S. Yes, yes, I know about VMs/cgroups/etc but they are not universally deployed and don't compose hierarchically.
>
> - Doug
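
For illustration only, here is a minimal Python sketch of how a process in the middle of such a tree might pick up its allocation and hand a share to each child. It is not HTCondor or Work Queue code. The variable names CORES, MEMORY and DISK come from the proposal in the quoted message, which leaves their units open, so the sketch treats them as plain integers. The fallback assumes the environment variable mentioned in the reply is `_CONDOR_MACHINE_AD` and that the file holds one `Attr = value` pair per line with a `Cpus` attribute; the thread itself names neither. The function names and the equal-split policy are hypothetical.

```python
import os
import subprocess

# Variable names taken from the proposal in this thread; units are unspecified there.
RESOURCE_VARS = ("CORES", "MEMORY", "DISK")


def read_machine_ad(path):
    """Parse 'Attr = value' lines from a machine ad text file into a dict of strings."""
    ad = {}
    with open(path) as f:
        for line in f:
            name, sep, value = line.partition("=")
            if sep:
                ad[name.strip()] = value.strip()
    return ad


def read_allocation(env=None):
    """Return the allocation advertised by the parent as a dict of ints.

    Falls back to the Cpus attribute of the machine ad for CORES when the
    proposed variable is absent (assumption: the ad path is in _CONDOR_MACHINE_AD).
    """
    env = os.environ if env is None else env
    alloc = {name: int(env[name]) for name in RESOURCE_VARS if name in env}
    if "CORES" not in alloc and "_CONDOR_MACHINE_AD" in env:
        ad = read_machine_ad(env["_CONDOR_MACHINE_AD"])
        if "Cpus" in ad:
            alloc["CORES"] = int(ad["Cpus"])
    return alloc


def child_env(alloc, n_children, base_env=None):
    """Build the environment for one of n_children equally sized children."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in alloc.items():
        # Integer division: any remainder is simply left unallocated.
        env[name] = str(value // n_children)
    return env


def run_child(argv, alloc, n_children):
    """Start one child that sees only its share of the allocation."""
    return subprocess.Popen(argv, env=child_env(alloc, n_children))
```

With the numbers from the example in the quoted message, a pilot holding `CORES=8` that wants to run two jobs would call `child_env(read_allocation(), 2)` and each job would see `CORES=4`. Because every level only reads and rewrites the same variables, the scheme composes down the process tree without any enforcement mechanism.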