Re: [HTCondor-devel] resource allocation question and proposal


Date: Fri, 10 May 2013 09:41:38 -0400 (EDT)
From: Tim St Clair <tstclair@xxxxxxxxxx>
Subject: Re: [HTCondor-devel] resource allocation question and proposal
Replies inline.

----- Original Message -----
> From: "Douglas Thain" <dthain@xxxxxx>
> To: htcondor-devel@xxxxxxxxxxx
> Sent: Friday, May 10, 2013 8:30:42 AM
> Subject: [HTCondor-devel] resource allocation question and proposal
> 
> (Great to see everyone again at HTCondorWeek this year.)
> 
> We are encountering a growing number of situations where we need to
> communicate allocation of resources down a tree of processes on the
> same machine.  Unless told otherwise, most programs simply look at the
> number of cores/memory/disk installed on a machine, and then attempt
> to use everything simultaneously.  Obviously, this doesn't work with N>1
> programs sharing the machine.
> 
> As an example, we use Condor to deploy a Work Queue as a pilot job
> system in order to run some multi-core jobs.  The machine may have 16
> cores, of which Condor gives us 8 in a slot, on which Work Queue may
> want to run two 4-core jobs simultaneously.  We can set this all up
> manually, but it would be better to simply communicate the resource
> allocation down the chain.
> 
> So, first, a question:
> 
> - Does HTCondor communicate the properties of a slot to the job
> running in that slot?  e.g. You have been assigned 2 cores and 1GB
> RAM, so please behave.

I don't think it is in the environment, but it's certainly in the slot ad, and it is enforceable via cgroups.
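
As a rough sketch of what reading the allocation out of the slot ad might look like: this assumes the ad is available to the job as plain `Name = Value` text, and that the relevant attributes are named `Cpus` and `Memory` (taken here to be in MB). The sample ad text and the `parse_slot_ad` helper are hypothetical and only for illustration.

```python
def parse_slot_ad(text):
    """Parse simple 'Name = Value' lines into a dict (hypothetical helper)."""
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that carry no assignment
        ad[name.strip()] = value.strip().strip('"')
    return ad


# Made-up sample ad; attribute names and units are assumptions.
SAMPLE_AD = """\
Name = "slot1@example"
Cpus = 2
Memory = 1024
"""

ad = parse_slot_ad(SAMPLE_AD)
cores = int(ad["Cpus"])        # cores assigned to the slot
memory_mb = int(ad["Memory"])  # memory assigned to the slot, assumed MB
```

A job could then size its thread pool or memory budget from `cores` and `memory_mb` instead of probing the whole machine.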

> 
> If not, then a modest proposal:
> 
> - Could we define a simple and common way of communicating intended
> resource allocations from parent to child process?  It might be as
> simple as defining a few environment variables: CORES=4; MEMORY=8;
> DISK=16

+1, seems easy. Want to make a ticket? ;-)
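
A minimal sketch of how the proposal could work in practice, assuming the variable names from the message (CORES, MEMORY, DISK) and an equal integer split among children. The units are not specified in the proposal, and the `child_env` and `run_child` helpers are hypothetical.

```python
import os
import subprocess
import sys

# Names taken from the proposal; units are left unspecified there.
RESOURCE_VARS = ("CORES", "MEMORY", "DISK")


def child_env(parent_env, n_children):
    """Return a copy of parent_env giving each child an equal integer share."""
    env = dict(parent_env)
    for var in RESOURCE_VARS:
        if var in parent_env:
            env[var] = str(int(parent_env[var]) // n_children)
    return env


def run_child(env):
    """Launch a child process that reports the allocation it was handed."""
    code = (
        "import os; "
        "print(os.environ.get('CORES'), os.environ.get('MEMORY'), "
        "os.environ.get('DISK'))"
    )
    out = subprocess.run([sys.executable, "-c", code], env=env,
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()


# Example: a parent holding 8 cores splits its allocation between two children.
parent = dict(os.environ, CORES="8", MEMORY="8", DISK="16")
share = child_env(parent, 2)
report = run_child(share)
```

Because each child receives its share through the same variables, it can subdivide again for its own children, which is what lets the scheme compose down a process tree. Nothing here enforces the limits; it only communicates them.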

> 
> I am not concerned about enforcement (yet) but just simply
> communicating the expected behavior to a child process.  If we could
> document a common way of doing this that even a few projects could
> sign on to, it would help with these sorts of problems immensely.
> 
> P.S. Yes, yes, I know about VMs/cgroups/etc but they are not
> universally deployed and don't compose hierarchically.
> 
> - Doug
> _______________________________________________
> HTCondor-devel mailing list
> HTCondor-devel@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel
> 