
[HTCondor-users] Setting default memory amount on jobs that are about to start



 

There are two parts to this question. I have more or less solved the first part but am looking for more information; the second part is what I am currently having trouble with. We are standing up a new cluster as part of our move from RHEL 7 to RHEL 8. We have been using Condor 8.8 on the RHEL 7 system and are using Condor 25.4 on the RHEL 8 system.

 

The few users here who are or will be using the cluster have a history of not specifying a request_memory value in the Condor submit script, as we don’t necessarily know how much memory a job needs until it’s running. If nothing is done, the current default seems to be 128 MB. I wanted to change this so that the default was roughly the RAM on the execute machine divided by the number of CPUs. I eventually came up with the following:

 

MODIFY_REQUEST_EXPR_REQUESTMEMORY=max({quantize(RequestMemory,128),6000})

 

Here the 6000 varied from machine to machine (as we have machines with varying amounts of memory) and approximately represented the total memory of the machine divided by the number of CPU cores we intended to use. For the new system I was going to do roughly the same thing, but was looking for a way to avoid working out each number by hand. The following seemed to work:

 

MODIFY_REQUEST_EXPR_REQUESTMEMORY=max({quantize(RequestMemory,128),TotalMemory/$(NUM_CPUS)})

 

Then I had the bright idea of using some fraction of the total memory, say 95%. So I tried 0.95*TotalMemory/$(NUM_CPUS), but that didn’t work. Thinking there was some issue with floating point values, I tried a couple of variations such as 95*TotalMemory/(100*$(NUM_CPUS)), but none of them worked either. Eventually I used the following, which does seem to work:

 

MODIFY_REQUEST_EXPR_REQUESTMEMORY=max({quantize(RequestMemory,128),int(0.95*TotalMemory)/$(NUM_CPUS)})
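To make the arithmetic concrete, here is a small Python model of what I understand that expression to compute. It is only a sketch, not HTCondor code: it assumes quantize() rounds up to the next multiple of its second argument and that dividing one integer by another truncates, and the function and parameter names are my own.

```python
import math


def quantize(value, step):
    """Round value up to the next integer multiple of step (assumed behavior)."""
    return math.ceil(value / step) * step


def effective_request_memory(request_mb, total_memory_mb, num_cpus, fraction=0.95):
    """Model of max({quantize(RequestMemory,128), int(0.95*TotalMemory)/NUM_CPUS})."""
    # int() truncates the scaled total; integer division is assumed to truncate too.
    per_core_floor = int(fraction * total_memory_mb) // num_cpus
    return max(quantize(request_mb, 128), per_core_floor)


# Hypothetical machine: 65536 MB of RAM, 8 cores.
print(effective_request_memory(128, 65536, 8))    # small request is raised to the per-core share
print(effective_request_memory(10000, 65536, 8))  # large request is only rounded up to a multiple of 128
```

With these made-up numbers the per-core share is 7782 MB, so a 128 MB request becomes 7782 MB while a 10000 MB request becomes 10112 MB.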

 

So my first question is: what is the correct format for floating point calculations in an expression like this?

 

Now, while this mostly does what I intended, I see a drawback. Let’s say a user is being diligent and specifies a memory amount that covers their jobs, and this amount is smaller than TotalMemory divided by the number of CPUs. Each such job would still be raised to the larger amount, so several of them together would reduce the memory available to other jobs that might need more. It seems I need some type of IF expression that could determine whether request_memory was actually specified, use that value if it was, and otherwise use the default amount. This MODIFY request is, I think, evaluated on the machine the job is about to run on, and I’m not sure whether IF expressions are considered at this later point in time. Is there a way this can be accomplished?
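To illustrate the difference I am after, here is a rough Python sketch comparing the two policies. It is not HTCondor configuration: the function names, the 7782 MB per-core share, and the 2000 MB request are made-up values for illustration, and quantizing is left out to keep it short.

```python
from typing import Optional


def floor_policy(request_mb: Optional[int], per_core_mb: int, default_mb: int = 128) -> int:
    """What the expression does now: every request is raised to at least the per-core share."""
    requested = default_mb if request_mb is None else request_mb
    return max(requested, per_core_mb)


def default_only_policy(request_mb: Optional[int], per_core_mb: int) -> int:
    """What I would like: honor an explicit request, fall back to the per-core share otherwise."""
    return per_core_mb if request_mb is None else request_mb


# Hypothetical case: four jobs that each explicitly ask for 2000 MB,
# on a machine whose per-core share works out to 7782 MB.
per_core_mb = 7782
requests = [2000, 2000, 2000, 2000]

print(sum(floor_policy(r, per_core_mb) for r in requests))         # memory claimed today
print(sum(default_only_policy(r, per_core_mb) for r in requests))  # memory claimed under the desired policy
```

Under the current expression those four jobs claim 31128 MB between them; if the explicit request were honored they would claim 8000 MB, leaving the rest for jobs that need more.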

 

