
Re: [HTCondor-users] Setting default memory amount on jobs that are about to start



I believe I am trying to do this on the EP (execution point) side because our pool consists of machines with widely varying amounts of memory (16 GB to 1 TB), and I just want the default, wherever the job ends up, to divide each machine's slots up roughly evenly (as was done before partitionable slots became the default). However, I definitely want the flexibility of partitionable slots. Obviously the process works better if people specify a memory request in their condor submit files. (So I would be curious how that can be checked when the job is submitted.) Are you saying there is a way, on the submit side, to calculate a default amount of memory, after submission and during negotiation, for each machine the job is being considered to execute on?

Douglas

-----Original Message-----

----------------------------------------------------------------------

Message: 1
Date: Fri, 6 Feb 2026 14:48:04 +0000
From: Cole Bollig <cabollig@xxxxxxxx>
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Setting default memory amount on jobs that are about to start

Hi Douglas,

ClassAd expression arithmetic should be similar to that in the C programming language. That being said, I have a few points/questions:

  1.
Is this behavior something you want on the EP side rather than the AP side? The AP has mechanisms to make the schedd warn or fail at submit time when a job does not set the request attribute(s), and also to set the default resource request values to whatever value or ClassAd expression you like. The CHTC local pool used to do the former but now does the latter. I bring this up because manipulating the request on the EP side has no effect on negotiation, which can lead to a job matching with 128 MB of memory but then failing to create the dynamic slot because the modified request exceeds the memory actually available. If this were done on the AP side, the system could match jobs to EPs based on the resources actually being requested. (I have sketched the rough shape of this below, after these points.)
  2.
There is a classad_eval tool for hand-testing expressions, which could help you iterate more quickly. Note that your current expression refers to a configuration option ($(NUM_CPUS)), but you can do clever things on the command line. Here is an example command I executed based on your current expression:
     *
classad_eval '[ TotalMemory = 10000; RequestMemory = 128; ]' "max({quantize(RequestMemory,128),int(0.95*TotalMemory)/$(condor_config_val NUM_CPUS)})"
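
On your floating point question: as in C, dividing two integers in a ClassAd expression gives an integer, while an expression containing a real (like 0.95) evaluates to a real. You can check this yourself with made-up numbers, e.g. (same style of command as above; try it, I have not pasted output here):

classad_eval '[ TotalMemory = 10000; ]' "TotalMemory/8"
classad_eval '[ TotalMemory = 10000; ]' "0.95*TotalMemory/8"
classad_eval '[ TotalMemory = 10000; ]' "int(0.95*TotalMemory)/8"

My guess is that the bare 0.95 version failed because the result was a real rather than an integer, and that the int() wrapper is what fixed it, but I have not verified that against your exact configuration.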

Note that if you truly want to do the control/manipulation on the EP side then I have some more points, but I would argue for doing this under AP control.
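
To make the AP-side option in point 1 more concrete, here is its rough shape. The knob names are the ones I believe apply (JOB_DEFAULT_REQUESTMEMORY for the default, SUBMIT_WARNING_NAMES / SUBMIT_REQUIREMENT_NAMES for warn-or-fail at submit); the value below is a placeholder and I have not tested this exact snippet, so check it against the manual:

# Schedd (AP) configuration -- rough sketch, value is a placeholder.
# Replaces the built-in 128 MB default for jobs that omit request_memory;
# the ifThenElse keeps the usual behavior of reusing MemoryUsage when a
# job runs again.
JOB_DEFAULT_REQUESTMEMORY = ifThenElse(MemoryUsage =!= undefined, MemoryUsage, 2048)

The warn-or-fail behavior instead uses the SUBMIT_WARNING_NAMES / SUBMIT_REQUIREMENT_NAMES machinery; the right expression there for "the user did not set request_memory" depends on how the default above gets applied, so I will not guess at it here.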

Cheers,
Cole Bollig
________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Vechinski, Douglas
Sent: Thursday, February 5, 2026 11:29 AM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Setting default memory amount on jobs that are about to start




There are two parts to this question. I have actually more or less solved the first part, but I am looking for more information; the second part is what I am currently having trouble with. We are standing up a new cluster as part of our move from RHEL 7 to RHEL 8. We have been using HTCondor 8.8 on the RHEL 7 systems and are using HTCondor 25.4 on the RHEL 8 system.



The few users here that are/will be using the cluster have a history of not specifying a request_memory value in the condor submit file, as we don't necessarily know how much memory a job needs until it is running. If nothing is done, the current default seems to be 128 MB. I wanted to change this so that the default was roughly the RAM on the execute machine divided by the number of CPUs. I eventually came up with the following:



MODIFY_REQUEST_EXPR_REQUESTMEMORY=max({quantize(RequestMemory,128),6000})



where the 6000 varied depending upon the machine (as we have machines with varying amounts of memory) and approximately represented the total memory of the machine divided by the number of CPU cores we were intending to use. For the new system I was going to do roughly the same thing, but was looking for a way to avoid having to figure out each number by hand. The following seemed to work:



MODIFY_REQUEST_EXPR_REQUESTMEMORY=max({quantize(RequestMemory,128),TotalMemory/$(NUM_CPUS)})



Then I had the bright idea that I wanted to use some fraction of the total memory, say 95%. So I tried using 0.95*TotalMemory/$(NUM_CPUS), but that didn't work. I tried a couple of variations, such as 95*TotalMemory/(100*$(NUM_CPUS)), thinking there was some issue with floating point values, but they all failed. Eventually I used the following, which does seem to work:



MODIFY_REQUEST_EXPR_REQUESTMEMORY=max({quantize(RequestMemory,128),int(0.95*TotalMemory)/$(NUM_CPUS)})



So my first question is: what is the correct way to do floating point calculations in these expressions?



Now, while this mostly does what I intended, there is a drawback that I see. Let's say a user is being diligent and specifies a memory amount that covers their jobs, and that amount is smaller than the TotalMemory/#CPUs part. Then, because of the max(), each of those jobs would still be allocated the larger default amount, and over several jobs that would reduce the memory available for other jobs that might need more. It seems like I need some type of IF expression that could determine whether a request_memory was actually specified, use it if it was, and otherwise fall back to the default amount. This MODIFY request, I think, is checked on the machine the job is about to run on, and I'm not sure whether IF expressions are even considered at that later point in time. Is there a way this can be accomplished?
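
In other words, I am imagining something with roughly this shape, where the test in angle brackets is the part I do not know how to write (purely illustrative, not something I have tried):

MODIFY_REQUEST_EXPR_REQUESTMEMORY = ifThenElse(<request_memory was actually specified>, quantize(RequestMemory,128), int(0.95*TotalMemory)/$(NUM_CPUS))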



