Hello Christoph:

Thank you for taking the time to consider my problem and for your valuable comments. The approach that you mentioned (via the submit file, using ClassAds or another mechanism) is exactly what I wanted to achieve. What I do not know how to do is what you mention about creating one partitionable slot on the worker node(s). I do not know what a partitionable slot is, nor what it does, but I will go through the Condor manual. I assume that such a definition should be made by adding directives to the condor_config file of the given worker. Could you please provide an example that I could use as a starting point?

I appreciate your valuable comments; please feel free to make any further comments and suggestions.

Regards,
jjv

Julio J. Valdés
National Research Council Canada | Conseil National de Recherches Canada
Digital Technologies Research Centre | Centre de Recherche en Technologies Numériques
Data Science for Complex Systems Group | Science des Données pour les Systèmes Complexes
M-50, 1200 Montreal Road, Ottawa, Ontario K1A 0R6 | M-50, 1200 chemin Montréal, Ottawa, Ontario K1A 0R6
Canada | Canada
tel/tél: (1)613-993-0257

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Beyer, Christoph

***ATTENTION*** This email originated from outside of the NRC. ***ATTENTION*** Ce courriel provient de l'extérieur du CNRC

Hi Julio,

I think you are overthinking this a bit ;)
Condor can take care of most of the issues you describe through ClassAd matching. You just need to define the needs of the job in the submit file in terms of 'request_<resource>', e.g. request_memory, request_disk, request_cpus. On the worker node you can create one partitionable slot, and Condor will create child slots for each job according to the needs of the job. At the same time, Condor automatically keeps track of the resources on the worker nodes; e.g. once all the memory you gave to the partitionable slot is reserved by claimed slots, no additional jobs will start on that worker.

I may be wrong here though (and might have misunderstood you) - my wife assures me I often am ;)

Best
christoph
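
As a rough starting point for the partitionable-slot setup described above, something like the following in the worker node's local condor_config might work. This is only a sketch: the slot type number, the NUM_CPUS cap of 4 (chosen to match the "at most 4 jobs per machine" goal) and all values in the submit file are assumptions, not settings taken from this thread.

```
# Local configuration on the worker node (sketch; values are assumptions).
# Advertise at most 4 cores to HTCondor so the remaining cores stay free
# for other tasks on the machine.
NUM_CPUS = 4

# Define a single partitionable slot that owns all advertised resources.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
```

A matching submit description file could then state what each job needs. The file names and sizes here are placeholders:

```
# Submit description file (sketch; file names and sizes are placeholders).
executable     = my_job.sh
request_cpus   = 1
request_memory = 9 GB
request_disk   = 1 GB
log            = my_job.log
output         = my_job.$(Process).out
error          = my_job.$(Process).err
queue 8
```

With one core requested per job and four cores advertised, no more than four of the eight queued jobs should run on such a worker at once; the rest stay idle until a child slot is released. Changes to slot definitions most likely need a restart of the condor_startd rather than just a reconfig.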
From: "Valdes, Julio" <Julio.Valdes@xxxxxxxxxxxxxx>

Hello Greg:

The machines where the jobs would run have memory in the 40-96 GB range and, considering the smallest of these, each machine could run 4 jobs simultaneously (I have done these tests manually, so I know it for sure). That is one of the reasons why I want to control the number of jobs assigned to each machine. If more than 4 jobs are allocated simultaneously, the machines with the least memory would be maxed out. The other reason is that if all cores of a given machine are running jobs, the machine would be incapable of doing other tasks, or would do them at a very
low speed. Disk space is not an issue at all, as the machines have large disks, more than sufficient to store the files generated by the jobs.

An additional question I have is whether it is possible, when submitting a job to a given machine, to specify the slot that should run the job. I do not know if Condor allows that kind of control and, if it does, how to write the submit file to achieve such behavior.

Thank you for considering my problem, and do not hesitate to ask any questions that would help you understand the situation. I very much appreciate that you have taken the time to consider the problem.

Sincerely,
Julio J. Valdés

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg Thain

On 9/1/21 2:20 PM, Valdes, Julio wrote:
Can we get a few more details about your requirements? e.g. Do you want only (I assume "at most") 4 jobs from any user of any kind of job to ever run at the same time on machine A? What are the cpu & memory requirements for these
jobs? -greg