Hi Tom,

many thanks for the confirmation :) TBH, we would probably like to have our cake and eat it too, i.e., have a somewhat hard limit while still being lenient towards our users... We will probably play a bit with the cgroup values and see how the system evolves.
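Concretely, something like the following is what we have in mind - a minimal sketch of the quota approach Tom describes below, assuming cgroup v1 with the cpu controller mounted at /sys/fs/cgroup/cpu and the parent cgroup named "htcondor" (the paths and the 8-core cap are only illustrative):

  # Hard-cap everything under the htcondor cgroup; the cap is
  # cpu.cfs_quota_us / cpu.cfs_period_us core-equivalents.
  CG=/sys/fs/cgroup/cpu/htcondor         # adjust to the actual mount point
  echo 100000 > $CG/cpu.cfs_period_us    # 100 ms accounting period
  echo 800000 > $CG/cpu.cfs_quota_us     # 800000/100000 = 8 core-equivalents
  # echo -1 > $CG/cpu.cfs_quota_us       # -1 removes the hard limit again

Inside that cap, the per-slot cpu.shares that HTCondor already sets would still arbitrate between the jobs, so individual jobs stay "lenient" with each other while the node as a whole cannot exceed the quota.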
Cheers and thanks,
Thomas

On 16/06/2021 17.44, tpdownes@xxxxxxxxx wrote:
Thomas:

You understand the cpu shares mechanism correctly. It's a soft limit with a policy for resolving conflict when conflict arises.

If you really want to nail down HTCondor jobs to a total number of cores, you want to use cpu.cfs_quota_us (and optionally cpu.cfs_period_us) on the parent htcondor cgroup. This is an honest-to-goodness hard limit on CPU usage that works in parallel with the shares mechanism.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/sec-cpu

Short version: to assign 1 core to the cgroup, set the quota to 1000000.

Within the htcondor cgroup, shares will be enforced by HTCondor, but the overall limit will be applied at the parent level.

Tom

On Wed, Jun 16, 2021 at 10:20 AM Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:

Hi all,

a short question regarding job core-time scaling via cgroup cpu.shares: the relative share of a job's cgroup is only limiting with respect to the total core-scaled CPU time, or?

I.e., we are running our nodes with hyperthreading (2x) enabled for simplicity, since we use the same machines for production jobs as well as for user job sub-clusters. Since users occasionally have odd jobs (that tend to work better without overbooking), we broker only half of the HT core count for jobs on the user nodes.

Now, the condor parent cgroup has been assigned

  htcondor/cpu.shares = 1024

with respect to the total system share of

  cpu.shares = 1024

so all condor child processes (without further sub-groups) could in principle use up to 100% of the total HT-core-scaled CPU time. A single-core job gets a relative share like

  htcondor/condor_var_lib_condor_execute_slot2_15@xxxxxxxxxxxxxxx/cpu.shares = 100

where we broker only 50% of the total HT-core-scaled time - as far as I can see. However, user jobs can utilize more than their nominally assigned CPU share.

My understanding is that the kernel notices that the total CPU time is not fully utilized and thus allows processes to use more than their nominal share as long as there is still CPU time available. Is this correct?

When we scale the condor parent cgroup to a reasonable fraction of the system cpu.shares (taking HT efficiency into account), we should be able to scale CPU times per job to (roughly) core-equivalents - without the need to bind jobs to specific cores, or?

Cheers,
  Thomas
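To make the shares arithmetic above concrete, a minimal sketch for inspecting the cgroup v1 weights involved, assuming the cpu controller at /sys/fs/cgroup/cpu as in the paths quoted above (the slot cgroup name is illustrative):

  # Relative weights: the parent htcondor cgroup versus its top-level
  # siblings, and the individual slot cgroups inside it.
  cat /sys/fs/cgroup/cpu/htcondor/cpu.shares
  cat /sys/fs/cgroup/cpu/htcondor/condor_var_lib_condor_execute_slot*/cpu.shares
  # Shares only bite under contention: a slot with shares=100 next to one
  # with shares=200 is held to a 1:2 split only while both want more CPU
  # than is free; otherwise the kernel hands out the idle cycles, which
  # matches the behaviour described above.
  # Cumulative CPU time (in ns) consumed by the whole pool, if the
  # cpuacct controller is co-mounted with cpu (common on most distros):
  cat /sys/fs/cgroup/cpu/htcondor/cpuacct.usage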