On Mon, 2021-10-11 at 19:15 +0200, luis.fernandez.alvarez@xxxxxxx wrote:
Hello Valerio,
I am in the process of setting up a
configuration similar to what you are looking for.
Just to confirm we're talking about the
same scenario, this is what I plan to deploy:
- One partitionable slot per NUMA node (on our machines with
2 NUMA nodes, I will set resources to 50%).
- Each partitionable slot can define its own base cgroup, so I
will define htcondor_numa0 and htcondor_numa1 and assign one to
each slot (rough sketch below).
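
As a rough sketch, the slot side of the condor_config could look
something like this. The 50% split matches a 2-node box; note that the
per-slot base-cgroup knob name below is a guess on my part, so verify
the exact name against the manual for your HTCondor version:

# One partitionable slot per NUMA node, each with half the machine
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=50%, mem=50%
SLOT_TYPE_1_PARTITIONABLE = True

NUM_SLOTS_TYPE_2 = 1
SLOT_TYPE_2 = cpus=50%, mem=50%
SLOT_TYPE_2_PARTITIONABLE = True

# Hypothetical per-slot variant of the BASE_CGROUP knob -- check
# the manual for the exact name supported by your version
SLOT_TYPE_1_BASE_CGROUP = htcondor_numa0
SLOT_TYPE_2_BASE_CGROUP = htcondor_numa1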
Then, how to handle the NUMA nodes? In my case I am running
CentOS 7, where systemd is the encouraged way to manage cgroups. It
doesn't expose all the cpuset options, so it's going to be a bit hacky:
- Create a couple of root systemd slices (htcondor_numa0,
htcondor_numa1).
- Create a companion service bound to each slice
(htcondor_numa0-config.service).
- This service is in charge of writing the cpuset.cpus &
cpuset.mems values into the cgroup.
- Finally, I add an extra dependency in systemd to ensure that
the slices are enabled and running before the condor service
starts (unit sketch below).
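
Concretely, the unit files could look something like this. The names,
CPU range, and memory node are illustrative (check lscpu for your real
topology); on CentOS 7 the cpuset controller is cgroup v1 and not
managed by systemd, hence the oneshot service writing the masks by
hand:

# /etc/systemd/system/htcondor_numa0.slice
[Unit]
Description=Parent cgroup for HTCondor slots on NUMA node 0
Before=condor.service

# /etc/systemd/system/htcondor_numa0-config.service
[Unit]
Description=Set cpuset.cpus/cpuset.mems for htcondor_numa0
After=htcondor_numa0.slice
Before=condor.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Adjust the CPU range and memory node to your topology
ExecStart=/bin/mkdir -p /sys/fs/cgroup/cpuset/htcondor_numa0
ExecStart=/bin/sh -c 'echo 0-15 > /sys/fs/cgroup/cpuset/htcondor_numa0/cpuset.cpus'
ExecStart=/bin/sh -c 'echo 0 > /sys/fs/cgroup/cpuset/htcondor_numa0/cpuset.mems'

[Install]
WantedBy=multi-user.target

The same pair would be repeated for htcondor_numa1 with the other half
of the cores and cpuset.mems set to 1.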
Once I have this in place (I am planning to test it this week), I
can share the final config defined in our cluster.
Cheers,
Luis
Hi Luis, did you manage to test this setup yet? And if so, can you tell me how it worked?
Thanks Valerio
This is an example of how to use cpusets with shell:
mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
cd /sys/fs/cgroup/cpuset
mkdir Charlie
cd Charlie
/bin/echo 2-3 > cpuset.cpus
/bin/echo 1 > cpuset.mems
/bin/echo $$ > tasks
sh
# The subshell 'sh' is now running in cpuset Charlie
# The next line should display '/Charlie'
cat /proc/self/cpuset
I need an example of how to use cpusets for HTCondor slots.
Thanks.
Valerio
Greg almost certainly wrote the thing. I must
have seen (and searched for) the old ENFORCE_CPU_AFFINITY and
SLOT<N>_CPU_AFFINITY settings I've used in the past.
That said, I only see that ASSIGN_CPU_AFFINITY
claims to pin jobs to cores, not that it will do so in a
NUMA-aware way.
Tom
I think it's worth adding to Greg's
response that I don't believe ASSIGN_CPU_AFFINITY
will, all on its own, map jobs to NUMA nodes. I
believe your best bet here is to create N
partitionable slots, with N equal to the number of
physical CPUs (i.e. NUMA nodes) you have. Then combine
ASSIGN_CPU_AFFINITY and SLOT<N>_CPU_AFFINITY
so that each slot is mapped to a NUMA node, as in the sketch below.
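
A sketch of that combination, assuming a hypothetical box with two
16-core NUMA nodes (cores 0-15 on node 0, 16-31 on node 1; check lscpu
for the real layout), and with the caveat raised elsewhere in this
thread that recent manuals describe ASSIGN_CPU_AFFINITY as replacing
the per-slot knob:

# Hypothetical layout: cores 0-15 = NUMA node 0, 16-31 = NUMA node 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=16
SLOT_TYPE_1_PARTITIONABLE = True
SLOT1_CPU_AFFINITY = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

NUM_SLOTS_TYPE_2 = 1
SLOT_TYPE_2 = cpus=16
SLOT_TYPE_2_PARTITIONABLE = True
SLOT2_CPU_AFFINITY = 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31

ASSIGN_CPU_AFFINITY = True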
If you want to polish the doorknob, you should
also look into setting cgroup cpusets for HTCondor,
so that it has "exclusive" access where it can and
the other top-level cgroups (e.g.
system.slice) do not have access to very many
cores. There obviously has to be some overlap
unless you're willing to reduce the number of cores
available to HTCondor.
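
For the system.slice side, a sketch; note the AllowedCPUs property
only exists on cgroup-v2 hosts with systemd >= 244, so on CentOS 7 you
would fall back to writing cpuset.cpus by hand, as in the Charlie
example earlier in the thread:

# Confine everything outside HTCondor to cores 0-1 (illustrative range)
systemctl set-property --runtime system.slice AllowedCPUs=0-1
systemctl set-property --runtime user.slice AllowedCPUs=0-1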
PS: you should consider reducing the CPU quota
available to HTCondor in its cgroup. It's always
good to have 0.25 of a core available for ssh!
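
systemd's CPUQuota= is expressed in percent of a single core, so on a
hypothetical 32-core box, capping the condor service (unit name may
differ on your install) a quarter core short of the machine would be:

# 32 cores = 3200%; leave 25% (a quarter core) for ssh and friends
systemctl set-property condor.service CPUQuota=3175%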
Tom
The 9.2.0 manual says that ASSIGN_CPU_AFFINITY
replaces both ENFORCE_CPU_AFFINITY and
SLOT<N>_CPU_AFFINITY.
Is SLOT<N>_CPU_AFFINITY still a valid
configuration variable?
Thanks,
Valerio
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
with a subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/