Re: [HTCondor-users] How to find out CPU affinity from schedd plugin
- Date: Tue, 14 Feb 2023 09:21:48 +0100
- From: Joachim Meyer <jmeyer@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] How to find out CPU affinity from schedd plugin
Hi Greg,
Oops, I completely forgot to mention that we looked into ASSIGN_CPU_AFFINITY.
My understanding is that, given one partitionable slot spanning 100% of the
CPU's threads, a job's threads will be pinned to the first N free cores in
increasing ID order:
https://github.com/htcondor/htcondor/blob/8bcd2442756564ebbfdb6955fb16483806fda236/src/condor_startd.V6/ResMgr.cpp#L1777
We could perhaps model this in the schedd, but that seems fragile (especially
when there are two schedds). Instead, it would be ideal if the schedd received
the list of assigned CPU cores as reported by the startd when a job starts.
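For reference, my mental model of that assignment logic, just a sketch of what I read out of the linked ResMgr.cpp, not the actual HTCondor code, is roughly:

```python
def assign_cores(free_cores, n):
    """Model of the assumed startd behavior: a job requesting n cores
    gets the n lowest-numbered free core IDs, in increasing order."""
    chosen = sorted(free_cores)[:n]
    if len(chosen) < n:
        raise ValueError("not enough free cores")
    return chosen
```

So with cores {1, 3, 5, 7} free, a 2-core job would land on cores 1 and 3 under this model.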
Now, regarding why we're considering pinning jobs to cores:
1. The pressing reason is monitoring: we want to provide per-job performance
counter monitoring, i.e. as far as possible report only the performance
counter values (e.g. FLOPs/s) for the job in question, rather than the whole
system's FLOPs/s, core frequency, user CPU usage, and so on.
ClusterCockpit just needs to know which CPU threads were assigned to a job in
order to filter this. Other metrics, such as memory bandwidth, can only be
measured per CPU socket, but again the relevant socket(s) are identified via
the assigned CPU threads.
I assume that user CPU usage, at least, can easily be measured at the
per-process level (and HTCondor already does this), but as far as I can tell,
the actual performance counter metrics such as FLOPs and memory bandwidth
cannot be filtered by process, only by CPU thread.
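To illustrate what the monitoring side would need: a small sketch (my own hypothetical helper, not ClusterCockpit or HTCondor code) that reads the CPU threads a job process is pinned to from Linux's /proc, which is exactly the information we'd then use to filter the per-CPU counters:

```python
def parse_cpu_list(cpu_list):
    """Parse a Linux cpulist string like '0-3,8,10-11' into a sorted list."""
    cpus = []
    for part in cpu_list.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)

def job_cpus(pid):
    """Read the pinned CPU threads of a process from /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1].strip())
    return None
```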
2. But there is another reason besides cleaner job-specific monitoring.
Pinning CPU cores might lower overall CPU usage, since jobs can no longer use
excess cores when the node's utilization allows it, and users will probably
have to overestimate their CPU needs. On the other hand, we currently have
users who regularly and greatly underestimate their job's CPU demands: they
request only 5 CPUs, but when the node is otherwise idle, a multiple of the
requested CPUs is actually utilized. That's fine in theory, because it means
the node will be fully utilized once multiple jobs are scheduled to it. But it
also leads to confusion: users are unhappy that their job's performance drops
when multiple (of their) jobs land on the same machine, because several jobs
using more cores than they requested cause the cgroup limiting to kick in.
Long story short: we find CPU pinning attractive as well, since users will get
much more predictable performance independent of the cluster's occupancy.
As our jobs usually are rather GPU-bound, we're also not as concerned with
getting the last bit of utilization out of the CPUs.
So, from our point of view, we're quite okay with (and actually see benefits
in) pinning jobs to cores, at least in theory.
Our current blocker: to make this useful for monitoring, we need to know which
CPUs a job was pinned to.
In the worst case, could we use a startd plugin (or similar) to access the
affinity mask?
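If we went that route, something as simple as Python's os.sched_getaffinity might already suffice on the startd side. A sketch; how such a plugin or hook would actually obtain the job's PID is exactly the open question here:

```python
import os

def affinity_of(pid=0):
    """Return the set of CPU IDs the given process may run on.

    pid=0 means the calling process itself; for a job we would need
    the starter/job PID from whatever plugin interface is available
    (hypothetical -- this is the part I'm unsure HTCondor exposes)."""
    return os.sched_getaffinity(pid)
```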
Thanks for any insights!
Best,
- Joachim
P.S.:
In the future, we'll probably need to consider NUMA-aware pinning, since it
could hurt performance considerably if we pin a job to CPU cores on different
NUMA nodes while there are still enough free cores to place it on a single
one. The penalty could also be quite notable if we pin our jobs to one socket
and assign the job a GPU that is attached to the other socket.
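For the record, mapping assigned CPUs to their NUMA nodes could presumably be done via sysfs. A rough sketch, assuming the usual Linux /sys/devices/system/node layout (on a machine without that layout it just returns an empty set):

```python
import glob
import os

def numa_nodes_of(cpus):
    """Return the set of NUMA node IDs that the given CPU IDs belong to,
    by scanning the cpu<M> entries under /sys/devices/system/node/node<N>."""
    nodes = set()
    for node_dir in glob.glob("/sys/devices/system/node/node[0-9]*"):
        node = int(os.path.basename(node_dir)[len("node"):])
        node_cpus = {int(os.path.basename(p)[len("cpu"):])
                     for p in glob.glob(node_dir + "/cpu[0-9]*")}
        if node_cpus & set(cpus):
            nodes.add(node)
    return nodes
```

If a job's CPU set maps to more than one node here, we'd know the pinning crossed a NUMA boundary.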
On Monday, 13 February 2023, 19:24:48 CET, Greg Thain via HTCondor-users
wrote:
> On 2/13/23 07:46, Joachim Meyer wrote:
> > Hi,
> >
> >
> > Is there a way to assign CPU cores and get the information which cores are
> > assigned from within a condor_schedd plugin? So far it seems to me that the
> > assigned CPU cores are not reported back to the schedd in some classad?
>
> Hi Joachim:
>
> There are controls in HTCondor to affinity-lock jobs to cpu cores -- see
> the ASSIGN_CPU_AFFINITY setting in
> https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html?highlight=ASSIGN_CPU_AFFINITY#condor-starter-configuration-file-entries
>
> But, there is a reason this is not on by default. While locking jobs to
> specific cores may sometimes improve performance for that job, it can
> also lower the overall throughput of the system.
>
> What are the specific performance counts per job that you are interested
> in? I wonder if there is a better way to more directly capture these?
>
> -greg
>
> > We're using HTCondor 9.12 right now.
> >
> > Thanks for any pointers!
> > - Joachim
> >
> >
> > _______________________________________________
> > HTCondor-users mailing list
> > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/htcondor-users/
>