
[HTCondor-users] Help needed understanding cpu core usage with cgroups



Hi

I've got cgroups configured on RedHat 6.6 (as per section 3.12.12 of the condor manual) and HTCondor (8.2.8) configured to use them. I'm now trying to understand how cgroups limit the number of cpu cores assigned to condor jobs.

In particular I'm trying to verify Option 2 at https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToLimitCpuUsage.

My test system has 4 cores (no hyperthreading). 2 cpus are allocated to condor using DETECTED_CORES=2, with a partitionable slot defined as:
SLOT_TYPE_1 = cpus=100%, ram=100%

The machine classad shows:
TotalCpus = 2.0
TotalSlotCpus = 2
Cpus = 2

I have a condor job with 5 threads (4 cpu bound) running with request_cpus = 2 in the submit file.

When I have 2 foreground (Owner) jobs running at 100% cpu, the condor job only gets the equivalent of 1 cpu between its threads.

I'm measuring this by looking at the aggregate nice cpu percentage in the output of the top program, which is 25% (the condor jobs are niced to 16 while the foreground jobs run at nice 0). This result is confirmed by the cpu percentages of the condor job threads, which add up to approx 100%, indicating that only one core's worth of cpu is being used.
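
For reference, the arithmetic behind both measurements can be sketched as below. This is only an illustration: the function names are made up for this example, and it assumes that top's summary line reports percentages relative to all 4 cores while the per-thread %CPU values are relative to a single core.

```python
def cores_from_aggregate(aggregate_percent, total_cores):
    """Convert a machine-wide percentage (top's summary line) to a core count."""
    return aggregate_percent / 100.0 * total_cores


def cores_from_thread_sum(thread_percent_sum):
    """Convert a sum of per-thread %CPU values (100% = one core) to a core count."""
    return thread_percent_sum / 100.0


TOTAL_CORES = 4          # cores in the test machine
NICE_PERCENT = 25.0      # aggregate "ni" figure reported by top
THREAD_SUM = 100.0       # approximate sum of the condor job's per-thread %CPU

print(cores_from_aggregate(NICE_PERCENT, TOTAL_CORES))
print(cores_from_thread_sum(THREAD_SUM))
```

Both figures come out at the equivalent of one core, which is why the two measurements agree.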

From the wiki page above, I was expecting that the condor job would access 2 cpus rather than 1 under these circumstances. Did I misunderstand something here?

One point that I'm not sure about is the first paragraph in Option 2. HTCondor is started as root (from init scripts; condor is installed from the condor repository rpm) but runs as the condor user. Does that count as "condor daemons being started as root"?

There is only one condor_startd.

Thanks

Roderick Johnstone