Re: [HTCondor-devel] cpu affinity and partitionable slots


Date:	Mon, 12 Nov 2012 13:16:58 -0600
From:	Brian Bockelman <bbockelm@xxxxxxxxxxx>
Subject:	Re: [HTCondor-devel] cpu affinity and partitionable slots

On Nov 12, 2012, at 9:11 AM, Tim St Clair <tstclair@xxxxxxxxxx> wrote:

> of note: current cpu_shares (which only exists on master) uses SlotWeight, where I think it should really be TotalSlotCpus.
> 

Matt called me out on the above also; using SlotWeight is probably more flexible, but it arguably overloads the meaning of SlotWeight.  I'm a bit ambivalent and would be fine with changing it.

> Open Questions:
> Does anyone have a good way of *really* testing cpu_shares? 
> What does over-subscription and fractions actually mean, or do we want to stick with whole numbers?  
> ++How does this^ affect performance?  
> 

I have no good way to *really* test them out (whatever that means), but we've used cpu_shares for the last 6 months and have anecdotal evidence.

The value has to be an integer.  It seems to matter only relative to the shares of sibling cgroups.  For example, /condor/ has cpu_shares=1024 (the default), but /condor/<job ID> has cpu_shares=100 for each job.
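To make the "relative to siblings" point concrete, here is a minimal sketch of how I understand the arithmetic: under contention, each sibling's fraction is its share divided by the sum of its siblings' shares. The job names and the 200-share job are made up for illustration; they are not taken from our configuration.

```python
def cpu_fractions(shares):
    """Map each sibling cgroup to its fraction of contended CPU time.

    shares: dict of cgroup name -> integer cpu_shares value.
    Only the ratio between siblings matters, not the absolute numbers.
    """
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Hypothetical siblings under /condor/ (names and values are assumptions).
jobs = {"job_1": 100, "job_2": 100, "job_3": 200}
fractions = cpu_fractions(jobs)

# Scaling every sibling by the same factor should change nothing.
scaled = cpu_fractions({name: value * 10 for name, value in jobs.items()})
```

This is why the parent's 1024 and the children's 100 never get compared with each other: they sit at different levels of the hierarchy.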

We've had a few cases where someone would send a multicore job but only request 1 CPU.  In this case, we've verified:
1) If the system is busy, the multicore job gets only 1 core worth of CPU time (the amount allocated).
2) If cycles would otherwise go unused, the multicore job gets those.
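The two observations above match what a simple work-conserving, weighted-share model predicts. The sketch below is only that model, not the kernel's actual scheduler algorithm; the function name, job names, and demand figures are assumptions chosen for illustration.

```python
def allocate(cores, jobs):
    """Weighted, work-conserving CPU allocation (simplified model).

    jobs: dict of name -> (cpu_shares, demand_in_cores).
    Returns dict of name -> cores granted.
    """
    grant = {name: 0.0 for name in jobs}
    active = set(jobs)
    remaining = float(cores)
    while active and remaining > 1e-9:
        total_shares = sum(jobs[name][0] for name in active)
        satisfied = set()
        used = 0.0
        for name in active:
            # Offer each job its weighted slice of what is still free.
            offer = remaining * jobs[name][0] / total_shares
            need = jobs[name][1] - grant[name]
            take = min(offer, need)
            grant[name] += take
            used += take
            if need <= offer:
                satisfied.add(name)
        remaining -= used
        if not satisfied:
            break  # every remaining job absorbed its full slice
        # Jobs that got all they wanted leave; leftovers are re-offered.
        active -= satisfied
    return grant


# Case 1: busy 4-core machine; a job using 4 threads requested 1 CPU.
busy = allocate(4, {
    "multicore": (100, 4.0),
    "a": (100, 1.0), "b": (100, 1.0), "c": (100, 1.0),
})

# Case 2: same machine with only one other single-core job running.
idle = allocate(4, {"multicore": (100, 4.0), "a": (100, 1.0)})
```

In the busy case the multicore job is held to one core; in the mostly idle case it picks up the cycles nobody else wants.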

So, it works as described.  That's the good news.

The bad news is that we have seen CPU-scheduler-related kernel panics on RHEL 6.3; while quite rare, I think they're cgroups-related.  Maybe once a week?

Brian

