HTCondor Project List Archives




Re: [Condor-devel] Condor as a vm scheduler



On 01/12/2010 01:07 PM, Matt Hope wrote:
> Unfortunately that will not work well from an initially empty state
> on SMP machines. This is increasingly a problem with how the
> scheduler works (and the solutions can have a significant throughput
> impact as the dynamic partitioning on hefty boxes like Mag Gam's
> show)

Is the problem you raise only an issue when initially populating a pool? Anything that can aggregate slots together will work. As a bootstrap, you could assign a random number to each node and use it as a component of your RANK, so that all slots on one node tie with each other and outrank the slots on the next node.
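
To make the bootstrap idea concrete, here is a toy sketch in Python. It is not Condor code and it does not claim to model how the negotiator really works; the names (make_slots, rank, place_vms, node_random) are all made up for illustration. It assumes the slot ads are a stale snapshot for the whole cycle (running_vms never updates) and that every slot on a node advertises the same per-node random number, which is then added into the rank.

```python
import random


def make_slots(num_nodes, slots_per_node, rng):
    """Build one ad per slot; every slot on a node shares that node's random number."""
    slots = []
    for node in range(num_nodes):
        node_random = rng.random()  # hypothetical per-node attribute in [0, 1)
        for _ in range(slots_per_node):
            slots.append({
                "node": node,
                "running_vms": 0,  # stale snapshot: the pool starts empty
                "node_random": node_random,
                "claimed": False,
            })
    return slots


def rank(slot):
    """Larger is better: running VMs dominate, the random number breaks ties."""
    return slot["running_vms"] + slot["node_random"]


def place_vms(num_vms, slots):
    """Give each VM the best-ranked unclaimed slot without refreshing the ads."""
    placements = []
    for _ in range(num_vms):
        free = [s for s in slots if not s["claimed"]]
        if not free:
            break  # pool is full
        best = max(free, key=rank)
        best["claimed"] = True
        placements.append(best["node"])
    return placements


# Usage: 6 VMs onto 3 empty nodes with 4 slots each.
slots = make_slots(num_nodes=3, slots_per_node=4, rng=random.Random(0))
print(place_vms(6, slots))
```

Even though running_vms is zero everywhere, the shared per-node number means one node fills completely before the next one is touched, which is the packing behaviour you would otherwise only get once the pool has some state in it.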


> Personally I think this is something that needs some serious thought
> and attention in the design of the scheduler.

In my view, that pretty much embeds a specific mode of operation into the scheduler, which is something the default scheduler tries to do for only a very few things, fair-share being a notable example.

I say default scheduler, since I think Condor has a nice infrastructure in which different scheduling algorithms could be placed. The functionality to do that just isn't in place right now.


> How this works in a general sense is hard. I know certain attributes
> can be updated on the fly, others (like LoadAvg) cannot, but perhaps
> that means that those values which will prevent high throughput
> negotiation with consistent state should be marked as such and
> avoided (or if it's easier mark the ones that can be maintained).
> 
> If we moved to a dynamic model I may well decide to just use job
> hooks if they are stable enough and entirely bypass the scheduler
> since I can then control the performance characteristics and make use
> of internally calculated state rather than relying on state in the
> collector to 'catch up' with state that should be available (since it
> was decided just a moment ago in the same process).

It's great that you're at a point where you can fine-tune for throughput so much. When I last looked at the startd cron code, I wondered why it couldn't actively trigger an update. Would something like that get you closer to a collector with a more current view of the pool?


Forgive me, I forget: how many machines do you have in your pool(s)? Are a good portion of them Windows boxes?


Best,


matt

> Matt
> 
> -----Original Message-----
> From: condor-devel-bounces@xxxxxxxxxxx [mailto:condor-devel-bounces@xxxxxxxxxxx] On Behalf Of Matthew Farrellee
> Sent: 12 January 2010 17:58
> To: Stanislav Ievlev
> Cc: condor-devel@xxxxxxxxxxx
> Subject: Re: [Condor-devel] Condor as a vm scheduler
> 
> On 01/12/2010 02:15 AM, Stanislav Ievlev wrote:
>> Greetings!
>>
>> With the vm universe I can use Condor as a scheduler for virtual machines.
>>
>> Does Condor use one generic scheduling algorithm for all types
>> of universe?
>>
>> For example, OpenNebula's scheduler tries to pack the VMs onto the cluster
>> nodes to reduce VM fragmentation
>> (http://www.opennebula.org/doku.php?id=documentation:rel1.4:schg).
>>
>> Can I get the same functionality with Condor?
>>
>> --
>> With best regards
>> Stanislav Ievlev.
> 
> Condor's scheduling algorithm is flexible enough to enable things like packing, without having to swap out the algorithm itself.
> 
> For instance, packing is just a preference to match against machines that already have running VMs. In Condor, you can express this with RANK expressions.
> 
> http://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToSteerJobs
> 
> Best,
> 
> 
> matt
> _______________________________________________
> Condor-devel mailing list
> Condor-devel@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-devel
>