On Monday, 21 May, 2012 at 6:36 AM, Ian Cottam wrote:
On 18/05/2012 20:04, "Dan Bradley" <dan@xxxxxxxxxxxx> wrote:

> On 5/18/12 12:19 PM, Ian Cottam wrote:
>
> > We are thinking about updating to 7.8.0.
> > I noticed that there is, with 7.8, a defrag daemon for dynamic slots.
> > On our (main) pool we have preemption off anyway: am I right in thinking
> > that this defrag then is not for us?
>
> Defragmentation is desirable when jobs requiring large slots (e.g. many
> cores or big memory) suffer from starvation (rarely or never getting
> scheduled to run) due to fragmented machines. Machines become
> fragmented when they are partitioned into small slots to fit small
> jobs. If many small jobs are running on a machine at the same time, the
> chance is small that they will all exit at the same time, freeing up a
> large chunk of resources for large jobs to use. The Condor negotiator's
> resource allocation algorithm currently just works with the slots that
> exist. It does not make reservations or preempt multiple slots, so some
> method of defragmenting machines is needed to avoid the problem of
> starvation of large jobs.
>
> Defragmentation can cause jobs to be killed. If you do not want that,
> MaxJobRetirementTime can be used to specify how long jobs should be
> allowed to run on machines that are being drained.
>
> > I only ask because sometimes (with 7.4/7.6) and dynamic slots we see
> > partial matches that don't go through and wondered if there was
> > something in 7.8 that helps with this.
>
> If by "partial matches that don't go through" you mean the starvation
> problem I mentioned above, then condor_defrag can help. If it is some
> other problem, then it may or may not.
>
> --Dan

What we have is jobs that Match but never start.
We have just demonstrated that if we move the Memory requirement from the
Requirements line to a Request_memory=n line, they work.
We are not entirely sure why.

-Ian
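
The starvation problem Dan describes can be illustrated with a small toy model. This is only a sketch and not Condor code: the function name, the 8-core machine, the one-core small jobs, and the fixed small-job runtime are all assumptions made for illustration. It models a single machine whose free cores are refilled by small jobs unless the machine is draining, and it lets running jobs finish rather than killing them, which is roughly the situation a sufficiently long MaxJobRetirementTime is meant to allow.

```python
# Toy model (hypothetical, not HTCondor code): one machine shared by 1-core jobs.

def first_time_large_job_fits(finish_times, total_cores, large_cores,
                              small_runtime, draining, horizon):
    """Return the first time step at which `large_cores` cores are free,
    or None if that never happens within `horizon` steps.

    finish_times: time at which each currently running 1-core job ends.
    draining:     if True, freed cores are held back instead of being
                  refilled with new small jobs.
    """
    running = list(finish_times)
    for t in range(horizon + 1):
        # Jobs whose finish time has arrived release their core.
        running = [f for f in running if f > t]
        free = total_cores - len(running)
        if free >= large_cores:
            return t
        if not draining:
            # A backlog of small jobs immediately refills every free core.
            running.extend([t + small_runtime] * free)
    return None


# Eight small jobs with staggered finish times on an 8-core machine.
staggered = [1, 2, 3, 4, 5, 6, 7, 8]

without_drain = first_time_large_job_fits(
    staggered, total_cores=8, large_cores=8,
    small_runtime=8, draining=False, horizon=100)

with_drain = first_time_large_job_fits(
    staggered, total_cores=8, large_cores=8,
    small_runtime=8, draining=True, horizon=100)

print("without draining:", without_drain)
print("with draining:", with_drain)
```

In this model the 8-core job never fits without draining, because only one core is free at any step and it is refilled at once. With draining, the whole machine becomes free as soon as the last running small job ends.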