Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] Running short-lived jobs on Condor

Date: Thu, 18 Jun 2015 18:23:03 +0000
From: "Rowe, Thomas" <rowet@xxxxxxxxxx>
Subject: Re: [HTCondor-users] Running short-lived jobs on Condor

The simulation replications I'm running can take anywhere between thirty seconds and three days. I have 80 slots on this network. Everything is great if runtimes are up towards a half hour: all slots are kept busy grinding away. But if I submit five hundred jobs that take 40 seconds each, I see at most about 30 slots put to use. The queue is often actually empty, and all slots sit idle for one-minute stretches, except for the one running the dagman job.
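A rough sketch of the arithmetic behind those numbers, assuming steady state and Little's law (busy slots = job start rate × runtime). The function names are illustrative only and are not part of HTCondor; the inputs are the figures quoted above.

```python
def required_start_rate(slots: int, runtime_s: float) -> float:
    """Job starts per second needed to keep every slot busy (Little's law)."""
    return slots / runtime_s


def implied_start_rate(busy_slots: int, runtime_s: float) -> float:
    """Job starts per second implied by an observed number of busy slots."""
    return busy_slots / runtime_s


if __name__ == "__main__":
    slots = 80          # slots on the network
    short_job_s = 40    # runtime of the short jobs, in seconds
    long_job_s = 1800   # "up towards a half hour"
    busy_observed = 30  # most slots seen in use with the short jobs

    print("starts/s needed, 40 s jobs:    ", required_start_rate(slots, short_job_s))
    print("starts/s needed, 30 min jobs:  ", required_start_rate(slots, long_job_s))
    print("starts/s implied by 30 in use: ", implied_start_rate(busy_observed, short_job_s))
```

With 40-second jobs the pool has to start 2 jobs per second to stay full, while 30 busy slots correspond to about 0.75 starts per second. Half-hour jobs need only about one start every 22 seconds, which would be consistent with the cluster staying busy in that case.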

I played around with the NEGOTIATOR_INTERVAL setting, dropping it down to 20 seconds, but that didn't seem to have much impact.

What can I do so that short-running jobs don't result in a mostly idle cluster? There are many *_INTERVAL settings, and it's not exactly obvious which knobs to turn. "HTCondor can't handle that case well" is a perfectly valid answer if that's true.

Same question nine years ago without clear answers: https://lists.cs.wisc.edu/archive/htcondor-users/2006-September/msg00255.shtml

Follow-Ups:
- Re: [HTCondor-users] Running short-lived jobs on Condor
  - From: Brian Bockelman
- Re: [HTCondor-users] Running short-lived jobs on Condor
  - From: Dimitri Maziuk

