Re: [Condor-users] DAGMAN delay between submit of job and scheduling of that job
- Date: Wed, 1 Oct 2008 14:08:13 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] DAGMAN delay between submit of job and scheduling of that job
On Wed, 1 Oct 2008, Steve Shaw wrote:
> This subject has been visited before (see subject: 'DAGMAN slow
> startup'), but I was hoping somebody might have some more insight. I
> submit dependent jobs via the Condor DAG submit, and I'm finding that
> there is a delay between when condor_dagman starts running and
> submits the first job in my DAG and when that job actually gets farmed
> out to one of the machines in my network. The delay is significant:
> anywhere from 2 to 5 minutes. On the odd occasion, it will start up
> almost immediately, so I'm assuming it's related to waiting for a
> reschedule event or something and is kind of luck of the draw.
> When I submit any of these jobs with a plain ol' condor_submit, it
> finds a dance partner pretty quickly and starts running. It seems to
> happen only when DAGMan submits a job. I don't know the underlying
> logic behind these calls, so I don't know if that makes any sense to
> those of you who are developing for Condor.
This is kind of strange. There's really no significant difference between
DAGMan submitting a job versus manually running condor_submit (DAGMan
actually runs condor_submit to submit each job). Especially if you are
actually seeing the jobs in the queue, but they are not running, it seems
unlikely that DAGMan itself has much to do with this. I wonder if the
problem has something to do with the *pattern* of submits when you run
DAGMan as opposed to submitting jobs manually. I'm not a real expert on
the negotiation cycle, but that's kind of an initial guess.
Kent Wenger
Condor Team
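One way to check whether the delay really differs between DAGMan-submitted and manually submitted jobs is to measure, per job, the time between the submit event and the execute event in the job's user log. The following is only a minimal sketch, not part of the original reply. It assumes the user log uses event headers of the form "000 (cluster.proc.subproc) MM/DD HH:MM:SS ..." with 000 meaning "submitted" and 001 meaning "executing". The function name, the sample log text, and the host addresses are made up for illustration, and the year has to be supplied because this assumed header format does not carry one.

```python
import re
from datetime import datetime

# Assumed event-header layout (an assumption, check it against your own log):
#   "<code> (<cluster>.<proc>.<subproc>) MM/DD HH:MM:SS <description>"
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.\d+\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
)

SUBMIT, EXECUTE = "000", "001"  # assumed event codes


def submit_to_execute_delays(log_text, year=2008):
    """Return {(cluster, proc): seconds from submit to first execute}."""
    submitted = {}
    delays = {}
    for line in log_text.splitlines():
        m = EVENT_RE.match(line)
        if not m:
            continue  # skip event bodies and "..." separators
        code, cluster, proc, stamp = m.groups()
        job = (int(cluster), int(proc))
        when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S")
        if code == SUBMIT:
            submitted[job] = when
        elif code == EXECUTE and job in submitted and job not in delays:
            # Only the first execute event counts toward the startup delay.
            delays[job] = (when - submitted[job]).total_seconds()
    return delays


# Hypothetical log excerpt, invented purely to exercise the parser.
SAMPLE_LOG = """\
000 (101.000.000) 10/01 14:00:00 Job submitted from host: <192.0.2.1:9618>
...
001 (101.000.000) 10/01 14:03:30 Job executing on host: <192.0.2.7:9618>
...
000 (102.000.000) 10/01 14:05:00 Job submitted from host: <192.0.2.1:9618>
...
001 (102.000.000) 10/01 14:05:10 Job executing on host: <192.0.2.8:9618>
...
"""

if __name__ == "__main__":
    for job, seconds in sorted(submit_to_execute_delays(SAMPLE_LOG).items()):
        print(f"job {job[0]}.{job[1]}: {seconds:.0f} s from submit to execute")
```

Running the same measurement over a log from a DAGMan run and a log from a few manual condor_submit runs would show whether the 2 to 5 minute gap is specific to DAGMan or follows the pattern of submits instead.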