> > This subject has been previously visited (see subject: 'DAGMAN slow
> > startup'), but I was hoping somebody might have some more insight. I
> > submit dependent jobs via the condor DAG submit, and I'm finding that
> > there is a delay between when the condor_dagman starts running and
> > submits the first job in my DAG and when that job actually gets farmed
> > out to one of the machines in my network. The delay is actually
> > significant. Anywhere between 2 to 5 minutes. On the odd occasion, it
> > will start up almost immediately, so I'm assuming its related to waiting
> > for a reschedule event or something and is kind of luck of the draw.
> >
> > When I submit any of these jobs with a plain ol' condor_submit, it
> > finds a dance partner pretty quickly and starts running. It seems to
> > only be when dagman submits a job. I don't know the underlying logic
> > behind these calls, so I don't know if that makes any sense to those of
> > you who are developing for Condor.
>
> This is kind of strange. There's really no significant difference between
> DAGMan submitting a job versus manually running condor_submit (DAGMan
> actually runs condor_submit to submit each job). Especially if you are
> actually seeing the jobs in the queue, but they are not running, it seems
> unlikely that DAGMan itself has much to do with this. I wonder if the
> problem has something to do with the *pattern* of submits when you run
> DAGMan as opposed to submitting jobs manually. I'm not a real expert on
> the negotiation cycle, but that's kind of an initial guess.
>
> Kent Wenger
> Condor Team

Thanks for the response, Kent. I'm hoping there's some sort of difference between doing a condor_submit and dagman doing a submit :|... there's gotta be! :) I'm curious if other people experience this delay or if it's just me.
I can take the simplest Hello World job and condor_submit it, and it takes seconds to turn around, run, and finish every time. If I then take that same job, create a one-line DAG file (containing only: "job 1 C:\condor\jobs\helloworld.cmd"), and condor_submit_dag it, it takes anywhere from 5 to 10 minutes to complete :|. If it's just me, then there's gotta be some configuration that I'm not using (or am currently abusing).

Offhand, is there a simple way to speed up negotiation? Maybe a way to force rescheduling to happen more often?

*** A little more information: it seems that the job sits idle, with open machines available, under the status 'match but reject the job for unknown reasons'. Doing a condor_reschedule always unbungles it. I guess that's just the bucket when there's no other valid status for a given job? ***

Thanks,
Steve