Re: [Condor-users] DAG questions
- Date: Mon, 21 Dec 2009 11:43:12 -0600 (CST)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] DAG questions
On Fri, 18 Dec 2009, Ian Stokes-Rees wrote:
> We're on 7.2.4 right now. We don't do upgrades on Fridays, but will upgrade
> to 7.4 on Monday.
>
> Thanks for the -output_dir and -debug pointers -- I read the DAG
> documentation, but not the command details (man page) and missed seeing them.
> I was expecting they'd be config file params.
>
> Now I'm more confused about my situation. I've set up a much smaller run with
> only 3500 nodes in the DAG, however I'm still getting PANIC messages due to
> lack of file descriptors. An identical submission with only 40 nodes works
> fine, so I feel that rules out my general configuration, and points to either
> an OS issue or a Condor config issue. I've completely stopped all Condor
> processes and restarted them.

Unless your DAG is really "wide" (most of the 3500 nodes in the queue at
one time), upgrading to 7.4 should fix your file descriptor issue. The
DAGMan log file reading code underwent some pretty major changes between
7.2 and 7.4. Now DAGMan holds an open file descriptor only for each log
file corresponding to a job that's actually in the queue. Before, DAGMan
opened all of the log files at the start and kept them open.
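
To make the difference concrete, here is a rough Python model of the two behaviors described above. It is only an illustration, not DAGMan's actual code: the function name, the (submit_time, finish_time, log_file) tuples, and the "10 jobs queued at a time" schedule are all made up for the example.

```python
def peak_open_logs(jobs, open_all_at_start):
    """Estimate the peak number of log files held open at once.

    jobs: list of (submit_time, finish_time, log_file) tuples, one per node job.
    This is a toy model of the behavior described in this message,
    not a reproduction of DAGMan internals.
    """
    if open_all_at_start:
        # 7.2-style: every distinct log file is opened up front and kept open.
        return len({log for _, _, log in jobs})

    # 7.4-style: a log file is open only while a job using it is in the queue.
    events = []
    for start, end, log in jobs:
        events.append((start, 1, log))   # job enters the queue
        events.append((end, -1, log))    # job leaves the queue
    # Sorting by (time, delta) handles departures before arrivals on ties.
    events.sort()

    users = {}   # log file -> number of queued jobs currently using it
    peak = 0
    for _, delta, log in events:
        users[log] = users.get(log, 0) + delta
        if users[log] == 0:
            del users[log]
        peak = max(peak, len(users))
    return peak


# Hypothetical narrow DAG: 3500 nodes, one log file each, 10 queued at a time.
jobs = [(i // 10, i // 10 + 1, "node%d.log" % i) for i in range(3500)]
print(peak_open_logs(jobs, open_all_at_start=True))    # 3500
print(peak_open_logs(jobs, open_all_at_start=False))   # 10
```

In this model a "wide" DAG (all nodes queued at once) gives the same peak either way, which is why the upgrade alone would not help in that case.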
(So for anyone else who runs into this problem and can't upgrade to 7.4,
the answer is to make your node jobs use a smaller set of log files, as
opposed to having a separate log file for each job.)
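
If you want to check how many distinct log files your DAG actually uses, something along these lines might help. It is a simplified sketch in Python, not a Condor tool: it only looks at "JOB name submit_file" lines in the DAG file and the first "log = ..." line in each submit file, and it does not expand macros or handle DIR. The function names and the read_submit callback are hypothetical.

```python
import re

# Simplified patterns; real DAG and submit files allow more than this.
JOB_LINE = re.compile(r"^\s*JOB\s+(\S+)\s+(\S+)", re.IGNORECASE)
LOG_LINE = re.compile(r"^\s*log\s*=\s*(\S+)", re.IGNORECASE)


def submit_files_in_dag(dag_text):
    """Map each node name to its submit file, based on JOB lines only."""
    nodes = {}
    for line in dag_text.splitlines():
        match = JOB_LINE.match(line)
        if match:
            nodes[match.group(1)] = match.group(2)
    return nodes


def log_file_of(submit_text):
    """Return the value of the first 'log = ...' line, or None if absent."""
    for line in submit_text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            return match.group(1)
    return None


def distinct_log_files(dag_text, read_submit):
    """Collect the set of log file names used by the DAG's node jobs.

    read_submit is a caller-supplied function mapping a submit file name
    to its text, e.g. lambda path: open(path).read().
    """
    logs = set()
    for submit_file in submit_files_in_dag(dag_text).values():
        log = log_file_of(read_submit(submit_file))
        if log is not None:
            logs.add(log)
    return logs
```

A large result (close to the number of nodes) suggests a pre-7.4 DAGMan will need roughly that many file descriptors; pointing the node jobs at a handful of shared log files brings the count down.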
If you're running a 7.4 DAGMan, a new feature is that you don't have to
specify a log file at all in your submit file -- if you don't, DAGMan
will assign a default log file for you. In fact, this may be the
preferred way to do things, especially if you want to re-use your submit
files in more than one DAG. The default log files are per-DAG, so with
the default log file feature you can use the same submit file in two
different DAGs without worrying about log file collisions.
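
For anyone converting existing submit files to rely on the default log file, a small helper like the following could drop the explicit log line. This is a hedged sketch in Python with a made-up function name; it assumes the log file is set by a plain "log = ..." line and leaves every other line untouched.

```python
import re

# Matches a submit command of the form "log = ..." (case-insensitive),
# but not other commands that merely start with "log", such as "log_xml".
LOG_COMMAND = re.compile(r"^\s*log\s*=", re.IGNORECASE)


def without_log_command(submit_text):
    """Return the submit file text with any 'log = ...' lines removed."""
    kept = [line for line in submit_text.splitlines()
            if not LOG_COMMAND.match(line)]
    return "\n".join(kept) + "\n"


original = (
    "universe = vanilla\n"
    "executable = /bin/true\n"
    "log = node.log\n"
    "queue\n"
)
print(without_log_command(original))
```

Only do this for submit files that will be run under a 7.4 DAGMan; submitted under an older DAGMan, the jobs still need their log line.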
Kent Wenger
Condor Team