Re: [HTCondor-users] Change JobUniverse from vanilla to local?
- Date: Thu, 21 Sep 2017 09:27:55 +0200
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
- Subject: Re: [HTCondor-users] Change JobUniverse from vanilla to local?
Hi Todd,
On Wed, 2017-09-20 at 12:56:41 -0500, Todd Tannenbaum wrote:
> On 9/19/2017 2:25 AM, Steffen Grunewald wrote:
> >Hi John,
> >
> >I understand - design decisions. So rewriting the DAG is the only way out of this misery...
> >
> >Thanks,
> > Steffen
> >
>
> Some other brain storm ideas -
Everything is welcome!
> 1. How about running a condor_startd on your dagman/submit machine, and the
> START expression would be something like "only run jobs submitted by DAGMan
> that have been idle for over X amount of time" ?
That's what I actually did.
Such a criterion would easily match thousands (literally) of jobs, given our
DAG structure. Adding a RequestMemory lower limit may help. For now, I have
decided to check for this specific user in the START expression.
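Roughly what I have in mind (an untested sketch; the DAGManJobId and
EnteredCurrentStatus attribute names, as well as the 16 GB / one hour
thresholds, are my assumptions and would need double-checking for our setup):

  # accept only DAGMan-submitted jobs that request a lot of memory
  # and have been sitting idle for more than an hour
  START = (TARGET.DAGManJobId =!= UNDEFINED) && \
          (TARGET.RequestMemory > 16384) && \
          ((time() - TARGET.EnteredCurrentStatus) > 3600)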
> 2. Perhaps you could use the condor_jobrouter to transform a vanilla job
> into a local universe job?
Never heard of that before. When was it added? It doesn't seem to be in
my "handbook" yet...
> I personally like option #1 better, since then the job remains vanilla
> universe, and the management of jobs is better.
Agreed. We're planning to add memory to a few machines to resolve this issue.
That will of course require adjusting the START expressions - or better preemption
than we have now.
> Hope the above helps
Me too :) Thanks, Steffen