Re: [HTCondor-users] Stuck dagman jobs after restart
- Date: Mon, 15 Dec 2014 09:50:01 -0600 (CST)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Stuck dagman jobs after restart
On Mon, 15 Dec 2014, Brian Bockelman wrote:
> Hi Brian,
> It might be worth it to look at the UserLog of these jobs - it's
> possible they are switching quickly between R and I?
Hmm, you could look, but I'd be really surprised if that were happening.
Could you send us your SchedLog? I think that's the most likely log to
give us some useful information.
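If you are not sure where the SchedLog lives on your submit machine, something like the following should print its path. This is only a sketch: it assumes condor_config_val is on your PATH and that the schedd's log file is set through the SCHEDD_LOG configuration variable. The function name schedlog_path is made up for illustration.

```shell
# Print the path of the schedd's log file (the SchedLog).
# Assumes the standard SCHEDD_LOG configuration variable is defined.
schedlog_path() {
    if command -v condor_config_val >/dev/null 2>&1; then
        condor_config_val SCHEDD_LOG
    else
        echo "condor_config_val not found on PATH" >&2
        return 1
    fi
}
```

Running schedlog_path on the submit machine should give you the file to attach.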
We actually have a test for DAGs getting correctly restarted across a
Condor restart, so I'm a little surprised this is happening.
Something else I just thought of -- you might want to try doing
condor_hold and then condor_release on one of the DAGs, to see if that
gets it to run (just a wild guess).
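As a rough sketch of the hold/release experiment, assuming you already know the cluster ID of the DAGMan job (condor_q will list it), you could wrap the two commands like this. The function name hold_then_release and the DRY_RUN and PAUSE_SECONDS variables are hypothetical helpers for this example, not HTCondor features.

```shell
# Hold and then release one DAGMan job, identified by its cluster ID.
# Set DRY_RUN=1 to print the commands instead of running them.
hold_then_release() {
    cluster_id="$1"
    if [ -z "$cluster_id" ]; then
        echo "usage: hold_then_release CLUSTER_ID" >&2
        return 2
    fi

    run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi

    # Put the DAGMan job on hold; stop here if that fails.
    $run condor_hold "$cluster_id" || return 1

    # Brief pause before releasing (PAUSE_SECONDS is an illustrative knob).
    sleep "${PAUSE_SECONDS:-5}"

    # Release the job so the schedd can try to start it again.
    $run condor_release "$cluster_id"
}
```

Try it on a single DAG first, for example with DRY_RUN=1 hold_then_release followed by the cluster ID to check the commands, and then watch condor_q to see whether the job leaves the idle state.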
Kent Wenger
CHTC Team