Re: [Condor-users] Condor 6.9.2 hung schedd
- Date: Wed, 13 Jun 2007 18:20:41 +0200
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
- Subject: Re: [Condor-users] Condor 6.9.2 hung schedd
On Wed, Jun 13, 2007 at 08:59:59AM -0700, Stuart Anderson wrote:
> > You could use condor_qedit to change the value of UserLog for the
> > problematic jobs and then remove them.
condor_qedit hung the same way...
> You can also use lsof on the hung schedd processes to find the offending
> file, move it to the side and restart schedd. That has worked for us in
> the past.
That did the trick (it's a bit time-consuming if you have to remove dozens
of files and are not allowed to fiddle with the directory they're in).
Thanks, Stuart!
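
For anyone who has to repeat the lsof step across dozens of files, here is a minimal sketch of how it could be scripted. It assumes lsof is installed and that you already know the PID of the hung schedd. The helper names are hypothetical and not part of Condor. The script only lists the regular files the process has open; deciding which one is the offender (in this thread, presumably a job's user log) is still up to you.

```python
import subprocess


def parse_lsof_fields(text):
    """Parse `lsof -F ftn` field output into (fd, type, name) tuples."""
    records = []
    current = None
    for line in text.splitlines():
        if not line:
            continue
        key, value = line[0], line[1:]
        if key == "p":
            # A new process set starts; no file record is open yet.
            current = None
        elif key == "f":
            # Each open file starts with its descriptor field.
            current = {"f": value, "t": "", "n": ""}
            records.append(current)
        elif key in ("t", "n") and current is not None:
            current[key] = value
    return [(r["f"], r["t"], r["n"]) for r in records]


def open_regular_files(pid):
    """Return the names of regular files held open by the given PID."""
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-F", "ftn"],
        capture_output=True, text=True, check=False,
    )
    return [name for _fd, ftype, name in parse_lsof_fields(result.stdout)
            if ftype == "REG"]
```

Calling open_regular_files(pid) with the schedd's PID gives a list of candidate paths to inspect before moving anything aside and restarting the schedd.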
Question to Condor developers: where is the status of submitted jobs kept
across a restart of condor_schedd? It might be easier to make changes there...
And why doesn't 'condor_restart -sub schedd' work in this case?
Steffen