
Re: [Condor-users] Condor Daemons Fail to run on node

Date: Thu, 02 Dec 2010 12:27:43 -0600
From: Dimitri Maziuk <dmaziuk@xxxxxxxxxxxxx>
Subject: Re: [Condor-users] Condor Daemons Fail to run on node

Ian Chesal wrote:

On Thu, Dec 2, 2010 at 1:13 PM, Xenia Fave <xfave2008@xxxxxxxxxx> wrote:
    Do you mean just rebooting the one node or the entire cluster?


Just the one node where Condor won't start.
See the other email from James Burnash about fsck'ing the file system -- in order to do this you'll have to unmount it from *all* your machines.


If it's mounted on other machines: looks like everyone has a local /scratch.

As I recall (haven't seen it in a while), this error can happen when the disk develops too many bad sectors too fast. Then the filesystem gets ro'ed at a lower level than mtab, so mount still shows it as "rw". If that is the case, smartctl and/or dmesg (or /var/log/messages) should have something to say about it. Also, if this is the cause of the problem, don't bother with fsck, replace the disk.
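
One way to spot the mismatch described above is to compare what the mount table claims with what an actual write attempt does. The following is only a rough sketch, assuming a Linux node with /proc/mounts and a local /scratch; the function names (mount_options, probe_write, diagnose) are made up for illustration and are not part of Condor or any system tool.

```python
import errno
import tempfile


def mount_options(mount_point, mounts_text):
    """Return the option list for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed. If it appears more than
    once, the last entry wins (later mounts shadow earlier ones).
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def probe_write(directory):
    """Try to create a small temporary file in directory.

    Returns None on success, or the errno of the failure
    (errno.EROFS means the kernel treats the filesystem as read-only).
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"probe")
            handle.flush()
    except OSError as exc:
        return exc.errno
    return None


def diagnose(mount_point, mounts_text):
    """Summarise whether the listed options agree with a real write attempt."""
    options = mount_options(mount_point, mounts_text)
    failure = probe_write(mount_point)
    if options is None:
        return "not listed as a mount point"
    if "rw" in options and failure == errno.EROFS:
        return "listed rw but writes fail with EROFS: check dmesg and smartctl"
    if failure is not None:
        return "write failed: " + errno.errorcode.get(failure, str(failure))
    return "writable"


if __name__ == "__main__":
    # /scratch is the local filesystem mentioned in this thread.
    with open("/proc/mounts") as mounts:
        print(diagnose("/scratch", mounts.read()))
```

If this reports a mismatch, or the write fails with an I/O error, that is the cue to look at dmesg, /var/log/messages and smartctl output for the underlying disk, as suggested above.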


Dimitri
--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
