Hi,
I came back to my cluster after a few days away and found that the collector daemon was down. Here’s the output from a quick attempt to restart the master on the central manager:
[steve@queen condor]$ condor_status
CEDAR:6001:Failed to connect to <158.119.147.62:9618>
Error: Couldn't contact the condor_collector on queen.bioinformatics.
...... standard “Extra info:” text ......
[steve@queen condor]$ condor_restart
Can't connect to local master
[steve@queen condor]$ su
Password:
[root@queen condor]# condor_master
[root@queen condor]# ps -ef | grep condor
root 29265 29101 0 14:22 pts/1 00:00:00 grep condor
As you can see, there are no condor processes running.
An ls -l of the log directory before and after the above commands shows that the MasterLog access time has been updated, but the file size is unchanged. The last few lines of the MasterLog are dated 8th April, which makes me think that the file is being opened and then closed without anything being written to it. All the other logs show a last access at least a week ago.
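In case it helps anyone reproduce the check, here is a minimal sketch of that before/after comparison. It assumes GNU stat (as shipped with CentOS); snapshot and compare_log are hypothetical helper names made up for this illustration, not Condor tools.

```shell
# Hypothetical helpers (not part of Condor); assumes GNU stat.

# Print size in bytes, last access time and last modification time
# (both as seconds since the epoch) for the given file.
snapshot() {
    stat -c '%s %X %Y' "$1"
}

# Usage: compare_log <logfile> <command> [args...]
# Takes a snapshot of the log, runs the command, takes another snapshot,
# then reports whether the file size changed.
compare_log() {
    log="$1"; shift
    before=$(snapshot "$log")
    "$@"                         # the command under test
    after=$(snapshot "$log")
    echo "before: $before"
    echo "after:  $after"
    # The first space-separated field of each snapshot is the size.
    if [ "${before%% *}" = "${after%% *}" ]; then
        echo "size unchanged"
    else
        echo "size changed"
    fi
}
```

On the central manager this could be run as root as compare_log "$(condor_config_val MASTER_LOG)" condor_master, assuming condor_config_val can still read the configuration while the daemons are down.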
This was set up by a contractor who’s no longer around and while I don’t think he set any “special” parameters I can’t say for certain.
I’m running CondorVersion 7.0.5 on a Rocks v5.1 cluster of CentOS 5.2 boxes. We’ve got Quill & CondorView plumbed in as well. I can post config files on request.
As it’s on Rocks, we’ve also tried running service rocks_condor (stop|start|restart), which has no effect. I think this is just a wrapper for the regular condor_* commands.
I can’t work out why condor_master is not doing anything and I’d really appreciate any advice.
Thanks
Steve
Bioinformatics Support Coordinator
Statistics, Modelling and Bioinformatics
Health Protection Agency
Centre for Infections
61 Colindale Avenue
London
NW9 5EQ