Hi all,
A serious problem just hit my cluster and shut Condor down entirely.
The owner of the schedd process was changed to a regular user!
How could this happen?
Here are the Condor-related processes left on the master node, which is
the submit machine.
[root@master1 y-61.1]# ps -ef | grep condor
pwang 26763 1 0 Nov18 ? 00:00:00 condor_shadow -f 886.0 <10.10.20.1:34661> -
pwang 26766 1 0 Nov18 ? 00:00:00 condor_shadow -f 886.2 <10.10.20.1:34661> -
pwang 26772 1 0 Nov18 ? 00:00:00 condor_shadow -f 886.1 <10.10.20.1:34661> -
pwang 29394 1 0 Nov18 ? 00:00:00 condor_shadow -f 886.4 <10.10.20.1:34661> -
condor 19319 1 0 Nov21 ? 00:34:54 /home2/condor/sbin/condor_master
condor 19320 19319 0 Nov21 ? 01:43:02 condor_collector -f
pwang 19393 19319 0 Dec09 ? 00:00:06 condor_schedd -f
condor 19401 19319 0 Dec09 ? 00:02:31 condor_negotiator -f
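In case it helps anyone reproduce the check, here is a rough Python sketch
that scans `ps -ef` output and flags daemons with an unexpected owner. It
is only an illustration: the function name find_misowned_daemons, the
expected owner "condor", and the list of daemons to check are my own
assumptions, and I deliberately skip condor_shadow on the assumption that
shadows normally run as the submitting user.

```python
# Sketch only: flag Condor daemons in `ps -ef` output with an unexpected owner.
# Assumption: these daemons are expected to show up as the "condor" user.
EXPECTED_OWNER = "condor"
# Assumption: condor_shadow is excluded because it runs as the job owner.
CHECKED_DAEMONS = {
    "condor_master",
    "condor_collector",
    "condor_schedd",
    "condor_negotiator",
}


def find_misowned_daemons(ps_output, expected_owner=EXPECTED_OWNER):
    """Return (owner, pid, daemon) tuples for checked daemons with a wrong owner."""
    misowned = []
    for line in ps_output.splitlines():
        fields = line.split()
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD [args...]
        if len(fields) < 8:
            continue
        owner, pid, command = fields[0], fields[1], fields[7]
        daemon = command.rsplit("/", 1)[-1]  # drop any leading path
        if daemon in CHECKED_DAEMONS and owner != expected_owner:
            misowned.append((owner, pid, daemon))
    return misowned


# Sample input copied from the listing above (one shadow line kept for contrast).
SAMPLE = """\
condor 19319 1 0 Nov21 ? 00:34:54 /home2/condor/sbin/condor_master
condor 19320 19319 0 Nov21 ? 01:43:02 condor_collector -f
pwang 19393 19319 0 Dec09 ? 00:00:06 condor_schedd -f
condor 19401 19319 0 Dec09 ? 00:02:31 condor_negotiator -f
pwang 26763 1 0 Nov18 ? 00:00:00 condor_shadow -f 886.0 <10.10.20.1:34661> -
"""

if __name__ == "__main__":
    for owner, pid, daemon in find_misowned_daemons(SAMPLE):
        print(f"{daemon} (pid {pid}) is owned by {owner}")
```

On the sample above this reports only the condor_schedd process (pid
19393, owner pwang), which matches what I see by eye.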
Restarting the Condor daemons still yields the wrong owner for the
schedd. I have to move job_queue.log to another location to get Condor
to start correctly.
Can someone tell me where to look for the cause of the problem?
Junjun