All,
I have a Condor 6.6.9 cluster on W2k3. Several jobs that were running were removed with condor_rm, but after removal they stayed in the queue with status 'X'. An analysis of the queue said they were being removed. While the jobs were in this state, the nodes they had been running on were stuck as claimed with idle status. After leaving it for a week, I ran condor_rm -forcex. That removed the jobs from the queue, but the nodes are still claimed. Looking in the schedd log, I see this:
Zombie process has not been cleaned up by reaper - pid 1300
How can I get the nodes unclaimed? I'll try to figure out later how I got into this state.
thanks,
Scott