Mag Gam wrote:
I currently have one submit host, which runs the central manager, schedd, collector, and negotiator. I would like to have another submit host from which I can view the queue (condor_q) and the status of all boxes (condor_status). If the first server is down, I would still like to monitor the queue with this server. I don't really care about scheduling and submitting more jobs. Any idea how to do this? I have been looking through this: http://www.cs.wisc.edu/condor/manual/v6.8/3_10High_Availability.html but it seems like a bit of overkill.
You can do condor_status from anywhere, so the only issue is viewing the queue on machine X when machine X is dead.
The schedd High Availability mechanism you reference above would certainly work.
Considering all you want to do is view the queue on machine X when machine X is dead, another idea would be to install Quill:
http://www.cs.wisc.edu/condor/manual/v7.4/3_12Quill.html
The idea here is you'd run PostgreSQL (open source database) on a different machine, and configure your schedd on machine X to "echo" all queue information into the database. condor_q can then query either the schedd or the database. If you already have a reliable shared file system available, however, simply following the High Availability section above to have schedd failover may be less hassle than setting up PostgreSQL.
Another primitive but simple idea: a script or batch file to periodically save the output of condor_q to a file on a shared file system or a web page. You could submit this script as a local universe job to your schedd. :)
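As a rough illustration of the script idea, here is a minimal Python sketch. It assumes condor_q is on the PATH of whoever runs it. The function name snapshot, the 60-second default interval, and the output path passed on the command line are all made up for this example. The one design choice worth noting is that a failed condor_q run does not overwrite the last good copy, so the file still shows the most recent queue state after the schedd goes away.

```python
import os
import subprocess
import sys
import tempfile
import time
from datetime import datetime, timezone


def snapshot(command, out_path):
    """Run `command`; on success, atomically write its output to `out_path`.

    Returns True if the file was updated, False if the command failed
    (in which case the previous snapshot is left untouched).
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return False

    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    body = f"# snapshot taken {stamp}\n{result.stdout}"

    # Write to a temp file in the same directory, then rename over the
    # target, so readers never see a half-written file.
    out_dir = os.path.dirname(os.path.abspath(out_path))
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, prefix=".queue_snapshot.")
    with os.fdopen(fd, "w") as f:
        f.write(body)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file owner-only
    os.replace(tmp_path, out_path)
    return True


def run_forever(out_path, interval_seconds=60):
    """Refresh the snapshot every `interval_seconds` (hypothetical default)."""
    while True:
        snapshot(["condor_q"], out_path)
        time.sleep(interval_seconds)


# Only start looping when an output path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_forever(sys.argv[1])
```

You would start it with the output file as the only argument, pointing at a location on the shared file system or under a web server's document root, whichever the second machine can read.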
Todd