On 3/24/2014 9:58 AM, Dimitri Maziuk wrote:
On 3/22/2014 5:51 PM, Rita wrote:
I was wondering if anyone runs a web server for startd nodes. I was thinking of exposing the system load average, disk space for all partitions, the output of `top` (updated every 10 seconds), the output of `free`, etc., while startd is running on that host.

You might want to look at net-snmp and Nagios instead. Load average and disk space are built in; the others can be done with some scripting.

Dimitri

On a very similar note, you could run Ganglia. Support for Ganglia monitoring is integral to HTCondor nowadays. Out of the box, Ganglia gives all the typical things (load average, disk space, etc.). Then you can also run condor_gangliad (available starting with v8.1.x) under the condor_master to augment the Ganglia data with HTCondor-specific metrics.
More info in the HTCondor Manual at: http://research.cs.wisc.edu/htcondor/manual/v8.1/3_10Pool_Management.html#37725

regards,
Todd
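
For anyone who still wants to script the basic host metrics from the original question (load average and disk space per partition), here is a minimal sketch using only the Python standard library. It assumes a Unix host, since `os.getloadavg()` is not available on Windows. The function name `collect_host_metrics` and the JSON layout are illustrative choices, not something from this thread or from HTCondor, net-snmp, Nagios, or Ganglia.

```python
import json
import os
import shutil


def collect_host_metrics(paths=("/",)):
    """Return load averages and disk usage for the given mount points.

    `paths` is a hypothetical parameter: pass the mount points you care
    about (for example "/", "/scratch"). Nothing here talks to HTCondor.
    """
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    load1, load5, load15 = os.getloadavg()

    disks = {}
    for path in paths:
        # shutil.disk_usage() reports total, used and free space in bytes.
        usage = shutil.disk_usage(path)
        disks[path] = {
            "total_bytes": usage.total,
            "used_bytes": usage.used,
            "free_bytes": usage.free,
        }

    return {
        "load_average": {"1min": load1, "5min": load5, "15min": load15},
        "disk": disks,
    }


if __name__ == "__main__":
    # Print the metrics as JSON so another tool or a small web page can read them.
    print(json.dumps(collect_host_metrics(), indent=2))
```

This only covers the simplest part of the request. For a whole pool, the monitoring tools suggested above are likely the better fit, since they already handle collection, history, and alerting.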