Hello,
I am trying to learn how to set up Ganglia to monitor a Condor
pool.
I'm currently working on localhost to make things easier. I
configured Ganglia and it works for monitoring this one-node
cluster. The default metrics from gmond.conf work fine
and appear on the web frontend, but I'm having trouble getting
the Condor metrics to show up.
In the GangliadLog I have:
03/09/15 14:20:28 ******************************************************
03/09/15 14:20:28 ** condor_gangliad (CONDOR_GANGLIAD) STARTING UP
03/09/15 14:20:28 ** /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/libexec/condor_gangliad
03/09/15 14:20:28 ** SubsystemInfo: name=GANGLIAD type=DAEMON(12) class=DAEMON(1)
03/09/15 14:20:28 ** Configuration: subsystem:GANGLIAD local:<NONE> class:DAEMON
03/09/15 14:20:28 ** $CondorVersion: 8.3.4 Mar 02 2015 BuildID: 304666 $
03/09/15 14:20:28 ** $CondorPlatform: x86_64_Ubuntu14 $
03/09/15 14:20:28 ** PID = 8922
03/09/15 14:20:28 ** Log last touched 3/9 14:20:11
03/09/15 14:20:28 ******************************************************
03/09/15 14:20:28 Using config source: /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/etc/condor_config
03/09/15 14:20:28 Using local config sources:
03/09/15 14:20:28    /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/local.xxxx/condor_config.local
03/09/15 14:20:28 config Macros = 58, Sorted = 58, StringBytes = 1697, TablesBytes = 2136
03/09/15 14:20:28 CLASSAD_CACHING is ENABLED
03/09/15 14:20:28 Daemon Log is logging: D_ALWAYS D_ERROR
03/09/15 14:20:28 Daemoncore: Listening at <0.0.0.0:45401> on TCP (ReliSock) and UDP (SafeSock).
03/09/15 14:20:28 DaemonCore: command socket at <xxx.xxx.xx.xx:45401>
03/09/15 14:20:28 DaemonCore: private command socket at <xxx.xxx.xx.xx:45401>
03/09/15 14:20:28 Testing /usr/bin/gmetric
03/09/15 14:20:28 Loading libganglia libganglia.so
03/09/15 14:20:28 Will use libganglia to interact with ganglia.
03/09/15 14:20:28 Will perform stats publication every GANGLIAD_INTERVAL=60 seconds.
03/09/15 14:20:28 Reading metric definitions from /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/etc/condor/ganglia.d/00_default_metrics
03/09/15 14:20:48 Starting update...
03/09/15 14:20:48 Ganglia is monitoring 1 hosts
03/09/15 14:20:48 Got 8 daemon ads
03/09/15 14:20:48 Heartbeats sent: 0
03/09/15 14:21:08 Starting update...
03/09/15 14:21:08 Heartbeats sent: 0
Here is my Ganglia-related Condor configuration:
$ condor_config_val -dump | grep -i ganglia
DAEMON_LIST = COLLECTOR MASTER NEGOTIATOR SCHEDD STARTD GANGLIAD
GANGLIA_CONFIG = /etc/ganglia/gmond.conf
GANGLIA_GMETRIC = /usr/bin/gmetric
GANGLIA_GSTAT_COMMAND = gstat --all --mpifile --gmond_ip=localhost --gmond_port=8649
GANGLIA_LIB = libganglia.so
GANGLIA_LIB64_PATH = /lib64,/usr/lib64,/usr/local/lib64
GANGLIA_LIB_PATH = /lib,/usr/lib,/usr/local/lib
GANGLIA_SEND_DATA_FOR_ALL_HOSTS = false
GANGLIAD = $(LIBEXEC)/condor_gangliad
GANGLIAD_INTERVAL = 60
GANGLIAD_LOG = $(LOG)/GangliadLog
GANGLIAD_METRICS_CONFIG_DIR = /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/etc/condor/ganglia.d
GANGLIAD_PER_EXECUTE_NODE_METRICS = true
GANGLIAD_REQUIREMENTS =
GANGLIAD_VERBOSITY = 10
MAX_GANGLIAD_LOG = $(MAX_DEFAULT_LOG)
The GANGLIAD daemon is running, but I think it is not
transmitting its monitoring data to Ganglia.
Do I have to do something to include the default Condor
metrics in Ganglia?
I'm not sure why, but I keep getting "Heartbeats sent:
0". I would appreciate some help.
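In case it helps anyone check their own log for the same symptom, here is a minimal Python sketch that pulls the "Heartbeats sent" counters out of GangliadLog text. It assumes the log lines look exactly like the excerpt above (date, time, then the message). The function names heartbeat_counts and never_sent are only illustrative and are not part of Condor or Ganglia.

```python
import re

# Assumed line format, taken from the log excerpt in this post:
#   "03/09/15 14:21:08 Heartbeats sent: 0"
HEARTBEAT_RE = re.compile(r"^(\S+ \S+) Heartbeats sent: (\d+)\s*$")


def heartbeat_counts(log_text):
    """Return a list of (timestamp, count) pairs, one per 'Heartbeats sent' line."""
    counts = []
    for line in log_text.splitlines():
        match = HEARTBEAT_RE.match(line.strip())
        if match:
            counts.append((match.group(1), int(match.group(2))))
    return counts


def never_sent(log_text):
    """True if the log has at least one heartbeat line and every count is zero."""
    counts = heartbeat_counts(log_text)
    return bool(counts) and all(count == 0 for _, count in counts)


if __name__ == "__main__":
    # Lines copied from the log excerpt above.
    sample = (
        "03/09/15 14:20:48 Starting update...\n"
        "03/09/15 14:20:48 Ganglia is monitoring 1 hosts\n"
        "03/09/15 14:20:48 Got 8 daemon ads\n"
        "03/09/15 14:20:48 Heartbeats sent: 0\n"
        "03/09/15 14:21:08 Starting update...\n"
        "03/09/15 14:21:08 Heartbeats sent: 0\n"
    )
    for timestamp, count in heartbeat_counts(sample):
        print(timestamp, count)
    print("never sent:", never_sent(sample))
```

To run it against a real log, read the file contents into a string and pass that to heartbeat_counts; the path depends on where $(LOG) points in your installation.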
Thanks in advance,
Ricardo Oda