Hi Dan,
On 08/16/2012 10:42 AM, Dan Bradley wrote:
> If the problem was caused by DEFRAG_REQUIREMENTS and/or
> DEFRAG_WHOLE_MACHINE_EXPR, the defrag log would indicate so with a
> message like the following:
> "Drained 0 machines (wanted to drain X machines)."
> "Doing nothing, because DEFRAG_MAX_WHOLE_MACHINES=X and there are Y
> whole machines."
Right, I'm not seeing that message.
> As a sanity check, what numbers do you see in the following line in the
> log when defrag starts up or is reconfigured?
> "polling interval %ds, DEFRAG_DRAINING_MACHINES_PER_HOUR = %f/hour =
> %d/interval + %d/hour + %d/day"
08/15/12 15:07:13 polling interval 90s,
DEFRAG_DRAINING_MACHINES_PER_HOUR = 12.000000/hour = 0/interval +
12/hour + 0
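For what it's worth, those numbers look self-consistent. Here is a minimal sketch of the arithmetic, assuming the daemon splits the hourly rate into whole machines per polling interval and then carries the rounded-off remainder into per-hour and per-day buckets. The function name and the exact rounding are my assumptions, not taken from the condor_defrag source.

```python
import math

def split_drain_rate(per_hour, interval_s):
    """Split an hourly drain rate into (per interval, extra per hour, extra per day).

    Assumption: each coarser bucket picks up the remainder that the finer
    bucket rounded away. This is a guess at the arithmetic, not the
    daemon's actual code.
    """
    eps = 1e-9  # guard against float results such as 1.9999999
    intervals_per_hour = 3600.0 / interval_s
    per_interval = math.floor(per_hour / intervals_per_hour + eps)
    left_per_hour = per_hour - per_interval * intervals_per_hour
    extra_per_hour = math.floor(left_per_hour + eps)
    extra_per_day = math.floor((left_per_hour - extra_per_hour) * 24 + eps)
    return per_interval, extra_per_hour, extra_per_day

# 12 machines/hour with a 90s polling interval, as in the log line above
print(split_drain_rate(12.0, 90))  # (0, 12, 0)
```

With a 90s interval there are 40 polls per hour, so 12/hour is only 0.3 machines per poll. Under this assumption that rounds down to 0 per interval and all 12 land in the per-hour bucket, which agrees with the "0/interval + 12/hour" in the log.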
> And what numbers do you see in the most recent log line of the following
> form:
> "There are currently %d draining and %d whole machines."
08/16/12 12:09:31 There are currently 0 draining and 0 whole machines.
> One word of warning: defrag drains the whole startd, partitionable slots
> and static slots alike. If you only want it to drain some slots and not
> others, you need to run multiple startds and set DEFRAG_REQUIREMENTS to
> only match the slots of the startd to be drained and not the slots of
> the other startd.
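A toy model of the warning above, as I read it: the requirements expression is matched against individual slots, but the drain then applies to every slot of the startd that owns a matching slot. The slot ads, the "Startd" key, and both function names are hypothetical and only illustrate the selection logic; this is not how condor_defrag is implemented.

```python
def startds_to_drain(slot_ads, requirements):
    """Names of startds that own at least one slot matching the requirements."""
    return {ad["Startd"] for ad in slot_ads if requirements(ad)}

def slots_drained(slot_ads, requirements):
    """Every slot of a selected startd is drained, whether or not it matched."""
    selected = startds_to_drain(slot_ads, requirements)
    return [ad["Name"] for ad in slot_ads if ad["Startd"] in selected]

# Hypothetical slot ads: startd "a" mixes slot types, startd "b" is static-only.
slot_ads = [
    {"Name": "slot1@a", "Startd": "a", "PartitionableSlot": True},
    {"Name": "slot2@a", "Startd": "a", "PartitionableSlot": False},
    {"Name": "slot1@b", "Startd": "b", "PartitionableSlot": False},
]

# Stand-in for a requirements expression that matches partitionable slots only.
def only_partitionable(ad):
    return ad["PartitionableSlot"]

print(slots_drained(slot_ads, only_partitionable))  # ['slot1@a', 'slot2@a']
```

In this model the static slot2@a is drained along with slot1@a because both belong to the same startd, while slot1@b is left alone because nothing on startd "b" matched.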
OK, so should I infer that defrag will only work on machines where there
is only one whole-machine slot? Or just that it will drain the single-core
slots in addition to the partitionable ones?