
[HTCondor-users] startd-cron not working from time to time (?)



Hi,

I am using the startd cron feature on version 8.4.4. The configured period of 180 seconds works fine until the node is completely busy, with all cores in use. In that case I observe that the startd cron job no longer runs. In some cases tracing the startd process is enough to get the cron job to run again; sometimes a condor_reconfig does the job.

Is this a known issue, or is there a way around this behaviour?

Here are my STARTD_CRON-related config lines:

STARTD_CRON_AUTOPUBLISH = 
STARTD_CRON_JOBLIST = NODEHEALTH
STARTD_CRON_NAME = 
STARTD_CRON_NODEHEALTH_EXECUTABLE = /etc/condor/tests/healthcheck_wn_condor.sh
STARTD_CRON_NODEHEALTH_MODE = Periodic
STARTD_CRON_NODEHEALTH_PERIOD = 180s
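
The contents of healthcheck_wn_condor.sh are not shown in this message. As a minimal sketch only, a periodic startd cron script of this kind might look roughly like the following. The attribute names NODE_IS_HEALTHY and NODEHEALTH_LAST_RUN, the check_scratch helper, and the HEALTHCHECK_DIR variable are all assumptions made up for illustration. Publishing a timestamp on every run is one possible way to see from the machine ad whether the cron job has stopped firing.

```shell
#!/bin/sh
# Hypothetical stand-in for a startd cron health check; NOT the original script.
# NODE_IS_HEALTHY and NODEHEALTH_LAST_RUN are illustrative attribute names.

check_scratch() {
    # Hypothetical check: the directory given as $1 must exist and be writable.
    [ -d "$1" ] && [ -w "$1" ]
}

emit_health_ad() {
    # Print one "Attribute = value" pair per line on stdout,
    # which is the form the startd expects from a cron job.
    if check_scratch "$1"; then
        echo "NODE_IS_HEALTHY = True"
    else
        echo "NODE_IS_HEALTHY = False"
    fi
    # Epoch timestamp of this run; a stale value hints at a skipped cycle.
    echo "NODEHEALTH_LAST_RUN = $(date +%s)"
}

# HEALTHCHECK_DIR is an assumed override; /tmp is only a placeholder default.
emit_health_ad "${HEALTHCHECK_DIR:-/tmp}"
```

With a script along these lines, something like `condor_status -long <hostname> | grep NODEHEALTH_LAST_RUN` should show how old the last published run is. If the value lags the 180 s period by a wide margin while the node is fully loaded, that would be consistent with the behaviour described above.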


best regards
        ~christoph


-- 
/*   Christoph Beyer     |   Office: Building 2b / 23     *\
 *   DESY                |    Phone: 040-8998-2317        *
 *   - IT -              |      Fax: 040-8994-2317        *
\*   22603 Hamburg       |     http://www.desy.de         */