
[HTCondor-users] startd_cron jobs that get stuck



Hi,

among other things, we run some filesystem checks in startd_cron to make sure the mounted filesystems are responsive.

In the rare case of a failure (currently CVMFS is the problematic one), the test hangs. I tried wrapping these checks with the 'timeout' utility in bash, but it does not work as expected.
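
For illustration, here is a minimal sketch of a timeout-wrapped check in Python rather than bash. It rests on an assumption not confirmed above: that the wrapper hangs because it ends up waiting on a child stuck in uninterruptible I/O, or because that child keeps the output pipe open. The attribute name FS_CHECK_OK, the helper names, and the 10 second deadline are hypothetical.

```python
#!/usr/bin/env python3
"""Sketch: run a filesystem check in a child process with a hard deadline."""
import subprocess
import sys
import time


def check_with_deadline(cmd, deadline_s, poll_s=0.1):
    """Return True if cmd exits with status 0 within deadline_s seconds.

    On timeout the child is sent SIGKILL but deliberately not waited for,
    because waiting on a process stuck in uninterruptible I/O would block
    this script as well.
    """
    proc = subprocess.Popen(
        cmd,
        # Detach the child from our pipes so that a hung child cannot
        # keep the output channel to the calling daemon open.
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        rc = proc.poll()  # non-blocking status check
        if rc is not None:
            return rc == 0
        time.sleep(poll_s)
    proc.kill()  # best effort; no wait() afterwards
    return False


def classad_line(attr, ok):
    """Format a boolean result as an 'Attribute = Value' line."""
    return f"{attr} = {'True' if ok else 'False'}"


if __name__ == "__main__":
    # The path to probe is taken from the command line; "/" is only a
    # placeholder default.
    path = sys.argv[1] if len(sys.argv) > 1 else "/"
    # The filesystem access happens in the child, never in this process.
    probe = [sys.executable, "-c",
             "import os, sys; os.listdir(sys.argv[1])", path]
    ok = check_with_deadline(probe, deadline_s=10)
    print(classad_line("FS_CHECK_OK", ok))
```

The key design choice is that the script itself never touches the filesystem under test and never blocks on the child, so it can always report a result, even if the killed probe lingers until the mount recovers.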

As a dirty workaround, I tried to monitor the check in question by adding some logic to another, more robust check, but it seems that once one startd_cron job is stuck, the other job is unable to propagate current startd ClassAds?

Best
christoph 

-- 
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx