[HTCondor-users] How to configure condor to detect node failure and reschedule jobs in 1 minute?
- Date: Tue, 14 Apr 2015 07:42:30 +0800
- From: Kyle Qian <mailtoantares@xxxxxxx>
- Subject: [HTCondor-users] How to configure condor to detect node failure and reschedule jobs in 1 minute?
Hi, it seems that Condor takes a long time to decide to reschedule jobs that were running on crashed machines, and the jobs stay in the X state when I remove them manually. This hurts when there are only 4 machines in the pool.
Condor is very configurable, so how can I make it more responsive in this case?
Thanks in advance!
Kyle Qian