A few questions to help determine what's going wrong in your pool:
Can these jobs be found using condor_history? If so, what status do they have?
If you search for these jobs' IDs in the condor_shadow daemon log, do you see error messages?
If you search the daemon logs for these stuck condor_starters, do you see messages like this:
Lost connection to shadow, waiting 2400 secs for reconnect
Are these condor_starters stuck for longer than 40 minutes (or the value of the JobLeaseDuration attribute in the job ad)?
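
If it helps with the last two questions, here is a rough Python sketch that pulls the "Lost connection to shadow" messages out of a starter log and checks whether the reconnect window has already passed. The message text is the one quoted above. The leading MM/DD/YY HH:MM:SS timestamp layout is an assumption about how your logs are configured, and the function names are made up for illustration, so adjust both to fit your pool.

```python
import re
from datetime import datetime, timedelta

# Message text is the one quoted above. The leading timestamp layout
# (MM/DD/YY HH:MM:SS) is an ASSUMPTION about the log configuration.
LOST_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}).*"
    r"Lost connection to shadow, waiting (?P<secs>\d+) secs for reconnect"
)


def find_lost_connections(lines):
    """Return (timestamp, wait_seconds) for every matching log line."""
    hits = []
    for line in lines:
        m = LOST_RE.search(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%m/%d/%y %H:%M:%S")
            hits.append((ts, int(m.group("secs"))))
    return hits


def lease_expired(lost_at, wait_seconds, now):
    """True if more than wait_seconds have elapsed since the message."""
    return now - lost_at > timedelta(seconds=wait_seconds)
```

To use it, read the lines of the starter log you are looking at (the path depends on your configuration), pass them to find_lost_connections(), and call lease_expired() on each hit with datetime.now(). Any starter still running after its window has expired is one of the stuck ones I am asking about.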
Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project