Hello,
On Thu, Jul 06, 2006 at 03:40:09PM -0400, Ian Chesal wrote:
> We saw this problem on our live system on the weekend so I had our co-op
> do a detailed analysis for this on our test system and it's very
> re-creatable. The condor_release call consistently times out on clusters
> with more than 1500 processes in them. Thankfully it doesn't
> half-release the cluster. But it still means that if you've submitted a
> cluster on hold with more than 1500 processes in it you're never going
> to get it to run. Is this a known issue?

I don't think that has been reported before. I'll try to reproduce the error
and let you know shortly what I find.