On 7/3/07, Esh Esh <eshforcondor@xxxxxxxxx> wrote:
Hi, I am trying to submit 50000 jobs to a Condor system (Condor 6.8.2) in a loop using web services. Every time I run this program I get an error after submitting 32600 jobs. The stack trace gives: "java.net.connectionexception : Connection Refused" Has anybody faced this problem before? Is this specific to Condor 6.8.2? Or is there a limit on the number of jobs that can be submitted?

32600 sounds suspiciously like you are running out of file handles or sockets. Are you holding a file or socket open on each submission?

If you pop in:

    Runtime rt = Runtime.getRuntime();
    rt.gc();
    rt.runFinalization();

after each submission and it then works, it is likely you are failing to tidy up your handles. (You could isolate that by seeing how long the gc/finalize normally takes and just sleeping for that time instead.)

If it doesn't, then it gets more complex. There were previous conversations here about keep-alive on the HTTP connection; that may be a factor.

Matt
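
For anyone wanting to try the diagnostic above, here is a rough sketch of how the loop could be wired up. The class name, the Mode enum and the submitJob callback are all made up for illustration; the callback stands in for whatever web-service submit call your program makes, and nothing here is Condor API. It only lets you switch between no pause, gc/finalize, and a plain sleep after each submission so you can compare the three runs.

```java
import java.util.function.IntConsumer;

// Hypothetical harness: none of these names come from Condor or its web-service client.
class SubmitLoopDiagnostic {

    // What to do after each submission.
    enum Mode { NONE, GC, SLEEP }

    // Runs `count` submissions and applies the chosen step after each one.
    // Returns how many submissions completed.
    // runFinalization() is deprecated in recent JDKs, hence the suppression.
    @SuppressWarnings("removal")
    static int run(int count, Mode mode, long sleepMillis, IntConsumer submitJob) {
        int done = 0;
        for (int i = 0; i < count; i++) {
            submitJob.accept(i); // placeholder for your real submit call
            done++;
            if (mode == Mode.GC) {
                // Ask the JVM to collect and finalize unreachable objects,
                // which may release handles the program forgot to close.
                Runtime rt = Runtime.getRuntime();
                rt.gc();
                rt.runFinalization();
            } else if (mode == Mode.SLEEP) {
                // Control case: same delay, no collection.
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return done;
    }

    public static void main(String[] args) {
        int done = run(100, Mode.GC, 0L, i -> { /* submit job i here */ });
        System.out.println("completed submissions: " + done);
    }
}
```

If the GC run gets past 32600 and the SLEEP run does not, that points at handles that are never closed, and the real fix is to close the stream, socket or client stub explicitly after each submission rather than leaving the gc calls in.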