[Condor-users] Results of jobs coming back unexpectedly slowly
- Date: Wed, 27 Sep 2006 15:29:35 +0200
- From: "Zeeuw, L.V. de" <L.V.de.Zeeuw@xxxxxx>
- Subject: [Condor-users] Results of jobs coming back unexpectedly slowly
Dear readers,
We have a Condor pool of >1500 execution nodes (XP) and one central master server (Linux) from which we submit jobs.
The problem: when I submit 1000 jobs (each doing exactly the same small test computation, for about 30 seconds at most), the job results do not return any more quickly than if I ran those jobs on a one-machine pool. The results come in slowly, one by one, with 5 or more seconds between them.
Eventually they do get through. From the log I can see that they actually run on many different execution nodes.
From condor_status:
                Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
INTEL/WINNT51    1592   1187      320         70       15           0         0
        Total    1592   1187      320         70       15           0         0
Somewhat later:
From condor_q:
20 jobs; 566 idle, 354 running, 0 held
The submit machine is almost idle. condor_q reports 354 jobs running (?), but each of these small jobs should run for only 30 seconds and then return a result. I have now been waiting for hours. It seems the results are stuck on the execution nodes and cannot be delivered to the submit machine.
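To put rough numbers on the gap, here is a minimal back-of-the-envelope sketch in Python. It is illustrative only: it assumes every job reported as running finishes within the 30-second maximum, and that results arrive no faster than one every 5 seconds. The constant and function names are hypothetical; the figures are the ones quoted above.

```python
# Rough throughput check using the figures quoted in this message.
RUNNING_JOBS = 354      # "354 running" from the condor_q summary
JOB_RUNTIME_S = 30.0    # each test job runs for about 30 seconds at most
OBSERVED_GAP_S = 5.0    # results arrive 5 or more seconds apart


def expected_rate(running: int, runtime_s: float) -> float:
    """Results per second if every running job finished after runtime_s."""
    return running / runtime_s


def observed_rate(gap_s: float) -> float:
    """Results per second when one result arrives every gap_s seconds."""
    return 1.0 / gap_s


expected = expected_rate(RUNNING_JOBS, JOB_RUNTIME_S)
observed = observed_rate(OBSERVED_GAP_S)
shortfall = expected / observed

print(f"expected at least {expected:.1f} results/s")
print(f"observed at most  {observed:.1f} results/s")
print(f"shortfall factor of roughly {shortfall:.0f}x")
```

Under these assumptions the pool should deliver on the order of ten results per second, while I am seeing a fraction of one per second, which is why I suspect the results are not making it back to the submit machine.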
Any help is appreciated.
Sincerely,
Luc de Zeeuw
Rotterdam University