[Condor-users] condor_q and condor_status don't agree
- Date: Tue, 13 Dec 2005 16:18:22 +1300
- From: Andrew Mellanby <mel@xxxxxxxxxxxxx>
- Subject: [Condor-users] condor_q and condor_status don't agree
Hi,
I've run into a problem where the outputs from condor_q and condor_status
don't agree.
The worst thing about this is that jobs get 'Matched' but don't ever get
started. This seems to occur when I load lots of jobs (1000+) into the queue.
The central manager runs Solaris; the execute hosts run Windows XP.
I'm looking at the config files and wondering whether any timeouts need to be
lengthened or shortened. For instance, with a JOB_START_INTERVAL of 2
seconds it takes over 15 minutes to start 500 jobs, by which time the ClassAd
will be stale, as its lifetime is only 15 minutes. But I'm not sure whether
that should matter.
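As a rough check on that arithmetic, here is a minimal Python sketch. It assumes jobs are started strictly one at a time, one per interval, which is the simple model described above and not necessarily how the scheduler really behaves. The helper names (time_to_start, max_jobs_within) are made up for illustration; the numbers are the ones quoted in this message.

```python
# Back-of-the-envelope check using the figures quoted in this message.
# Assumption: jobs start serially, exactly one per interval.
JOB_START_INTERVAL_S = 2        # seconds between job starts
JOBS_TO_START = 500             # number of jobs to start
CLASSAD_LIFETIME_S = 15 * 60    # ClassAd lifetime of 15 minutes, in seconds


def time_to_start(jobs: int, interval_s: float) -> float:
    """Seconds needed to start `jobs` jobs, one every `interval_s` seconds."""
    return jobs * interval_s


def max_jobs_within(lifetime_s: float, interval_s: float) -> int:
    """How many serial job starts fit inside `lifetime_s` seconds."""
    return int(lifetime_s // interval_s)


total_s = time_to_start(JOBS_TO_START, JOB_START_INTERVAL_S)
print(f"{total_s / 60:.1f} minutes to start {JOBS_TO_START} jobs")
print(f"longer than the ClassAd lifetime: {total_s > CLASSAD_LIFETIME_S}")
print(f"starts that fit in one lifetime: "
      f"{max_jobs_within(CLASSAD_LIFETIME_S, JOB_START_INTERVAL_S)}")
```

Under this model, 500 starts take 1000 seconds (about 16.7 minutes), and only 450 starts fit inside a 15-minute lifetime.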
Any thoughts?
Andrew
......
depot mel% condor_version
$CondorVersion: 6.6.10 Jun 13 2005 $
$CondorPlatform: SUN4X-SOLARIS29 $
depot mel% condor_status -total
                 Machines Owner Claimed Unclaimed Matched Preempting
INTEL/WINNT51         586   117       4        97     368          0
SUN4u/SOLARIS29         2     2       0         0       0          0
Total                 588   119       4        97     368          0
depot mel% condor_q
(lots of output)
324.996 mel 12/13 15:47 0+00:00:00 I 0 0.0 java Pauser1 x1k-9
324.997 mel 12/13 15:47 0+00:00:00 I 0 0.0 java Pauser1 x1k-9
324.998 mel 12/13 15:47 0+00:00:00 I 0 0.0 java Pauser1 x1k-9
324.999 mel 12/13 15:47 0+00:00:00 I 0 0.0 java Pauser1 x1k-9
1548 jobs; 1070 idle, 478 running, 0 held
.........................
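To put a number on the disagreement, here is a small Python sketch that parses the two summary lines pasted above and compares them. It assumes that each running job should correspond to one Claimed machine, which is a simplification. The two strings are copied from the output in this message; the function names are hypothetical and the parsing only handles these exact line shapes.

```python
import re

# Summary lines copied from the condor_status and condor_q output above.
STATUS_TOTAL = "Total 588 119 4 97 368 0"
STATUS_COLUMNS = ["Machines", "Owner", "Claimed",
                  "Unclaimed", "Matched", "Preempting"]
QUEUE_SUMMARY = "1548 jobs; 1070 idle, 478 running, 0 held"


def parse_status_total(line: str) -> dict:
    """Map the numbers on the 'Total' line to their column names."""
    fields = line.split()
    return dict(zip(STATUS_COLUMNS, map(int, fields[1:])))


def parse_queue_summary(line: str) -> dict:
    """Extract the job counts from the condor_q summary line."""
    m = re.fullmatch(
        r"(\d+) jobs; (\d+) idle, (\d+) running, (\d+) held", line.strip())
    if m is None:
        raise ValueError(f"unrecognised summary line: {line!r}")
    jobs, idle, running, held = map(int, m.groups())
    return {"jobs": jobs, "idle": idle, "running": running, "held": held}


status = parse_status_total(STATUS_TOTAL)
queue = parse_queue_summary(QUEUE_SUMMARY)

# Assumption: a running job occupies exactly one Claimed machine.
gap = queue["running"] - status["Claimed"]
print(f"condor_q running:       {queue['running']}")
print(f"condor_status Claimed:  {status['Claimed']}")
print(f"condor_status Matched:  {status['Matched']}")
print(f"running minus Claimed:  {gap}")
```

With the figures above, condor_q reports 478 running jobs while condor_status shows only 4 Claimed machines and 368 Matched, a gap of 474.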