Hi,
I'm having problems trying to kill jobs at a certain
time when using Condor 6.6.5 on Win2K. After the job
is killed, it stays in the idle state indefinitely:
C:\Condor\ics>condor_q -analyze
-- Submitter: 102153-71130c.liv.ac.uk : <138.253.102.153:1042> : 102153-71130c.liv.ac.uk
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
---
187.000: Run analysis summary. Of 2 machines,
1 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match, but are serving users with a better priority in the pool
1 match, but prefer another specific job despite its worse user-priority
0 match, but will not currently preempt their existing job
0 are available to run your job
Last successful match: Tue Jun 22 13:05:31 2004
1 jobs; 1 idle, 0 running, 0 held
The config file looks like:
WANT_SUSPEND = FALSE
WANT_VACATE = TRUE
START = TRUE
SUSPEND = ClockMin > 660
CONTINUE = FALSE
PREEMPT = TRUE
KILL = TRUE
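For reference, ClockMin is the number of minutes since midnight, so the threshold of 660 in the SUSPEND expression corresponds to 11:00. A small Python sketch of that arithmetic, purely illustrative (the function names are my own and are not part of Condor, and this does not model how the startd evaluates the policy):

```python
def clock_min_to_hhmm(clock_min: int) -> str:
    """Convert minutes since midnight (as ClockMin counts them) to HH:MM."""
    hours, minutes = divmod(clock_min, 60)
    return f"{hours:02d}:{minutes:02d}"


def suspend_expr(clock_min: int, threshold: int = 660) -> bool:
    """Mirror the comparison 'ClockMin > 660' from the config above."""
    return clock_min > threshold


if __name__ == "__main__":
    print(clock_min_to_hhmm(660))   # time of day the threshold refers to
    print(suspend_expr(661))        # one minute past the threshold
```

So the expression only becomes true from 11:01 onwards, since the comparison is strict.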
Something seems to be wrong judging by SchedLog:
6/22 13:05:57 DaemonCore: Command received via TCP from host <138.253.102.153:1365>
6/22 13:05:57 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
6/22 13:05:57 Got VACATE_SERVICE from <138.253.102.153:1365>
6/22 13:05:57 Sent RELEASE_CLAIM to startd on <138.253.102.153:1041>
6/22 13:05:57 Match record (<138.253.102.153:1041>, 187, 0) deleted
6/22 13:05:57 DaemonCore: Command received via UDP from host <138.253.102.153:1367>
6/22 13:05:57 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())
6/22 13:05:57 Scheduler::Relinquish - mrec is NULL, can't relinquish
6/22 13:05:57 Null parameter --- match not deleted
6/22 13:06:04 DaemonCore: Command received via UDP from host <138.253.102.153:1371>
Any ideas?
Thanks in advance,
-ian.
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
http://lists.cs.wisc.edu/mailman/listinfo/condor-users