[Condor-users] "can't find resource with capability" error!
- Date: Tue, 15 Jun 2004 10:05:35 -0500 (EST)
- From: "Vinayak V. Dukle" <vdukle@xxxxxxxxxxxxxx>
- Subject: [Condor-users] "can't find resource with capability" error!
Hi!
I have a simple Java universe job that prints "Hello World!" when
run from the command line.
The submit file is as follows:
###########################
#example 1
# Execute a single Java class
#
############################
universe = java
executable = hello.class
arguments = hello
output = hello.output
error = hello.error
log = hello.log
queue
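For reference, a class of the kind this submit file runs could look like the sketch below. This is an assumption, not the actual source: the post only says the job prints "Hello World!", so the lowercase class name `hello` (chosen to match hello.class and the `arguments = hello` line) and the `greeting()` helper are hypothetical.

```java
// Hypothetical source for hello.class; the original post does not show it.
// The class name is lowercase only so that it matches "hello.class" and
// the class name passed as the first argument in the submit file.
class hello {
    // Returns the text the job is expected to print.
    static String greeting() {
        return "Hello World!";
    }

    public static void main(String[] args) {
        // Standard output is what the submit file redirects to hello.output.
        System.out.println(greeting());
    }
}
```

Compiling this with `javac` produces hello.class, and `java hello` runs it directly from the command line, outside of Condor.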
This is what I observe when I submit the job.
When I force a negotiation cycle by running condor_reschedule on my
central manager, where the schedd runs as "condor", the log reads:
12:31:19am> palomar:/tmp $
6/15 00:31:45 DaemonCore: Command received via TCP from host <129.79.246.125:56028>
6/15 00:31:45 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
6/15 00:31:45 Sent ad to central manager for vdukle@xxxxxxxxxxxxxxxxxxx
6/15 00:31:45 Called reschedule_negotiator()
6/15 00:31:45 Activity on stashed negotiator socket
6/15 00:31:45 Negotiating for owner: vdukle@xxxxxxxxxxxxxxxxxxx
6/15 00:31:45 Checking consistency running and runnable jobs
6/15 00:31:45 Tables are consistent
6/15 00:31:45 Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0
6/15 00:31:48 match (<129.79.246.123:53162>#2291185139) out of jobs (cluster id 368); relinquishing
6/15 00:31:48 Sent RELEASE_CLAIM to startd on <129.79.246.123:53162>
6/15 00:31:48 Match record (<129.79.246.123:53162>, 368, 0) deleted
6/15 00:31:48 DC_AUTHENTICATE: attempt to open invalid session palomar:15231:1087272551:20, failing.
6/15 00:31:50 Sent ad to central manager for vdukle@xxxxxxxxxxxxxxxxxxx
Now, the StartLog on 129.79.246.123 (is this the execute machine?) says:
[0:34] brick:/u/condor/hosts/brick/log % tail -f StartLog
6/15 00:31:48 Changing state and activity: Claimed/Idle -> Preempting/Vacating
6/15 00:31:48 State change: No preempting claim, returning to owner
6/15 00:31:48 Changing state and activity: Preempting/Vacating -> Owner/Idle
6/15 00:31:48 State change: IS_OWNER is false
6/15 00:31:48 Changing state: Owner -> Unclaimed
6/15 00:31:48 DaemonCore: Command received via UDP from host <129.79.246.145:33347>
6/15 00:31:48 DaemonCore: received command 443 (RELEASE_CLAIM), calling handler (command_handler)
6/15 00:31:48 Error: can't find resource with capability (<129.79.246.123:53162>#2291185139)
6/15 00:31:48 DaemonCore: Command received via UDP from host <129.79.246.145:33347>
6/15 00:31:48 DaemonCore: received command 60014 (DC_INVALIDATE_KEY), calling handler (handle_invalidate_key())
However, 129.79.246.123 is a Java-capable host (in fact, all machines in
the pool are), as confirmed by running "condor_status -java". Also, the
"owner" attribute has my correct username. So I am not sure why it gives
the above error ("Error: can't find resource with capability"). What
am I missing here?
The job just sits in the idle state and never runs.
Any pointers would be appreciated. Thanks!
Regards,
--Vinayak