Re: [Condor-users] Condor match-making problem
- Date: Fri, 2 Sep 2005 13:42:54 -0500
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Condor match-making problem
On Sep 2, 2005, at 12:15 PM, duane waktu wrote:
My bad. Sorry about the silly problem. Anyway, I was able to run
the cmd without any problem. This is what I got displayed after that:
==========================================================
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
abc [?????????] [????] [????????] [???] [??] [Unknown]
condor_manager.ttk.com LINUX INTEL Unclaimed Idle 0.000 2048 0+00:00:03
abc.ttk.com LINUX INTEL Unclaimed Idle 0.000 2048[?????]
def.ttk.com LINUX INTEL Unclaimed Idle 0.120 2000[?????]
...
(Omitted 1 malformed ads in computed attribute totals)
==========================================================
Basically, 'abc' (the machine name that shows up as a result of running
condor_advertise) is the same machine that is already running in
the Condor pool as 'abc.ttk.com'. My question is: is it all right for
all the fields of machine 'abc' to be filled with '???'?
You can ignore the question marks. condor_status is looking for
certain attributes that you're not placing in the ad. The grid
universe doesn't need them, so leaving them out only affects what
condor_status displays.
Another question, when I used this submit file
=============================================
universe = grid
grid_type = gt4
executable = /bin/hostname
log = test.log
output = test.output_$(Cluster)
error = test.error_$(Cluster)
globusscheduler = $$(gatekeeper_url)
requirements = TARGET.gatekeeper_url =!= UNDEFINED
jobmanager_type = Fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
==========================================
the job didn't run, and this is what showed up in the
GridmanagerLog file:
-----------------------------------------------------------------------
9/2 13:02:14 [20664] DaemonCore: Command received via UDP from host <host_ip:37272>
9/2 13:02:14 [20664] DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())
9/2 13:02:14 [20664] (99.0) doEvaluateState called: gmState GM_DELEGATE_PROXY, globusState 32
9/2 13:02:19 [20664] No jobs left, shutting down
9/2 13:02:19 [20664] Got SIGTERM. Performing graceful shutdown.
9/2 13:02:19 [20664] **** condor_gridmanager (condor_GRIDMANAGER) EXITING WITH STATUS 0
-----------------------------------------------------------------------
Do you get this same error if you replace $$(gatekeeper_url) with the
URL for the machine?
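For that test, something like the following should do. This is only a
sketch of your own submit file with the $$() substitution taken out:
the text in angle brackets is a placeholder that you need to replace
with the real URL for your machine, and I've dropped the requirements
line on the assumption that it is only there for match-making.
=============================================
universe = grid
grid_type = gt4
executable = /bin/hostname
log = test.log
output = test.output_$(Cluster)
error = test.error_$(Cluster)
# Placeholder, not a real value: put the machine's URL here literally
globusscheduler = <URL for the machine>
jobmanager_type = Fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
=============================================
If the job runs this way, the problem is most likely in the
match-making or the $$() substitution rather than on the Globus side.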
+----------------------------------+---------------------------------+
| Jaime Frey | Public Split on Whether |
| jfrey@xxxxxxxxxxx | Bush Is a Divider |
| http://www.cs.wisc.edu/~jfrey/ | -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+