Subject: [Condor-users] Fwd: Problem using Condor in GT4..
---------- Forwarded message ----------
From: Pushparajan V <vprajan@xxxxxxxxx>
Date: Jul 2, 2005 5:43 PM
Subject: Problem using Condor in GT4..
To: discuss@xxxxxxxxxx
Hi,
I have installed Condor and Globus successfully, and the Condor pool has
four Sun nodes. I followed the steps in the Globus documentation to
install the scheduler adapter for Condor. Now, when I try these commands,
I run into the following problems:
$ globusrun -f condor.rsl
The job is submitted successfully but never terminates.
If I use
$ globus-job-run localhost/jobmanager-condor /bin/date
it just hangs.
So I end up aborting the job in each case.
---------------------------------------------------------------------------------------
The RSL file is:
+
( &(resourceManagerContact="garl-sun1.serc.iisc.ernet.in/jobmanager-condor")
(count=1)
(label="subjob 0")
(environment=(GLOBUS_DUROC_SUBJOB_INDEX 0)
(LD_LIBRARY_PATH /usr/local1/gt4.0.0/lib/))
(directory="/bin")
(executable="/bin/date")
(stdout="/home/rajan/mpitest/condor.out")
(stderr="/home/rajan/mpitest/condor.err")
)
--------------------------------------------------------------------------
Here, garl-sun1.serc.iisc.ernet.in is the head node of the cluster, which runs the condor_collector.
What is happening?
I have checked globus-condor.log, the condor.pm script, and the jobmanager-condor files (all of them are untouched). The log file created by Condor contains the following:
-----------------------------------------------------------
<c>
<a n="MyType"><s>SubmitEvent</s></a>
<a n="EventTypeNumber"><i>0</i></a>
<a n="EventTime"><s>2005-07-02T14:33:19</s></a>
<a n="Cluster"><i>41</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="SubmitHost"><s><10.16.21.12:34003></s></a>
</c>
<c>
<a n="MyType"><s>JobAbortedEvent</s></a>
<a n="EventTypeNumber"><i>9</i></a>
<a n="EventTime"><s>2005-07-02T14:57:23</s></a>
<a n="Cluster"><i>41</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="Reason"><s>via condor_rm (by user rajan)</s></a>
</c>
--------------------------------------------------------------------
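To make the timeline in that log explicit, the two events can be pulled out with a short script. This is only a minimal sketch: it assumes the log has exactly the layout shown above, and the names parse_events and seconds_between are made up for illustration. It uses regular expressions rather than an XML parser because the SubmitHost value contains raw angle brackets, which a strict parser would likely reject.

```python
import re
from datetime import datetime

# Excerpt copied from the log above; in practice, read the log file instead.
SAMPLE_LOG = """<c>
<a n="MyType"><s>SubmitEvent</s></a>
<a n="EventTypeNumber"><i>0</i></a>
<a n="EventTime"><s>2005-07-02T14:33:19</s></a>
<a n="Cluster"><i>41</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="SubmitHost"><s><10.16.21.12:34003></s></a>
</c>
<c>
<a n="MyType"><s>JobAbortedEvent</s></a>
<a n="EventTypeNumber"><i>9</i></a>
<a n="EventTime"><s>2005-07-02T14:57:23</s></a>
<a n="Cluster"><i>41</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="Reason"><s>via condor_rm (by user rajan)</s></a>
</c>
"""

# One <c>...</c> block per event; one <a n="..."> element per attribute.
EVENT_RE = re.compile(r"<c>(.*?)</c>", re.S)
# <s> holds a string value, <i> an integer value (assumed from the excerpt).
ATTR_RE = re.compile(r'<a n="([^"]+)"><([si])>(.*?)</\2></a>', re.S)


def parse_events(text):
    """Return a list of dicts, one per event, in file order."""
    events = []
    for body in EVENT_RE.findall(text):
        event = {}
        for name, kind, value in ATTR_RE.findall(body):
            event[name] = int(value) if kind == "i" else value
        events.append(event)
    return events


def seconds_between(first, second):
    """Seconds elapsed between the EventTime fields of two events."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    start = datetime.strptime(first["EventTime"], fmt)
    end = datetime.strptime(second["EventTime"], fmt)
    return (end - start).total_seconds()


if __name__ == "__main__":
    events = parse_events(SAMPLE_LOG)
    for event in events:
        print(event["EventTime"], event["MyType"], event.get("Reason", ""))
    print("seconds between first and last event:",
          seconds_between(events[0], events[-1]))
```

On the excerpt above this finds only a SubmitEvent and a JobAbortedEvent, about 24 minutes apart, with no other event logged in between.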
The Condor log does not seem to contain any useful information. The GRAM
log also only says that I aborted the job execution. globusrun keeps
running without terminating. What should I do?
Is the Condor scheduler adapter of GT4 compatible with pre-WS RSL files?