[Condor-users] Condor-c "Detected Down Grid Resource"
Hi,
 
 I'm trying to get Condor-C to work.
 I have two one-node pools each running Condor version 6.7.19 under RH9.
 
 These machines do *not* have any Globus software installed. Am I right in saying that it is not necessary for grid_resource = condor?
 
 The pools are set up as per section 5.3.1 of the docs:
 submit side entries:
 CONDOR_GAHP=$(SBIN)/condor_c-gahp
C_GAHP_LOG=/tmp/CGAHPLog.$(USERNAME)
C_GAHP_WORKER_THREAD_LOG=/tmp/CGAHPWorkerLog.$(USERNAME)
execute side entries:
SEC_DEFAULT_NEGOTIATION = OPTIONAL
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
 When I submit a job I get these events in the job log file:
 
 ...
 000 (021.000.000) 06/14 14:07:32 Job submitted from host: <10.x.x.21:37339>
 ...
 020 (021.000.000) 06/14 14:07:55 Detected Down Globus Resource
     RM-Contact: mg10.x.y.z
 ...
 026 (021.000.000) 06/14 14:07:55 Detected Down Grid Resource
     GridResource: condor mg10x.y.z mg10x.y.z
 ...
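 
 For anyone who wants to pull these events out of a job log programmatically, here is a rough Python sketch. It assumes each event header has the layout seen in the excerpt above (three-digit event number, job id in parentheses, date, time, description); the regular expression and the function name parse_events are my own illustration and not part of Condor.

```python
import re

# Assumed layout of an event header line, based only on the excerpt above:
# "<3-digit event number> (<cluster>.<proc>.<subproc>) <MM/DD> <HH:MM:SS> <description>"
EVENT_HEADER = re.compile(
    r"^\s*(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) (.*)$"
)


def parse_events(log_text):
    """Return a list of dicts, one per event header line found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_HEADER.match(line)
        if match is None:
            continue  # "..." separator or an indented detail line
        number, cluster, proc, subproc, date, time, text = match.groups()
        events.append({
            "event": int(number),
            "job": (int(cluster), int(proc), int(subproc)),
            "date": date,
            "time": time,
            "text": text.strip(),
        })
    return events


# Sample input copied from the log excerpt in this message.
sample = """\
...
000 (021.000.000) 06/14 14:07:32 Job submitted from host: <10.x.x.21:37339>
...
020 (021.000.000) 06/14 14:07:55 Detected Down Globus Resource
    RM-Contact: mg10.x.y.z
...
"""

for event in parse_events(sample):
    print(event["event"], event["job"], event["text"])
```

 Filtering the returned list on the "text" field (for example, anything starting with "Detected Down") makes it easy to see when the resource was flagged as unreachable.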
 
 This is the submit file:
 ---------------------------------------------------------------------------------------
 universe = grid
 executable = test.sh
 output = test.out
 error = test.err
 log = test.log
 
 grid_resource = condor mg10.x.y.z mg10.x.y.z
 +remote_jobuniverse = 5
 +remote_requirements = True
 +remote_ShouldTransferFiles = "YES"
 +remote_WhenToTransferOutput = "ON_EXIT"
 queue
 ---------------------------------------------------------------------------------------
 
 The documentation for Condor-C
 [http://www.cs.wisc.edu/condor/manual/v6.7/5_3Grid_Universe.html#SECTION00631000000000000000]
 says, if I understand it correctly, that there should be a "remote_pool" entry in the submit file to tell Condor where to find the collector that will connect the submit machine's schedd with the execute machine's schedd.
 
 However the example submit file does not have a remote_pool entry.
 
 I don't get anything in either side's log files to suggest attempted execution
 or even communication, so I guess the "Detected Down" events mean that the pools are not finding each other at all.
 
 I can flock between these pools with no problem.
 
 If anyone has gotten Condor-C to work I'd like to hear from them, thanks.
 Also, if anyone can tell me whether I am interpreting the instructions in 5.3.1 correctly, I'd appreciate it.
 
 Cheers,
 O.C.