On Dec 27, 2006, at 3:43 PM, Dan Bradley wrote:
Unfortunately, COD doesn't currently support transferring any files, including the x509 proxy file. Is it possible for you to rely upon a shared filesystem for this purpose?
No, unfortunately, because the Condor installation is glide-in based (most probably the strongest use case for having gLExec in place). Also, I don't know how the starter searches for the X509 proxy. Is this done using the shadow on the head node? Or should the X509 proxy be somewhere on the worker?
I haven't thought through what permissions would be necessary in order for this to work for gLExec.
AFAIK, a COD is just a special kind of job with short latency, because there is no negotiation and preemption is always on. It should probably work like any other job when gLExec is active, so the startd is doing the right thing. Without knowing the internals of the mechanism, I can't say whether it just needs an env variable in place to work or something else.
Can you or someone else confirm that the COD feature is not yet ready to be used in conjunction with gLExec? I just want to be sure that there is nothing else I can do from my side.
Renzo
--Dan

Renzo Borgatti wrote:
Thanks Dan. I added both 'IWD' and 'User' without success. But I found a startd not crashing and a StartdLog more verbose:

    12/27 14:33:52 DaemonCore: Command received via TCP from condor@fcdfcaf445 from host <131.225.240.106:35843>
    12/27 14:33:52 DaemonCore: received command 1000 (CA_AUTH_CMD), calling handler (command_classad_handler)
    12/27 14:33:52 Serving request for CA_ACTIVATE_CLAIM by user 'condor'
    12/27 14:33:52 vm2: State change: Suspending because a COD job is now running
    12/27 14:33:52 vm2: Changing activity: Retiring -> Suspended
    12/27 14:33:52 vm2: cannot use glexec to spawn starter: no proxy (is GLEXEC_STARTER set in the shadow?)
    12/27 14:33:52 vm2: writeJobAd: Write_Pipe failed
    12/27 14:33:52 vm2: ERROR: exec_starter returned 0

Looks like gLExec activation is used also to activate COD. I didn't mention before that gLExec is active in my configuration. The error has something to do with the X509 proxy not present. Is the mechanism to transport X509 the same as universe=grid jobs? Is it possible to specify with what X509 proxy the COD should run under?

Thanks
Renzo

On Dec 27, 2006, at 12:29 PM, Dan Bradley wrote:
Hello,
I have a hunch that some of the ClassAd attributes that the COD manual claims are optional are actually required.
--Dan

Renzo Borgatti wrote:
Hi,
I have a problem activating claims using COD (Condor 6.9.0). This is what I'm doing:

    condor_cod request -addr "<131.225.212.148:39446>" -classad ci.out
    Successfully sent CA_REQUEST_CLAIM to startd at <131.225.212.148:39446>
    Result ClassAd written to ci.out
    ID of new claim is: "<131.225.212.148:39446>#1167216341#4"

    condor_cod activate -id "<131.225.212.148:39446>#1167216341#4" -classad ci.out -jobad TestCod
    Attempt to send CA_ACTIVATE_CLAIM to startd <131.225.212.148:39446> failed
    Reply ClassAd returned 'Failure' but does not have the ErrorString attribute

On the worker node, I can see the following two lines in the StartdLog right before crashing:

    12/27 11:50:05 DaemonCore: Command received via TCP from condor@fcdfcaf444 from host <131.225.240.106:45123>
    12/27 11:50:05 DaemonCore: received command 1000 (CA_AUTH_CMD), calling handler (command_classad_handler)

while in the MasterLog:

    12/27 11:55:30 The STARTD (pid 15721) died due to signal 11
    12/27 11:55:30 All daemons are gone. Exiting.
    12/27 11:55:32 **** condor_master (condor_MASTER) EXITING WITH STATUS 0

TestCod is a file with the following 2 lines:

    Cmd="/bin/ps"
    Args="-aux"

Am I using condor_cod the right way? Is there a way to have more debugging information to understand what happened?

Thanks
Renzo
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a subject: Unsubscribe
You can also unsubscribe by visiting https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at either
https://lists.cs.wisc.edu/archive/condor-users/
http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR
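
For anyone retracing this thread: the TestCod job ad above has only Cmd and Args, and the follow-up mentions adding 'IWD' and 'User'. A minimal sketch of generating such a file is below. The helper name format_job_ad and the IWD and User values are assumptions made up for illustration (the thread does not show the values that were used), and whether these attributes are really required is only the hunch expressed above, not something confirmed here.

```python
def format_job_ad(attrs):
    """Render a dict as one Name="value" line per attribute,
    in the same style as the two-line TestCod file in the thread."""
    lines = []
    for name, value in attrs.items():
        text = str(value)
        # Keep the sketch simple: refuse values that would need escaping
        # instead of guessing at the quoting rules.
        if '"' in text or "\n" in text:
            raise ValueError(f"unsupported character in value for {name}")
        lines.append(f'{name}="{text}"')
    return "\n".join(lines) + "\n"


# Cmd and Args are taken from the thread; IWD and User are placeholder
# values (assumptions), to be replaced with real ones for the pool.
job_ad = {
    "Cmd": "/bin/ps",
    "Args": "-aux",
    "IWD": "/tmp",
    "User": "someuser@example.org",
}

if __name__ == "__main__":
    # Print the ad; redirect to a file (e.g. TestCod) to pass it via -jobad.
    print(format_job_ad(job_ad), end="")
```

The output can be redirected to the file passed to condor_cod activate with -jobad, as in the commands quoted above. As the later StartdLog shows, adding the attributes only got past the startd crash; the missing-proxy error from gLExec is a separate problem.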