HTCondor Project List Archives




Re: [Condor-devel] Workfetch Architecture Questions



To address issue #1 below, I've created a patch that sets the IWD to the temporary execute directory if the ClassAd does not specify one. Let me know if you see any problems or think it needs modifications.

Rob

Derek Wright wrote:
[Replying to the condor-devel list, which is the better place to discuss development questions like this, and which is also archived for posterity...]

On May 9, 2008, at 6:57 AM, Robert Rati wrote:

1) I brought up the issue with the IWD earlier, and found that if I specify "." as the IWD then jobs will run. While this will work I suppose, it doesn't really feel right.

Agreed.

I had a couple of ideas on how to solve this, but I think I should be consistent with how IWD is handled for other condor jobs and want to see which of these ideas is more consistent (or if there is a better way).

Right. ;)

A) Make IWD optional, and if IWD isn't provided in the ClassAd, then set it to "."


More or less, yes. I'd actually set it to the full path to the temp sandbox directory, but effectively that's the same thing.

B) Make the IWD relative to the temporary execute directory condor creates for the job (and thus where the prepare_job hook is run).

No. That'd be confusing and inconsistent with how these things are handled in other cases.

The latter seems to make more sense to me, but is that how IWD is handled for ClassAds from other sources? Any better ways?

I think (A) makes the most sense: just make the IWD optional and default to the temp sandbox directory if you don't define it.
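
[Editor's sketch] Option (A) can be illustrated outside the starter with a minimal, self-contained example. It is only a sketch under stated assumptions: `JobAd` is a hypothetical stand-in for a job ClassAd (a plain string map, not the real ClassAd API), `resolve_iwd` is a made-up helper name, and the attribute name "Iwd" is assumed to be what ATTR_JOB_IWD expands to.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a job ClassAd: attribute name -> string value.
// The real starter uses the ClassAd class; this map is for illustration only.
using JobAd = std::map<std::string, std::string>;

// Returns the job's IWD. If the ad does not define one, the full path of the
// temporary sandbox directory is inserted as the default and returned.
std::string resolve_iwd(JobAd& ad, const std::string& sandbox_dir) {
    auto it = ad.find("Iwd");  // assumed value of ATTR_JOB_IWD
    if (it == ad.end()) {
        it = ad.emplace("Iwd", sandbox_dir).first;
    }
    return it->second;
}
```

An IWD that is already present in the ad is left untouched, so only ads that omit it fall back to the sandbox directory.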


2) Condor doesn't ensure that the execute bit is set on the file listed as the Cmd in the ClassAd. I understand that condor ensures this with work received in other ways?

Only if Condor does the file transfer itself.

If so, should it be the responsibility of the prepare_work hook script to ensure that the execute bit is set, or should condor do that itself? Again, I think the latter from a consistency standpoint.

Personally, I'd vote that's your problem in prepare_work if you're not sure your executable is already executable and/or you just transferred it yourself. From a security standpoint, I'd be very uneasy about having the starter going off and chmod'ing arbitrary paths on the filesystem that are coming in via job ClassAds and then trying to exec those.

Granted, the starter should do a better job of propagating errors (#include "we-need-hook_starter_failure.h") including "duh, that file isn't executable", but I don't think the starter should be chmod'ing anything it didn't transfer itself. This would also be a useful thing to mention in the hook documentation: "Note, if you transfer your own executable in hook_prepare_work, be sure to chmod it to 755 (or equivalent on windoze) so that the starter can execute it."

Cheers,
-Derek


p.s. Sorry for the delayed reply, you sent this while I was out all last week sick, and it got buried in my overflowing inbox.


diff --git a/RHEL-5-MRG/condor-7.0.1/src/condor_starter.V6.1/jic_local.C b/RHEL-5-MRG/condor-7.0.1/src/condor_starter.V6.1/jic_local.C
index 121d6ce..37a6d9b 100644
--- a/RHEL-5-MRG/condor-7.0.1/src/condor_starter.V6.1/jic_local.C
+++ b/RHEL-5-MRG/condor-7.0.1/src/condor_starter.V6.1/jic_local.C
@@ -350,9 +350,14 @@ JICLocal::initJobInfo( void )
 
 		// stash the iwd name in orig_job_iwd
 	if( ! job_ad->LookupString(ATTR_JOB_IWD, &orig_job_iwd) ) {
-		dprintf( D_ALWAYS, "Error in JICLocal::initJobInfo(): "
-				 "Can't find %s in job ad\n", ATTR_JOB_IWD );
-		return false;
+		dprintf( D_ALWAYS, "%s not found in job ad.  Setting to %s\n",
+				ATTR_JOB_IWD, Starter->GetWorkingDir() );
+		MyString temp_iwd;
+		temp_iwd += ATTR_JOB_IWD;
+		temp_iwd += "=\"";
+		temp_iwd += Starter->GetWorkingDir();
+		temp_iwd += '"';
+		job_ad->Insert( temp_iwd.Value() );
 	} else {
 			// put the orig job iwd in class ad
 		dprintf(D_ALWAYS, "setting the orig job iwd in starter\n");