HTCondor Project List Archives




[Condor-devel] Avoiding redundant executables in the SPOOL



The problem:

We have actual users using Condor-C to submit hundreds or
thousands of jobs with the same executable.  For Condor-C
copy_to_spool must be true.  Furthermore, for Condor-C, we
can't use clusters to share an executable; each job becomes a new
cluster.  You end up with one copy of the executable per job.
Given a moderately large executable, you can end up with
gigabytes of duplicate executables.  This is a waste of disk
space and bandwidth.


The goal: 

When copy_to_spool=true, avoid multiple copies of the same
executable in the SPOOL.  

(The more general goal would be to avoid multiple copies of any
input file in the SPOOL, but that's a harder problem as the input
files can potentially be modified.)


The plan:

1. condor_submit: Note the size of the executable, the last
   modified time stamp on the executable, and the MD5 hash of the
   executable.

2. condor_submit: Write the executable size, time stamp, and hash
   into our own classad as executable_chksum_size,
   executable_chksum_timestamp, and executable_chksum_md5.
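Steps 1 and 2 might look roughly like this on the condor_submit
side; a sketch only, since the real submit code is C++ and the
helper name here is made up, but the attribute names are the ones
proposed above:

```python
import hashlib
import os

def executable_chksum_attrs(path):
    """Compute the proposed executable_chksum_* job-ad attributes."""
    st = os.stat(path)
    md5 = hashlib.md5()
    # Hash in chunks so a large executable doesn't get slurped
    # into memory all at once.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    return {
        "executable_chksum_size": st.st_size,
        "executable_chksum_timestamp": int(st.st_mtime),
        "executable_chksum_md5": md5.hexdigest(),
    }
```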

3. condor_schedd: Is this functionality disabled?  Skip to step FAIL.

4. condor_schedd: Query for jobs that match the user name, size,
   and hash.  That is, look for jobs that match 'Owner == "adesmet"
   && executable_chksum_size == 123456 && executable_chksum_md5 ==
   "2b00042f7481c7b056c4b410d28f33cf"'.  (If necessary, add in a
   test on Cluster/Proc to avoid matching your own classad.)  If
   no jobs match, skip to step FAIL.  If some of the expected
   attributes aren't present in the new job (e.g. because this job
   arrived via a Condor-C not configured to calculate the MD5
   hash), skip to step FAIL.

5. condor_schedd: From the first job you found, note the path to the executable
   in the SPOOL.

6. condor_schedd: link() from the executable found in step 5 to the new
   location.  If link() fails, continue to FAIL.  Otherwise,
   you're done, stop!

FAIL. condor_schedd/submit: It didn't work out.  Copy the
   executable into the SPOOL as is currently done.
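The schedd side of the plan (steps 3 through 6, falling through to
FAIL) can be sketched like so; the queue and spool-path plumbing
here are hypothetical stand-ins for schedd internals:

```python
import os
import shutil

def spool_executable(new_ad, queue, src_path, dst_path, enabled=True):
    """Hard-link to a matching already-spooled executable, else copy."""
    keys = ("executable_chksum_size", "executable_chksum_md5")
    # Steps 3-4: feature disabled, or expected attributes missing
    # (e.g. a Condor-C peer that didn't compute them) -> FAIL.
    if enabled and all(k in new_ad for k in keys):
        for ad in queue:
            if ad is new_ad:
                continue  # don't match our own classad
            if (ad.get("Owner") == new_ad.get("Owner")
                    and all(ad.get(k) == new_ad[k] for k in keys)):
                try:
                    # Steps 5-6: link to the first match's spooled copy.
                    os.link(ad["SpoolPath"], dst_path)
                    return "linked"
                except OSError:
                    break  # link() failed (e.g. Windows): FAIL
    # FAIL: copy the executable into the SPOOL as before.
    shutil.copyfile(src_path, dst_path)
    return "copied"
```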


Further thoughts:


Certain classad attributes are protected from editing.  I do
_not_ plan on protecting executable_chksum_*.  Since Owner is
already protected, a user can only corrupt their own jobs.  The
benefit of linking between users seems minor enough to not bother
with.

We're using hard links to track copies because it frees the
schedd from having to do so.  A problem with tracking usage in
the schedd could lead to deleting an in-use executable, or
leaving an unused executable behind.  This assumes we can
successfully create hard links within the SPOOL directory.  This
won't work on Windows; I think this is acceptable for now.  It also
might not work on some obscure file systems; I think this is
acceptable in general.  The failure case in both cases is to
continue the old way.

This adds the cost of running MD5 against each executable, a
potentially expensive operation for a large executable.  Thus,
one can disable it.  We're doing the checksum in condor_submit to
keep from overloading the schedd.  This means that an
ill-configured condor_submit can skip this process, filling the
poor schedd's SPOOL with duplicates.  This is unfortunate, but the
only other option is to spam the poor schedd with MD5 calculations.

The naive implementation of step 4 requires spinning over
potentially the entire queue.  This can be slow.  If you, say,
have 1,000 jobs queued up and add 1,000 more, you're examining
over 1,000,000 ClassAds.  If this proves to be a problem, adding
an index of the checksums should be relatively easy.

The configuration could be as simple as
"EXECUTABLE_CHKSUM_OPTIMIZATION=FALSE".  This would disable
calculating the MD5 sum in submit, but nothing else.  The
remainder of the design gracefully falls back on the old
behavior.

The configuration setting could be cleverer, letting the schedd
admin specify which checksums to compare.  Checksums we're not
using wouldn't be calculated, allowing admins willing to live
fast and risky to avoid the cost of MD5 while getting the
benefits of linking.  Furthermore, you could list arbitrary
attributes in the job ad, so users could, say, precalculate a
SHA1 and insert it into their job ads.  It also means that we can
add new checksum methods in the future, if we find something
faster than MD5 but still secure, or we discover we need
something more secure.  So:

 # The default: no checksums, disables this linking trick
 # entirely.
 EXECUTABLE_CHKSUMS=
 
 # Check MD5.  Slow, but safe.  You can add
 # executable_chksum_size for extra paranoia.
 EXECUTABLE_CHKSUMS=executable_chksum_md5
 
 # Path, size and date only.  Very fast, moderately risky.
 EXECUTABLE_CHKSUMS=cmd, executable_chksum_size, \
     executable_chksum_timestamp

 # Size only.  Very fast, very risky, but allows sharing an
 # executable even if, say, it's been copied into multiple
 # locations with different timestamps.  Madness for any but
 # highly specialized installations.
 EXECUTABLE_CHKSUMS=executable_chksum_size

If we're feeling really clever, we could have two lists:
SUBMIT_CHKSUMS_TO_CALCULATE and SCHEDD_CHKSUMS_TO_CHECK.
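Turning a configured attribute list into the step-4 match
expression might look like this; the expression shape follows the
example in step 4, but the helper itself is hypothetical:

```python
def build_constraint(chksum_attrs, new_ad):
    """Build the step-4 match expression from the configured
    EXECUTABLE_CHKSUMS attribute list, or return None to mean
    'fall back to FAIL' (feature disabled or attribute missing)."""
    if not chksum_attrs:
        return None  # empty list: linking disabled entirely
    terms = ['Owner == "%s"' % new_ad["Owner"]]
    for attr in chksum_attrs:
        val = new_ad.get(attr)
        if val is None:
            return None  # expected attribute absent: FAIL
        if isinstance(val, str):
            terms.append('%s == "%s"' % (attr, val))
        else:
            terms.append('%s == %s' % (attr, val))
    return " && ".join(terms)
```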

Before I start seriously hacking on this, any feedback?  Are
there any obvious problems?  Am I missing some obvious better
solution?

-- 
Alan De Smet                              Condor Project Research
adesmet@xxxxxxxxxxx                http://www.cs.wisc.edu/condor/