
Re: [Condor-devel] Avoiding redundant executables in the SPOOL



On Thu, 1 May 2008, Alan De Smet wrote:

A few general points:

- It will certainly be possible to disable this functionality.

That's good to know. :)

- The default configuration will use something at least as secure as
 MD5 or SHA-1.  This puts the odds of an accidental collision at
 1 in 2^128.  This is such a small chance that it's worth
 treating as impossible.

Actually, the birthday problem tells us that an accidental collision becomes likely once you've hashed roughly 2^64 executables for MD5 and roughly 2^80 for SHA-1, not 2^128. (For an n-bit hash, collisions become likely after about 2^(n/2) inputs.)

But in any case this assumes that all possible output hashes from your hash function are equally likely, i.e. the hash function is perfectly balanced. Is this known to be true for both MD5 and SHA-1? If it is not, then the probability of an accidental collision is higher than the idealised figure. Admittedly, even if the chance of an accidental collision is somewhat worse than the birthday bound suggests, it may still be small enough not to worry about.
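
For concreteness, here's the standard birthday-bound approximation; this is just generic arithmetic, nothing Condor-specific, and the numbers are only meant to show how far a realistic pool of executables is from the 2^64 / 2^80 bounds:

    # Birthday-bound approximation: the probability of at least one collision
    # among k uniformly random n-bit hashes is roughly 1 - exp(-k*(k-1) / 2^(n+1)).
    import math

    def collision_probability(k, n_bits):
        """Approximate chance of any collision among k random n-bit hashes."""
        # expm1 keeps precision when the exponent is tiny
        return -math.expm1(-k * (k - 1) / 2.0 ** (n_bits + 1))

    print(collision_probability(10**9, 128))   # ~1.5e-21 for an MD5-sized hash
    print(collision_probability(10**9, 160))   # ~3.4e-31 for a SHA-1-sized hash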


One other thing has just occurred to me: I'm a bad user. I notice that a good user G often runs the same executable, with hash XXXX. The next time they don't have any jobs in the queue, I put a malicious executable into the SPOOL called exe-G-CmdHashMD5-XXXX. (I hard link to it, so that the schedd never gets rid of this hash path in the future, and I make sure that its permissions are such that anyone can read it.)

When user G next submits a job with the executable whose hash is XXXX, presumably they'll actually run my bad executable exe-G-CmdHashMD5-XXXX? Does the schedd check the owner of the hash path? What are the default permissions on the SPOOL directory (i.e. would users by default be able to create files in SPOOL)?
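
Just to make that question concrete, the kind of check I'd hope for is something along these lines; this is purely a hypothetical sketch of what "checking the owner of the hash path" could mean, not a claim about what the schedd actually does:

    # Hypothetical pre-reuse check on an existing hash path in the SPOOL:
    # only reuse the spooled copy if it's a plain file, owned by the expected
    # user, and not writable by group or other.
    import os
    import stat

    def safe_to_reuse(hash_path, expected_uid):
        st = os.lstat(hash_path)                      # don't follow symlinks
        if not stat.S_ISREG(st.st_mode):
            return False                              # not a plain file
        if st.st_uid != expected_uid:
            return False                              # wrong owner
        if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            return False                              # group/world writable
        return True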

Other comments inline below.

	-- B


<snip>
I'll note that if you've got users submitting arbitrary
executables to a shared account, you've already got a risky
situation with very tricky configuration requirements.  If those
executables have WRITE access to the schedd, on the same machine
or over the network, they will be able to modify other jobs with the
same Owner easily.  And while Condor tries really hard to track a
user job, it is possible for a job to sneak a process out.  That
rogue process can then go on to mess with later jobs that show up
under the same user.  If your machines are willing to run two
jobs at the same time, it's possible for user A and B's jobs to
run at the same time and potentially mess with each other.
Dedicated run accounts can solve the last two issues, but the
jobs no longer run as the user, and this isn't the default.  So
an admin with this environment is already facing a difficult
challenge.

Out of curiosity, is anyone aware of a pool run in this way?  I'm
aware of front ends that submit all jobs as the same user, but
they run limited executables.  I'm aware of front ends that allow
arbitrary executables, but they submit the jobs as different
Owners.  Given the challenges of securing such a system, I'd be a
bit surprised to learn that someone was doing it.

Well, I guess the thing to say is that it's not that difficult to set up a system with the following properties:

(a) Neither users nor job executables ever have any direct access to the
    schedd - this might involve _not_ allowing standard universe jobs -
    user interaction is mediated via a front-end;

(b) Either execute machines only run one job at a time, or jobs are run
    under different dedicated accounts; and

(c) The environment on the execute host is sterilised before and after job
    execution (either in the traditional way (processes killed, files
    deleted, etc.) or by using virtual machines);

...if you've gone to all that trouble you don't want to find that a security hole has been introduced by an apparently benign change in later versions of Condor.

We already do (b) and (c), and, in hindsight, would have done (a) if we'd thought about the implications back when we were starting out.

If we'd done (a), it's quite likely that we would have had all jobs submitted to Condor as the same Owner, because that would make our lives so much easier in several regards; I accept that this might not be the case for everyone.


That said, I agree, we don't want to have someone upgrade Condor
and introduce a security hole into their configuration.  We'll
keep this in mind when we consider making this the default.
Either way, I'll make sure that you can turn this off.  If it's
on by default, the upgrade notes will have a clear warning about
the change in behavior.

Cool.


I'd been mulling letting you do your own hashing.  This would
help in this case if you wanted the linking functionality.  To
make up a syntax right now, instead of saying,
"CmdHashParts=CmdMD5", you could say
"CmdHashParts=CmdMD5,UserEmail" and add UserEmail or some other
identifying information for your user to the job.  Exactly who
can specify the hash, the user, the admin, or both (admin
overriding user), I'm still considering.

I like that idea.  I vote for such a feature. :)
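
Just to check I've understood the proposal: my reading is that adding UserEmail (or whatever identifying attribute) to the hash input simply gives each user their own hash namespace, so two users never share a hash path by accident. A rough sketch of that reading, using the made-up names from this thread:

    # Sketch of how I read CmdHashParts=CmdMD5,UserEmail: the final hash
    # covers both the executable's MD5 digest and an identifying attribute,
    # so identical executables from different "users" get different paths.
    import hashlib

    def cmd_hash(executable_path, user_email):
        file_md5 = hashlib.md5()
        with open(executable_path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                file_md5.update(chunk)
        combined = hashlib.md5()
        combined.update(file_md5.hexdigest().encode())   # the CmdMD5 part
        combined.update(user_email.encode())             # the UserEmail part
        return combined.hexdigest()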


- I use "pool accounts" for my users, i.e. there is a collection of
   accounts that get shared out between users depending on who is using my
   resources at the time.  User B submits an executable with the same hash
   as user A, but which is _not_ the same executable as that submitted by
   user A (hash collision).  User B is/was using pool account ACC01.  User
   B leaves (temporarily or permanently), but doesn't bother to make sure
   they've removed all their existing jobs, and user A now gets assigned
   pool account ACC01.

We're talking about two different users who end up sharing the
same "Owner," yes?  Because if the Owner is different, there is
no risk.  If they are sharing an Owner, the bit about this being
potentially insecure above still applies.

Yes, "sharing" an Owner - the idea is that user B left a job in the queue and stopped using the system. His account gets recycled to user A, but the job in the queue doesn't get removed first. You could argue that this is poor housekeeping on the sysadmin's part (and indeed it is), but I know that it does happen in the real world.


- Multiple submitters to the same schedd (e.g. via Condor-C,
etc).  If a bad person can masquerade as me on ANY of those
submitters they can submit a bad executable that will be run
instead of the one I actually submitted.

You might say: "but in that case you've got a problem anyway".
Yes, but previously the bad person could submit jobs and
possibly (depending on the set-up) affect my jobs currently in
the queue.  Now, they can DEFINITELY affect my jobs by causing
them to run a bad executable.  That bad executable might steal
the input data of my real jobs (which might be valuable) and
send it to the bad person.

If you have submit access and can spoof someone else, you have
qedit and rm access (they're all just WRITE access).  Given that,
I can wreak havoc on your jobs today.

I think I didn't express myself very well (mentioning Condor-C was a red herring). The idea is that you might have something between the different submitters and the schedd that only allows job submission but not condor_qedit, etc. I know that Condor doesn't let you do this, but I could rig up something that does. So I could ensure that users can submit jobs (from several different machines) but not directly talk to the schedd. In essence, the scenario is multiple "front-ends" (maybe "interfaces" is a better term) all talking to the same schedd - I guess the security analysis is not much different from what's already been discussed.


<snip>
If hard links seem to work but don't behave as assumed then, as you
note, everything will still work; you'll just end up wasting CPU time
and disk space.  I've not encountered a file system with such behavior,
but perhaps I'm lucky.  Is this sort of madness common?

I don't _think_ it is common, but my colleagues here assure me that it does exist. :) I would expect it to become more common as we get more and more weird and wonderful filesystems appearing, many of which aren't really filesystems as we know them, but something else entirely masquerading as a filesystem, e.g. sshfs (although that straightforwardly doesn't support hard links (yet)).
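
For what it's worth, a crude way to probe whether a given filesystem really gives you hard-link semantics, rather than quietly copying or simply refusing, is to create a link and compare inode numbers and link counts. A sketch (the probe filename is just made up):

    # Probe for real hard-link semantics on the filesystem holding 'path':
    # after os.link(), both names should report the same inode and a link
    # count of at least 2; a filesystem faking links with copies won't.
    import os

    def supports_hard_links(path):
        probe = path + ".hardlink-probe"
        try:
            os.link(path, probe)
        except OSError:
            return False                  # e.g. sshfs: link() not supported
        try:
            a, b = os.stat(path), os.stat(probe)
            return a.st_ino == b.st_ino and a.st_nlink >= 2
        finally:
            os.unlink(probe)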


<snip>
...so if a user can create a hard link to the hash path then
the ickpt file will never be unlinked.  What are the
circumstances in which a user can do this?

If a user can create hard links to files in the SPOOL today you
have the same problem.  Nothing changes.

Except that all a user gains by doing this today is that they stop the disk space from being freed when the schedd deletes the ickpt file.

Now, if they are a bad person who has constructed a hash-colliding executable and got it into the queue, they can make sure that the hash path stays around (presumably forever?), but without it being obvious to anyone that this has happened (as it would be if they'd had to put a job on hold - the other obvious way to make a bad executable hang around in the SPOOL).

In particular, suppose a bad person gets hold of the user's account and submits some jobs. They are discovered and the jobs are condor_rm'd and the user changes their password, etc. If the bad person has hard linked to the hash path then their bad executables will still be around, even though the user and sysadmin may think everything is now okay.
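
Which suggests one mitigation a sysadmin (or the schedd itself) could in principle apply: look for spooled executables whose link count is greater than one, since an extra hard link is exactly the signature of this trick. A hypothetical audit along those lines (the SPOOL path shown is only an example):

    # Hypothetical audit: flag files in the SPOOL that have extra hard links
    # (st_nlink > 1), i.e. some other directory entry is keeping them alive.
    import os
    import stat

    def find_multiply_linked(spool_dir):
        suspicious = []
        for name in os.listdir(spool_dir):
            full = os.path.join(spool_dir, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                suspicious.append((full, st.st_nlink))
        return suspicious

    # e.g.:  for path, n in find_multiply_linked("/var/lib/condor/spool"):
    #            print(path, "has", n, "links")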


--
Bruce Beckles,
e-Science Specialist,
University of Cambridge Computing Service.