Re: [Condor-devel] Avoiding redundant executables in the SPOOL
- Date: Fri, 2 May 2008 02:30:43 +0100 (BST)
- From: Bruce Beckles <mbb10@xxxxxxxxx>
- Subject: Re: [Condor-devel] Avoiding redundant executables in the SPOOL
On Thu, 1 May 2008, Alan De Smet wrote:
A few general points:
- It will certainly be possible to disable this functionality.
That's good to know. :)
- The default configuration will use something at least as secure as
MD5 or SHA-1. This puts the odds of an accidental collision at
1 in 2^128. This is such a small chance that it's worth
treating as impossible.
Actually, the birthday problem tells us that you should expect an accidental
collision at odds of about 1 in 2^64 for MD5 and 1 in 2^80 for SHA-1. (For
an n-bit hash it's about 1 in 2^(n/2).)
But in any case this assumes that all possible output hashes with your
hash function are equally likely, i.e. the hash function is perfectly
balanced. Is this known to be true for both MD5 and SHA-1? If it is not
true, then the probability of accidental collision is higher. Admittedly,
even if the chance of accidental collision is worse than 1 in 2^64, it may
still be small enough not to worry about.
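For concreteness, here's a quick illustrative sketch (plain Python, nothing
to do with Condor's code) of the birthday approximation: with an n-bit hash
and k spooled executables, P(collision) is roughly 1 - exp(-k^2 / 2^(n+1)),
so collisions only become likely once k approaches 2^(n/2).

    import math

    def collision_probability(num_items, hash_bits):
        # Birthday approximation: P ~ 1 - exp(-k*(k-1) / (2 * 2^n)),
        # computed with expm1 so tiny probabilities aren't rounded to 0.
        n_outputs = 2.0 ** hash_bits
        return -math.expm1(-num_items * (num_items - 1) / (2.0 * n_outputs))

    # Even a huge pool of spooled executables gives a negligible chance of an
    # *accidental* MD5 collision (deliberate collisions are another matter):
    for k in (10**6, 10**9, 2**64):
        print(k, collision_probability(k, 128))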
One other thing has just occurred to me: I'm a bad user. I notice that a
good user G often runs the same executable, with hash XXXX. Next time
they don't have any jobs in the queue I put a malicious executable in
SPOOL called exe-G-CmdHashMD5-XXXX. (I hard link to it, so that the
schedd never gets rid of this hash path in the future, and I make sure
that its permissions are such that anyone can read it.)
When user G next submits a job with the executable whose hash is XXXX,
presumably they'll actually run my bad executable exe-G-CmdHashMD5-XXXX?
Does the schedd check the owner of the hash path? What are the
default permissions on the SPOOL directory (i.e. would users by default be
able to create files in SPOOL)?
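(For what it's worth, the kind of check I'd hope the schedd makes before
reusing an existing hash path is something like the sketch below - purely
hypothetical Python with a made-up spool_path and expected_uid, not anything
from Condor: refuse to reuse the file unless it's a plain file owned by the
expected account, not writable by others, and not already pinned by an extra
hard link.)

    import os
    import stat

    def safe_to_reuse(spool_path, expected_uid):
        # Hypothetical sanity check before reusing a hash path in SPOOL.
        st = os.lstat(spool_path)                    # don't follow symlinks
        if not stat.S_ISREG(st.st_mode):
            return False                             # not a plain file
        if st.st_uid != expected_uid:
            return False                             # someone else planted it
        if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            return False                             # others could modify it
        if st.st_nlink > 1:
            return False                             # pinned by a hard link
        return True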
Other comments inline below.
-- B
<snip>
I'll note that if you've got users submitting arbitrary
executables to a shared account, you've already got a risky
situation with very tricky configuration requirements. If those
executables have WRITE access to the schedd, on the same machine
or over the network, they will be able to modify other jobs with the
same Owner easily. And while Condor tries really hard to track a
user job, it is possible for a job to sneak a process out. That
rogue process can then go on to mess with later jobs that show up
under the same user. If your machines are willing to run two
jobs at the same time, it's possible for user A and B's jobs to
run at the same time and potentially mess with each other.
Dedicated run accounts can solve the last two issues, but the
jobs no longer run as the user, and this isn't the default. So
an admin with this environment is already facing a difficult
challenge.
Out of curiosity, is anyone aware of a pool run in this way? I'm
aware of front ends that submit all jobs as the same user, but
they run limited executables. I'm aware of front ends that allow
arbitrary executables, but they submit the jobs as different
Owners. Given the challenges of securing such a system, I'd be a
bit surprised to learn that someone was doing it.
Well, I guess the thing to say is that it's not that difficult to set up a
system with the following properties:
(a) Neither users nor job executables ever have any direct access to the
schedd - this might involve _not_ allowing standard universe jobs -
user interaction is mediated via a front-end;
(b) Either execute machines only run one job at a time, or jobs are run
under different dedicated accounts; and
(c) The environment on the execute host is sterilised before and after job
execution (either in the traditional way (processes killed, files
deleted, etc.) or by using virtual machines);
...if you've gone to all that trouble you don't want to find that a
security hole has been introduced by an apparently benign change in later
versions of Condor.
We already do (b) and (c), and, in hindsight, would have done (a) if we'd
thought about the implications back when we were starting out.
If we'd done (a), I think it's quite likely we would have had
all jobs submitted to Condor as the same Owner, because that would make our
lives so much easier in several regards; I accept that this might not be
the case for everyone.
That said, I agree, we don't want to have someone upgrade Condor
and introduce a security hole into their configuration. We'll
keep this in mind when we consider making this the default.
Either way, I'll make sure that you can turn this off. If it's
on by default, the upgrade notes will have a clear warning about
the change in behavior.
Cool.
I'd been mulling over letting you do your own hashing. This would
help in this case if you wanted the linking functionality. To
make up a syntax right now, instead of saying,
"CmdHashParts=CmdMD5", you could say
"CmdHashParts=CmdMD5,UserEmail" and add UserEmail or some other
identifying information for your user to the job. Exactly who
can specify the hash, the user, the admin, or both (admin
overriding user), I'm still considering.
I like that idea. I vote for such a feature. :)
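Just to make the idea concrete, something along these lines is what I
imagine (made-up names, plain Python, only a sketch of the suggested
CmdHashParts=CmdMD5,UserEmail syntax): the spool filename incorporates both
the executable's digest and whatever extra identifying attributes the admin
or user nominates, so two submissions only share a hash path if every
nominated part matches.

    import hashlib

    def spool_hash_name(owner, exe_path, extra_parts):
        # Sketch of a composite hash path name, e.g.
        #   exe-<owner>-CmdHashMD5-<digest>-<UserEmail>
        md5 = hashlib.md5()
        with open(exe_path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                md5.update(chunk)
        parts = ["exe", owner, "CmdHashMD5", md5.hexdigest()]
        parts.extend(extra_parts)        # e.g. ["g@example.org"]
        return "-".join(parts)

    # spool_hash_name("G", "/bin/true", ["g@example.org"])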
- I use "pool accounts" for my users, i.e. there is a collection of
accounts that get shared out between users depending on who is using my
resources at the time. User B submits an executable with the same hash
as user A, but which is _not_ the same executable as that submitted by
user A (hash collision). User B is/was using pool account ACC01. User
B leaves (temporarily or permanently), but doesn't bother to make sure
they've removed all their existing jobs, and user A now gets assigned
pool account ACC01.
We're talking about two different users who end up sharing the
same "Owner," yes? Because if the Owner is different, there is
no risk. If they are sharing an Owner, the bit about this being
potentially insecure above still applies.
Yes, "sharing" an Owner - the idea is that user B left a job in the queue
and stopped using the system. His account gets recycled to user A, but
the job in the queue doesn't get removed first. You could argue that this
is poor housekeeping on the sysadmin's part (and indeed it is), but I know
that it does happen in the real world.
- Multiple submitters to the same schedd (e.g. via Condor-C,
etc). If a bad person can masquerade as me on ANY of those
submitters they can submit a bad executable that will be run
instead of the one I actually submitted.
You might say: "but in that case you've got a problem anyway".
Yes, but previously the bad person could submit jobs and
possibly (depending on the set-up) affect my jobs currently in
the queue. Now, they can DEFINITELY affect my jobs by causing
them to run a bad executable. That bad executable might steal
the input data of my real jobs (which might be valuable) and
send it to the bad person.
If you have submit access and can spoof someone else, you have
qedit and rm access (they're all just WRITE access). Given that,
I can wreak havoc on your jobs today.
I think I didn't express myself very well (mentioning Condor-C was a red
herring). The idea is that you might have something between the different
submitters and the schedd that only allows job submission but not
condor_qedit, etc. I know that Condor doesn't let you do this, but I
could rig up something that does. So I could ensure that users can submit
jobs (from several different machines), but not directly talk to the
schedd. In essence, the scenario is multiple "front-ends" (maybe
"interfaces" is a better term) all talking to the same schedd - I guess
the security analysis is not much different from what's already been
discussed.
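A trivial sketch of what I mean by "something that only allows job
submission" (hypothetical Python; the only real thing in it is the
condor_submit command itself): the front-end runs condor_submit on the
user's behalf and simply never exposes condor_qedit or condor_rm, so users
never talk to the schedd directly.

    import subprocess
    import sys

    def submit_only(submit_file):
        # Hypothetical front-end: the only schedd operation performed on the
        # user's behalf is condor_submit; qedit/rm are simply never offered.
        result = subprocess.run(
            ["condor_submit", submit_file],
            capture_output=True, text=True, check=False,
        )
        if result.returncode != 0:
            sys.stderr.write(result.stderr)
        return result.returncode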
<snip>
If hard links seem to work, but don't work as assumed, then, as you
note, everything will still work; you'll just end up wasting CPU time and
disk space. I've not encountered a file system with such behavior,
but perhaps I'm lucky. Is this sort of madness common?
I don't _think_ it is common, but my colleagues here assure me that it
does exist. :) I would expect it to become more common as we get more and
more weird and wonderful filesystems appearing, many of which aren't
really filesystems as we know them, but something else entirely
masquerading as a filesystem, e.g. sshfs (although that straightforwardly
doesn't support hard links (yet)).
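(It's at least easy to test whether a given filesystem's hard links behave
as POSIX intends - a throwaway check along the lines of the Python below,
with a made-up test directory, is what I have in mind:)

    import os
    import tempfile

    def hard_links_work(directory):
        # Create a file and a hard link to it, then check that they really
        # are the same inode with a link count of 2.
        fd, original = tempfile.mkstemp(dir=directory)
        os.close(fd)
        link = original + ".link"
        try:
            os.link(original, link)
            a, b = os.stat(original), os.stat(link)
            return a.st_ino == b.st_ino and a.st_nlink == 2
        except OSError:
            return False                 # e.g. sshfs: link() not supported
        finally:
            for path in (original, link):
                try:
                    os.remove(path)
                except OSError:
                    pass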
<snip>
...so if a user can create a hard link to the hash path then
the ickpt file will never be unlinked. What are the
circumstances in which a user can do this?
If a user can create hard links to files in the SPOOL today, you
have the same problem. Nothing changes.
Except that all a user gains by doing this today is that they stop the
disk space from being freed when the schedd deletes the ickpt file.
Now, if they are a bad person who has constructed a hash-colliding
executable and got it into the queue, they can make sure that the hash
path stays around (presumably forever?), but without it being obvious to
anyone that this has happened (as would be the case if they've had to put
a job on hold - the other obvious way to make a bad executable hang around
in the SPOOL).
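(The mechanics here are just standard Unix link counts - a toy Python
illustration with made-up paths, nothing Condor-specific: once the user has
made their own hard link, the schedd's unlink of the ickpt/hash path removes
the name but not the data.)

    import os

    spool_copy = "/tmp/demo-ickpt"       # stands in for the schedd's ickpt file
    user_copy = "/tmp/demo-user-link"    # the hard link a user has made

    with open(spool_copy, "wb") as f:
        f.write(b"pretend executable")

    os.link(spool_copy, user_copy)       # user pins the file
    print(os.stat(spool_copy).st_nlink)  # 2

    os.unlink(spool_copy)                # schedd "deletes" the ickpt file
    print(os.path.exists(user_copy),     # True: the data is still on disk
          os.stat(user_copy).st_nlink)   # 1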
In particular, suppose a bad person gets hold of the user's account and
submits some jobs. They are discovered and the jobs are condor_rm'd and
the user changes their password, etc. If the bad person has hard linked
to the hash path then their bad executables will still be around, even
though the user and sysadmin may think everything is now okay.
--
Bruce Beckles,
e-Science Specialist,
University of Cambridge Computing Service.