Re: [HTCondor-devel] Future of PrivSep, interested in feedback/opinions


Date: Tue, 23 Apr 2013 12:41:19 -0500
From: Brian Bockelman <bbockelm@xxxxxxxxxxx>
Subject: Re: [HTCondor-devel] Future of PrivSep, interested in feedback/opinions
On Apr 23, 2013, at 12:16 PM, Igor Sfiligoi <sfiligoi@xxxxxxxx> wrote:

> Hi Todd.
> 
> PrivSep is a big cousin of the glexec operation mode, right?

They're more siblings.  If you implement feature A with PrivSep, then you have to do a second implementation for glexec.  Since they're related, it's likely the second implementation is easier, but there's still a second implementation.

Take, for example, condor_tail.  To implement it, you could write a helper executable that you point at the running job's sandbox and that returns the appropriate information to the starter over a pipe, then execute the helper using PrivSep/glexec.  The two implementations probably have a 90% overlap.
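
To make the helper idea concrete, here is a rough sketch in Python. It is not HTCondor code: `HELPER_CODE`, `fetch_tail`, and the `wrapper` argument are hypothetical names invented for illustration. `wrapper` stands in for whatever argv prefix would launch the helper under PrivSep or glexec, and an empty list just runs the helper directly as the current user. The helper refuses paths that resolve outside the sandbox and writes the tail of the file to its stdout, which the caller reads as a pipe.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical helper, kept as a string so the sketch is a single file.
# It runs as the job's user and only ever reads inside the sandbox.
HELPER_CODE = r"""
import os, sys
sandbox, name, max_bytes = sys.argv[1], sys.argv[2], int(sys.argv[3])
root = os.path.realpath(sandbox)
path = os.path.realpath(os.path.join(root, name))
# Refuse anything that resolves outside the sandbox (e.g. "../x" or a symlink).
if os.path.commonpath([root, path]) != root:
    sys.exit(2)
with open(path, "rb") as f:
    f.seek(0, os.SEEK_END)
    start = max(0, f.tell() - max_bytes)
    f.seek(start)
    sys.stdout.buffer.write(f.read())
"""


def fetch_tail(wrapper, sandbox, name, max_bytes=1024):
    """Starter side: launch the helper behind `wrapper` and read its stdout pipe.

    `wrapper` is an assumed argv prefix for the privilege-switching tool;
    only this prefix would differ between the two back ends.
    """
    cmd = list(wrapper) + [sys.executable, "-c", HELPER_CODE,
                           sandbox, name, str(max_bytes)]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    if proc.returncode != 0:
        raise RuntimeError("tail helper failed with exit code %d" % proc.returncode)
    return proc.stdout


if __name__ == "__main__":
    # Demo with a throwaway directory standing in for the job sandbox.
    with tempfile.TemporaryDirectory() as sandbox:
        with open(os.path.join(sandbox, "job.out"), "wb") as f:
            f.write(b"line1\nline2\nline3\n")
        print(fetch_tail([], sandbox, "job.out", max_bytes=6))
```

In this sketch the helper and the pipe handling are shared, and only the `wrapper` prefix would change between a PrivSep and a glexec launch, which is roughly where the overlap comes from.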

For the container work, the changes would have to go into the setuid daemon itself.  There would be no commonality between the glexec implementation and the PrivSep implementation.

> And the OSG VOs need the glexec to work to the best of its options.
> I.e. glideins need something along the lines of PrivSep, since running as root is not an option, but we still want privilege separation.
> 
> So, I think you should go for (1)...
> and actually push it a little further and make sure everything works in a "PrivSep"-like mode, which includes glexec integration.
> 

Why not use (2)?  Continue supporting existing functionality, but don't target new functionality.

Brian

> Cheers,
>  Igor
> 
> PS: Pure user-level containers are still too far away to count on them exclusively.
> 
> On 04/23/2013 10:10 AM, Todd Tannenbaum wrote:
>> 
>> The list of things that do not work properly if you are running your execute nodes with PrivSep enabled keeps growing.  Off the top of my head, I boldly claim that condor_tail, job x509 proxy updates (gt #104), upcoming Lark work, and most of the job container functionality incl cpu affinity and cgroup containers (limits on memory, process tracking, pid namespace, file namespace) do not work.  Do folks agree?  Esp re the job container stuff, I am not positive, but I think it is likely borked w/ PrivSep.
>> 
>> We have three options: (1) make everything work with PrivSep, (2) document all the stuff that stops working if you enable PrivSep and let admins do the risk/benefit analysis themselves, or (3) get rid of PrivSep.
>> 
>> Option #1 : I have a feeling for what it would take to fix proxy updates (couple days) and condor_tail (few days), but no good feeling re the job container work.  At first blush it seems like a really big task (many weeks? a few months when all said and done?).  Not sure it is worth months.
>> 
>> Option #2 : Seems like another example of 'punt to the user'.  I think most admins would opt for the job container stuff over priv sep.
>> 
>> Option #3 : If we got rid of PrivSep, it would reduce the number of code paths we have to keep supporting and testing.  Plus, would anyone miss it?  Is anyone beyond UW-Madison even using PrivSep (just sent that question out to condor-users)?  My guess is until it is "on by default", very few places will ever use it, and again I think most folks would rather see the container stuff on by default over privsep. Long term, the container stuff may be available to non-root (in RHEL 8 or so), which makes the motivation for PrivSep in general less relevant - HTCondor could run as the same user as all the jobs, and containers would prevent jobs from tampering w/ the HTCondor daemons or other jobs.
>> 
>> Thoughts? Comments? Am I misunderstanding something?
>> 
>> Thanks
>> Todd
>> 
> 
