HTCondor Project List Archives




Re: [Condor-devel] Building a super-procd



Hi Todd,

Sorry for the delay - went off traveling.

There are two issues:
1) I don't think we can reliably detect an unused GID range.  Other programs also use GID ranges, and there's no way to detect how the sysadmin has configured them.
2) There are still simple attacks possible when using GID-based tracking.  It's simply not safe against malicious code.

That said, I did some mental exercises to determine what it would take to integrate the connector API into the procd.  Refactoring the procd into a non-blocking event loop is simply more effort than I have available.  However, it would be cheaper to add a helper thread that talks to the kernel (with all the requisite downsides of having pthreads in the procd).
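
A minimal sketch of the helper-thread idea, in Python rather than the procd's own language and with a simulated event source instead of the kernel feed. The class name `EventFeed`, the `read_event` callable, and the queue capacity are all hypothetical. The thread pushes events into a bounded queue; if the main loop falls behind, events are dropped and a flag records the loss so that clients can be told.

```python
import queue
import threading


class EventFeed:
    """Helper thread that drains a process-event source into a bounded queue."""

    def __init__(self, read_event, capacity=1024):
        # read_event is a hypothetical blocking callable; it returns None
        # when the feed ends. In a real procd it would read from the kernel.
        self._read_event = read_event
        self._events = queue.Queue(maxsize=capacity)
        self._overflowed = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def join(self, timeout=None):
        self._thread.join(timeout)

    def _run(self):
        while True:
            event = self._read_event()
            if event is None:
                return
            try:
                self._events.put_nowait(event)
            except queue.Full:
                # The consumer fell behind: record the loss instead of blocking.
                self._overflowed.set()

    def drain(self):
        """Return (events, lost) without blocking; lost is True if any event was dropped."""
        events = []
        while True:
            try:
                events.append(self._events.get_nowait())
            except queue.Empty:
                break
        lost = self._overflowed.is_set()
        if lost:
            self._overflowed.clear()
        return events, lost
```

The existing main loop would call `drain()` on each pass and, when `lost` is true, fall back to whatever full rescan it already does.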

So, the options as I see them are:
1) Guess a GID range.  If it blows up, point admins to page 611 of the manual, which tells them how to set things up.  This will not protect against malicious code, but it is portable and will work very well against poorly-written code.
2) Introduce a second thread into the procd.  This gives a solution which you can audit, but it is not portable.  You still don't get the perfect protection that cgroups provide.

(Note: these don't seem mutually exclusive?)
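
For illustration, option 1 (the scan Todd describes in the quoted message below) might look roughly like this Python sketch. The candidate range starts are made-up values, not defaults from HTCondor or its manual, and the function names are hypothetical. As noted in issue 1 above, the scan only sees GIDs listed in the passwd and group databases, so a range it reports as free may still be in use by other software.

```python
import grp
import pwd

RANGE_SIZE = 64  # the size floated in the thread ("size 64?")

# Hypothetical default range starts, chosen only for this example.
CANDIDATE_STARTS = (750000, 850000, 950000)


def collect_used_gids():
    """Return the set of GIDs visible through the passwd and group databases."""
    gids = {entry.pw_gid for entry in pwd.getpwall()}
    gids.update(entry.gr_gid for entry in grp.getgrall())
    return gids


def pick_unused_gid_range(used, starts=CANDIDATE_STARTS, size=RANGE_SIZE):
    """Return (low, high) for the first candidate range with no known GID, else None."""
    for low in starts:
        high = low + size - 1
        if not any(low <= gid <= high for gid in used):
            return (low, high)
    return None
```

A caller would run `pick_unused_gid_range(collect_used_gids())` at startup and, on `None`, tell the admin that no default range is free and that one must be configured by hand.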

Hope this helps.  I've got a few hours left in airports today, so I might toy around with this idea further.

Brian

On Jul 20, 2011, at 7:49 AM, Todd Tannenbaum wrote:

> Hi Brian -
> 
> It seems the procd flaw that the proposal below would address is that, on rhel5 and older systems, the out-of-the-box default configuration of the procd does not always catch all child processes of a job. Is this correct?
> 
> On rhel6+ it is no longer an issue thanks to your cgroups contribution. :)
> 
> But even on rhel5 and older, it is only an issue because many site admins don't configure the procd with a small range of unused gids, probably because it is yet one more thing to do at installation. Or the poor busy sysadmin never got around to reading page 611 of the Manual and so doesn't even know he/she probably wants to do this.
> 
> To help convince ourselves that the below is the best approach to the problem, let's consider an alternative: instead of relying on the admin to add something to the config file, the procd could automatically select a small (size 64?) unused gid range via a simple self-contained function that scans through /etc/passwd|group to make a map of all used gids. A default gid range (or set of ranges) could be documented, and this function would simply check for collisions and pick which default range to use, or let the admin know if every default range already has gids in use.
> 
> Thoughts? Does this address the same flaw as below on older systems, but in a manner portable to any unix and perhaps via a much more self-contained/small change?
> 
> Thanks
> Todd
> 
> -- Sent from my HP Veer mobile phone
> 
> On Jul 19, 2011 5:39 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote: 
> 
> Hi all, 
> 
> The procd is a pretty flawed component out-of-the-box. I would like to invest some of my "night and weekend" time and implement the techniques described here: 
> 
> http://osgtech.blogspot.com/2011/06/part-ii-keeping-mindful-eye-on-your.html 
> 
> Basically, on any Linux 2.6 kernel (including the ones with Debian and RHEL5), the procd can subscribe to a feed from the kernel containing all processes spawned on the system. If a fork bomb occurs such that the procd can't keep up with the incoming feed, we are at least able to detect this occurred and can inform the procd's clients. 
> 
> The implementation really isn't that hard - there's svn code in the blog post for using the connector API - but would require a refactoring of the existing procd to a non-DC asynchronous infrastructure. 
> 
> I believe this would be a major improvement to the procd, usable out-of-the-box today. However, I'd like some assurance that if I did the work, someone with commit privileges would be willing to review and accept the code. 
> 
> Thoughts? 
> 
> Brian 