On Mar 11, 2011, at 9:36 AM, Greg Thain wrote:

>> However, to the "outside world", these appear as normal processes. The processes inside the job can't view or contact external processes - two jobs running within the same Unix account can't discover or send signals to each other. Additionally, when "PID 1" dies, the kernel wipes out the remaining processes started by the job. It's a fairly neat trick. This all requires kernel 2.6.24 or later.
>
> Brian:
>
> This _is_ a neat trick. It seems unfortunate that /proc doesn't do the right thing automatically.

Fixing /proc is pretty simple, you just need to remount it -- it's taken care of in the patch. I've got a mostly-working patch that removes the dependency on cgroups.

> I wonder where the right place to use it is, though -- if the starter were the "init", then if it crashed, processes would get cleaned up, and it would get to reap re-parented subprocesses of the job, and thus get their rusage info.

It is hierarchical - you can do both. The reason I didn't have the starter in the separate namespace was the issue of registering with the procd - the startd would have to do this then. The rule is that a daemon cannot talk to a procd outside its PID namespace.

With respect to the rusage info - cgroups takes care of this and more. Unfortunately, cgroups are a much more feature-rich interface (read "complex and easy for my tiny brain to break") compared to a simple system call.

> It would be nice to have a wrapper program to create a new pid namespace for subchildren arguments. Then we could just put the master in its own pid namespace in the init script, something like
>
> new_pid_namespace condor_master -f ...

Doesn't the condor_master already fork at startup? If it uses DaemonCore::Create_Process, then having it in its own namespace should be simple. This would prevent misbehaving condor daemons from hanging the master, and would allow for very clean init scripts.
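For concreteness, a wrapper along the lines of the hypothetical new_pid_namespace could be sketched roughly as below. This is only a sketch under assumptions, not the patch discussed in this thread: it uses unshare(2) with CLONE_NEWPID through ctypes, which needs CAP_SYS_ADMIN and Linux 3.8 or later (clone(2) with CLONE_NEWPID is what works from 2.6.24); the function names `unshare` and `run` are made up for illustration and are not part of Condor; and it does not remount /proc, so tools reading /proc inside the namespace would still see the outer view.

```python
import ctypes
import os
import sys

CLONE_NEWPID = 0x20000000  # value from <linux/sched.h>


def unshare(flags):
    """Call unshare(2) through libc and raise OSError if it fails."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def run(argv, new_pid_namespace=True):
    """Fork, exec argv in the child, wait for it, and return its exit code.

    With new_pid_namespace=True the child is started as PID 1 of a fresh
    PID namespace (assumption: the caller has CAP_SYS_ADMIN).
    """
    if new_pid_namespace:
        # The caller stays in its old namespace; only children forked
        # from now on land in the new one, the first of them as PID 1.
        unshare(CLONE_NEWPID)
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError as exc:
            os.write(2, f"exec {argv[0]}: {exc}\n".encode())
        # Only reached if exec failed; skip Python's normal cleanup.
        os._exit(127)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: report it the way a shell would.
    return 128 + os.WTERMSIG(status)
```

Saved as a script whose entry point calls `sys.exit(run(sys.argv[1:]))` and started as root, this would give the `new_pid_namespace condor_master -f ...` form for the init script. The master would then be "PID 1" of its namespace, so the kernel would clean up whatever is left in the namespace when it exits.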
Another advantage of having the condor_master in its own namespace is that if an evil malicious hacker manages to send commands to the procd, the procd can't "harm" non-Condor OS processes. Because the procd and the startd/master would all be in the same PID namespace, there shouldn't be any issues.

Brian