[HTCondor-devel] more isolation - zerovm


Date: Mon, 6 May 2013 14:43:32 -0500
From: Erik Paulson <epaulson@xxxxxxxxxxxx>
Subject: [HTCondor-devel] more isolation - zerovm
This is kind of neat:


ZeroVM is based on Google's 'Native Client' isolation layer for sandboxing a process, and is kind of like the standard universe:

"ZeroVM abstraction is C99 compliant environment with certain parts of POSIX syscall API implemented.  ZeroVM doesn't expose any non C99 or non POSIX API. All ZeroVM magic is handled transparently to the application. In best POSIX/UNIX traditions all IO to and from ZeroVM is modeled as files. Input data is presented to application as STDIN, log as STDERR and output as STDOUT. Communication channels with peer ZeroVM instances are also presented as files. The rest of visible file-system is all transient and memory-backed in current implementation. Standard C99 library and major part of POSIX is available, however, there are some behavioral deviations from what would be expected as "normal" implementation. For example, since ZeroVM is deterministic, time functions always return zero. We assume it is within C99 standard. It could be interpreted by the application as if it is running on infinitely fast computer. Threading is cooperative (handled automatically) and deterministic, hence all thread synchronization primitives are just no-ops. Developing for ZeroVM requires using the provided cross-compilation GNU toolchain."
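To make the quoted model a bit more concrete, here is a minimal sketch in plain Python. It is not ZeroVM code and uses no ZeroVM API; real ZeroVM programs would be built with the provided cross-compilation toolchain. It only mimics two properties from the description: all I/O goes through file-like channels (input, output, log), and the clock always reads zero. The names `zero_clock`, `run_job`, and `run_once` are made up for illustration.

```python
import io


def zero_clock():
    # Hypothetical stand-in for "time functions always return zero".
    return 0


def run_job(stdin, stdout, stderr, clock=zero_clock):
    """Count words from the input channel and write totals to the output channel."""
    started = clock()
    counts = {}
    for line in stdin:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    # Emit in sorted order so the output depends only on the input.
    for word in sorted(counts):
        stdout.write(f"{word} {counts[word]}\n")
    # With a clock that always returns zero, the job sees no elapsed time.
    elapsed = clock() - started
    stderr.write(f"elapsed={elapsed}\n")


def run_once(text):
    """Run the job against in-memory channels and return (output, log)."""
    out, err = io.StringIO(), io.StringIO()
    run_job(io.StringIO(text), out, err)
    return out.getvalue(), err.getvalue()
```

Because the job sees nothing but its input channel and a constant clock, two runs over the same input produce byte-identical output and log, which is the property that would make re-running a job elsewhere safe.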

The neat thing is that it takes virtually no time to spin up "a new VM": on the order of a few milliseconds. There's also some notion of interprocess communication, all backed by a ZeroMQ message queue.

MSR has something similar, but I don't think it's ever shipped.

It might be worth playing around with. It's probably not good enough to run all jobs. I'll bet some of the more complex workflows and jobs touch corner cases that aren't handled well, or need too much from their environment. But if you can get some of the more common jobs running under it (I'm thinking of things like some of the R jobs), you may be able to migrate and reschedule some of the load in the HTCondor pool and increase goodput.

-Erik
 