Re: [HTCondor-devel] Remote IO in vanilla universe


Date: Mon, 28 Jan 2013 08:29:59 -0500
From: Douglas Thain <dthain@xxxxxx>
Subject: Re: [HTCondor-devel] Remote IO in vanilla universe

Brian -

You might check out the existing chirp_fuse module, which should
interoperate with both the Condor Chirp I/O proxy and the standalone
Chirp server. When used with the latter, you also get proper errnos,
timeouts, and transparent failure recovery.

http://www.cse.nd.edu/~ccl/software/manuals/man/chirp_fuse.html

Cheers,
Doug


On Sun, Jan 27, 2013 at 9:46 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
> Hi all,
>
> Figured out a relatively simple way of providing remote IO in the vanilla
> universe and am looking for someone willing to give it a spin.  It's a
> surprisingly small amount of code - the heavy lifting is done by chirp.
> Mostly, the new code just glues together pre-existing components.
>
> See the design document:
>
> https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=3465
>
> In short:  I created a FUSE filesystem that translates filesystem calls to
> chirp IO (which does remote IO with the submit host).  I use the
> filesystem namespaces feature to make this filesystem only appear to the job
> (and be automatically unmounted at the job's end).  This way, the job sees
> the filesystem of the submit host (either as / or as /condor/submitter,
> depending on the job's requested options).  The technique appears to work
> well, but I haven't tried pushing it too hard.
>
> I'm not quite sure where Chirp breaks, but I did notice that it has no error
> codes implemented (calls return either 0 or -1, with no errno).  Hence, any
> IO error is converted to EIO.  That will likely be problematic for some
> applications.
> Chirp also has no timeouts or error recovery; the filesystem will likely die
> if the shadow restarts.
>
> Enjoy!
>
> Brian
>
> _______________________________________________
> HTCondor-devel mailing list
> HTCondor-devel@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel
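
For anyone following the errno point in the quoted message, here is a rough sketch of the difference between a remote call that reports only 0 or -1 and one that also carries an errno. It is not taken from the FUSE module in the ticket or from chirp_fuse; the function name status_to_fuse_result and the remote_errno parameter are made up for illustration. The only convention assumed is that a FUSE handler returns a non-negative value on success and a negated errno on failure.

```python
import errno
from typing import Optional


def status_to_fuse_result(status: int, remote_errno: Optional[int] = None) -> int:
    """Translate a remote call's status into a FUSE-style return value.

    status       -- non-negative on success, -1 on failure (hypothetical)
    remote_errno -- errno reported by the remote side, or None if the
                    protocol carries no error code (hypothetical)
    """
    if status >= 0:
        # Success: pass the value (0 or a byte count) straight through.
        return status
    if remote_errno is None:
        # Only "-1" came back, so the cause is unknown: fall back to EIO.
        return -errno.EIO
    # A real error code is available: hand it to the application.
    return -remote_errno


if __name__ == "__main__":
    # Without an errno, a missing file is indistinguishable from any other failure.
    print(errno.errorcode[-status_to_fuse_result(-1)])
    # With an errno, the application can tell the two cases apart.
    print(errno.errorcode[-status_to_fuse_result(-1, errno.ENOENT)])
```

With only the 0 or -1 status, an application that checks for ENOENT before creating a file, or retries on EAGAIN, sees EIO in every case, which is the kind of breakage mentioned above.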