Re: [HTCondor-devel] Remote IO in vanilla universe


Date: Mon, 28 Jan 2013 17:06:59 -0500
From: Douglas Thain <dthain@xxxxxx>
Subject: Re: [HTCondor-devel] Remote IO in vanilla universe
- My experience with Linux at various institutions is that FUSE is
generally present in the kernel, but it's about 50/50 as to whether
the user has permission to access it.  (This typically requires
membership in the unix group fuse, or fusermount installed with the
setuid bit.)

- We can definitely work to make closer integration between Parrot and
Condor out of the box.  However, I would be cautious about
overpromising the generality of Parrot to arbitrary applications.
When it works, it is great.  When there is a problem (e.g. a new
system call), ugly things happen.  When people have ambitious apps, we
usually must work closely with them to get things tuned up right.
(e.g. CMS)

- The Chirp protocol is stateful -- the client must remember open file
descriptors and seek pointers, and it usually buffers data for
performance.  After a failed network connection, each file must be
reopened, its seek pointer restored, and any failed writes retried.
To work with checkpointing, we would need to be able to dump and
recover that state.

- I agree that the std universe would be simpler and more portable if
it were simply a C library that connected to the starter; that would
solve many linking and C++ issues.  It would not simplify the
challenges of trapping the right library calls.  It might be worth
thinking about where the portability effort is spent.

- We can definitely add support for the Condor-specific Chirp RPCs
into the CCL client implementation; that's not a big deal.  I'm not
sure exactly what you mean by a plugin mechanism, since it seems to
make sense for Condor to retain the Chirp server implementation in the
starter.
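
To make the statefulness point above concrete, here is a minimal
Python sketch of the kind of client-side state involved: per-file
path, offset, and unacknowledged writes, with reopen-and-retry after a
dropped connection and a dump/recover pair for checkpointing.  This is
only an illustration under stated assumptions.  FakeServer and
StatefulClient are hypothetical names, the server is an in-memory
stand-in, and none of this is the actual Chirp wire protocol or the
cctools/HTCondor client code.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class OpenFile:
    path: str
    offset: int = 0                               # client-side seek pointer
    pending: list = field(default_factory=list)   # [offset, hex data] not yet acknowledged


class FakeServer:
    """Hypothetical in-memory server; handles are lost when the connection drops."""

    def __init__(self):
        self.files = {}     # path -> bytearray
        self.handles = {}   # server fd -> path
        self.next_fd = 0

    def open(self, path):
        self.files.setdefault(path, bytearray())
        fd, self.next_fd = self.next_fd, self.next_fd + 1
        self.handles[fd] = path
        return fd

    def pwrite(self, fd, data, offset):
        if fd not in self.handles:
            raise ConnectionError("stale handle")
        buf = self.files[self.handles[fd]]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data

    def drop_connections(self):
        self.handles.clear()


class StatefulClient:
    """Hypothetical client that remembers enough state to reopen and retry."""

    def __init__(self, server):
        self.server = server
        self.state = {}     # client fd -> OpenFile
        self.remote = {}    # client fd -> server fd
        self.next_fd = 0

    def open(self, path):
        fd, self.next_fd = self.next_fd, self.next_fd + 1
        self.state[fd] = OpenFile(path)
        self.remote[fd] = self.server.open(path)
        return fd

    def write(self, fd, data):
        f = self.state[fd]
        f.pending.append([f.offset, data.hex()])  # record before sending
        f.offset += len(data)
        self._flush(fd)

    def _flush(self, fd, max_retries=3):
        f = self.state[fd]
        retries = 0
        while f.pending:
            offset, hexdata = f.pending[0]
            try:
                self.server.pwrite(self.remote[fd], bytes.fromhex(hexdata), offset)
            except ConnectionError:
                if retries >= max_retries:
                    raise
                retries += 1
                self.remote[fd] = self.server.open(f.path)  # reopen, then retry
                continue
            f.pending.pop(0)
            retries = 0

    def dump(self):
        """Serialize the state a checkpoint would need to carry."""
        return json.dumps({str(fd): asdict(f) for fd, f in self.state.items()})

    @classmethod
    def recover(cls, server, blob):
        """Rebuild a client from dumped state: reopen files, replay pending writes."""
        client = cls(server)
        for fd, d in json.loads(blob).items():
            client.state[int(fd)] = OpenFile(**d)
            client.remote[int(fd)] = server.open(d["path"])
        client.next_fd = max(client.state, default=-1) + 1
        for fd in client.state:
            client._flush(fd)
        return client
```

The sketch writes at explicit offsets so that a retried write lands in
the same place after a reopen; a real client would also have to deal
with read buffers, open flags, and writes that were partially applied
before the failure.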

Doug



On Mon, Jan 28, 2013 at 3:31 PM, Erik Paulson <epaulson@xxxxxxxxxxx> wrote:
> On Mon, Jan 28, 2013 at 02:00:09PM -0600, Brian Bockelman wrote:
>>
>> On Jan 28, 2013, at 1:55 PM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
>>
>> >
>> > On 1/28/13 1:47 PM, Erik Paulson wrote:
>> >> On Mon, Jan 28, 2013 at 01:29:02PM -0600, Brian Bockelman wrote:
>> >>>> We last looked at this about 1.5 years ago -- and it worked -- but I
>> >>>> don't believe there is any regular testing of the interaction between
>> >>>> the cctools chirp and the condor chirp.  Without that, things may
>> >>>> drift apart over time.
>> >>>>
>> >> I may be out of it, but is there anything in the HTCondor chirp that's not
>> >> in the cctools chirp?
>> >
>> > Commands have been added for getting/setting job attributes.  Not sure if there's anything else.
>> >
>>
>> There's also a command to append events to the ulog.  I discovered that when trolling through the source code, and I think it has the potential to be incredibly useful.
>>
>
> Dear Doug:
>
> Please create an extensions/plugin mechanism for Chirp. Condor can dump
> their Chirp and just build a few small helpers to answer new commands that
> only make sense in a Condor context.
>
> -Erik
>
> _______________________________________________
> HTCondor-devel mailing list
> HTCondor-devel@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel