HTCondor Project List Archives




Re: [Condor-devel] Official Debian package for Condor



On Wed, Dec 08, 2010 at 03:24:05PM -0600, Peter Keller wrote:
> Condor's stduniv on linux, among other things, is also a link compatible
> interface with glibc. This is because we do interposition at the library
> level during link time to catch all of the system calls for remote i/o
> and some calls for checkpointing.
> 
> Porting stduniv, even from one glibc version to another, is a very complex
> task and requires pretty deep knowledge of unix systems programming on
> linux, historical understanding of linux kernels and glibc revisions going
> back a few years at least, and arcane knowledge of various revisions
> of gcc compiler runtimes.  It can take me anywhere from 1 to 6 months
> to perform the work and validate it. That's working on it nearly every
> day too...

Thanks for the clarification -- I already felt that porting wouldn't be a
day's job.

> The problem at hand isn't the condor ifdefs, it is understanding the
> semantic and structural changes across time in glibc implementations,
> linux kernels, and gcc/g++ runtimes.

Yes, of course. I hope I did not communicate that I see the ifdefs as a
useless annoyance.

> The maintenance costs of stduniv are pretty high and we're looking to an
> adjunct solution using DMTCP. If you are willing to help out by hacking
> some code in Condor's test suite to integrate DMTCP with the stduniv
> tests, that would likely be a good use of time and I have a starting
> point for it.

I'm not sure whether I have the skills to do that, but I'll keep it in
mind while getting more familiar with the code.

> However, if you are itching to undertake porting stduniv, I can do my
> best to write up a guide on how to port stduniv to a newer version of
> glibc, or at least on what you should watch out for.

That would be very helpful for getting Condor integrated into Debian. I
know that you already support Debian etch and lenny, and will probably
support squeeze when it is released, but I'm hoping that it is possible
to ship a fully tested/integrated Condor with some future Debian release
that supports it right from the start.

In Debian's release cycle, GLIBC transitions are typically done very
early -- leaving at least a year until the actual stable release comes
out. Do you see a chance of Condor supporting a Debian GLIBC version
before the corresponding Debian stable release happens? Or, in other
words, how do you decide which GLIBC version to support at which point
in time?

Maybe you can think of contributions that would help with Debian-related
porting and integration issues? Would you be, for example, interested in
build/test reports from Debian (and maybe Ubuntu) development machines
submitted to the NMI database?

To give you some background: Over the past year there have been various
discussions in Debian regarding batch queuing systems. Debian currently
comes with Torque, SGE and some others, but somewhat lacks the manpower
to support a large variety reasonably well. We were thinking about
converging on a smaller subset, or ideally one thing that does
"everything". With its feature list and history Condor seems to me like
the ideal candidate. Now I need to evaluate whether integration of Condor
into Debian is feasible (given limited resources on either end). I also
need to evaluate whether a fully functional package is possible (with
checkpointing, etc.) or just a limited subset can be supported. In the
latter case it could still replace, let's say, SGE, but at the same time
Debian developers/users might be less likely to migrate their current
setups and eventually participate in the maintenance of a Debian package
of Condor.

Thanks again for your time,

Michael

-- 
Michael Hanke
http://mih.voxindeserto.de