HTCondor Project List Archives




[Condor-devel] A cautionary tale: libclassad.so.2, CONDOR_IDS and LD_LIBRARY_PATH



A change recently went into the master branch so that condor binaries link dynamically against libclassad.  In a related change, the classad API was updated, so the version number on libclassad.so has been bumped: the binaries now link against "libclassad.so.2" instead of "libclassad.so.1".

This change interacted with my development environment in the following way:

I habitually run condor_master as root and let it drop privs to CONDOR_IDS.  When I built the latest master, I ran into the following sequence of events: condor_master started up, dropped privs to CONDOR_IDS, and then spawned the usual daemons.  The daemons attempted to load "libclassad.so.2".  I had LD_LIBRARY_PATH set properly, but LD_LIBRARY_PATH is ignored once a process has changed privs, so the daemons fell back to searching the standard library locations.  My installed classad packages all (appropriately) provide "libclassad.so.1", the earlier API, so the daemons failed to start.  And because they were subprocesses, their link error messages went into the aether.
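The lost error message is the part that cost the most time, so here is a minimal sketch of how a wrapper could keep a child's stderr and recognize a loader failure. It assumes the glibc message format ("error while loading shared libraries: NAME: ..."); the helper names `missing_library` and `run_and_check` are made up for illustration and are not part of condor.

```python
import re
import subprocess

# glibc's dynamic loader reports a missing dependency on stderr as:
#   prog: error while loading shared libraries: NAME: cannot open shared object file: ...
LOADER_ERROR = re.compile(r"error while loading shared libraries: (?P<lib>\S+): ")


def missing_library(stderr_text):
    """Return the library named in a glibc loader error, or None if there is none."""
    match = LOADER_ERROR.search(stderr_text)
    return match.group("lib") if match else None


def run_and_check(argv, env=None):
    """Run argv, capture its stderr, and return (exit code, missing library or None)."""
    result = subprocess.run(argv, env=env, stderr=subprocess.PIPE, text=True)
    return result.returncode, missing_library(result.stderr)
```

Launching a daemon binary by hand through something like `run_and_check` (as the same user and with the same environment the master would give it) should make a failed load of "libclassad.so.2" visible instead of silent.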

Tim and I burned several hours scratching our heads about what was going on, until some calls to strace pulled out the hidden link failure messages and we deciphered the interaction between my running as root, the priv changes, LD_LIBRARY_PATH, and the difference between ".so.2" and ".so.1".
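In hindsight, a quicker check than strace might have been to ask ldd what a daemon binary resolves to when LD_LIBRARY_PATH is absent. The following is a sketch under that assumption: it relies on the usual glibc ldd output, where an unresolved dependency is printed as "NAME => not found", and the function names are hypothetical.

```python
import os
import subprocess


def parse_unresolved(ldd_output):
    """Return the sonames that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.startswith("not found"):
            missing.append(name)
    return missing


def unresolved_without_ld_library_path(binary):
    """Run ldd on binary with LD_LIBRARY_PATH removed from the environment."""
    env = {k: v for k, v in os.environ.items() if k != "LD_LIBRARY_PATH"}
    result = subprocess.run(["ldd", binary], env=env,
                            stdout=subprocess.PIPE, text=True)
    return parse_unresolved(result.stdout)
```

If `unresolved_without_ld_library_path` on one of the freshly built daemons lists "libclassad.so.2", that daemon depends on LD_LIBRARY_PATH and is a candidate for the failure described above.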

I expect that only developers are likely to hit the combination of conditions that caused this.  If you run condor with no priv-dropping, then any setting you have for LD_LIBRARY_PATH will work fine.  Or, if you have the ".so.2" versions of the classad libraries installed in a standard location so that LD_LIBRARY_PATH is not necessary, you will also be fine.  Earlier revs of condor linked to classads statically, so none of this can happen with them.
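To tell which of these cases applies on a given machine, one rough test is whether a fresh process can load the library by soname with LD_LIBRARY_PATH unset. This is only a sketch: it approximates what the daemons see, since it uses the Python interpreter's own search path settings rather than those of the condor binaries, and it assumes the interpreter itself can start without LD_LIBRARY_PATH. The function name is hypothetical.

```python
import os
import subprocess
import sys


def loads_without_ld_library_path(soname):
    """Return True if a fresh process can load soname with LD_LIBRARY_PATH unset."""
    env = {k: v for k, v in os.environ.items() if k != "LD_LIBRARY_PATH"}
    # The child raises OSError (nonzero exit) if the library cannot be loaded.
    probe = "import ctypes, sys; ctypes.CDLL(sys.argv[1])"
    result = subprocess.run([sys.executable, "-c", probe, soname], env=env,
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

A False result for "libclassad.so.2" alongside a True result for "libclassad.so.1" would match the setup I had: the old library installed in a standard location and the new one reachable only through LD_LIBRARY_PATH.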