HTCondor Project List Archives




Re: [Condor-devel] condor and a threaded library



Erik Paulson wrote:
On Tue, Jan 29, 2008 at 09:18:05AM -0600, Matthew Farrellee wrote:
Nick LeRoy wrote:
On Tue January 29 2008, Matthew Farrellee wrote:
does anyone know of a current reason why a condor daemon could not be
linked with a library that creates and manages its own threads? if so,
what is it?
Most of Condor isn't thread safe -- we've made no effort to keep the code thread safe (lots of static data, etc). If any of these threads interacts with Condor in any way, the results will likely be bad. Similar to the 2nd commandment: "2 Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end."
But what if the threads are confined to the library's code only?


There's always the worry that we're calling some non-threadsafe C library
function that the library has properly protected but Condor hasn't. That
should be less and less of a worry as time goes on.

You mean that the threaded library code calls a non-reentrant C lib function, but protects itself in some sane way. However, Condor calls the same function, except without the library's protection?
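
To make the hazard concrete, here is a minimal sketch in C using strtok(), a standard function that keeps one hidden cursor per process. The function names are hypothetical and are not taken from Condor or from any particular library. The interleaving is forced in a single thread so the result is deterministic; two threads calling strtok() at the same time could interleave the same way, and a lock held by only one of the callers would not prevent it.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts space-separated words with strtok().
 * Every call overwrites strtok()'s single hidden cursor. */
int count_words_strtok(char *s) {
    int n = 0;
    for (char *t = strtok(s, " "); t != NULL; t = strtok(NULL, " "))
        n++;
    return n;
}

/* Splits `fields` on ',' and counts the words in each field.
 * The inner helper clobbers the outer loop's cursor, so the outer
 * loop stops after the first field. Returns the fields visited. */
int count_fields_unsafe(char *fields) {
    assert(fields != NULL);
    int n = 0;
    for (char *f = strtok(fields, ","); f != NULL; f = strtok(NULL, ",")) {
        n++;
        count_words_strtok(f);
    }
    return n;
}

/* Same helper using strtok_r(): the cursor lives in the caller. */
int count_words_r(char *s) {
    char *save = NULL;
    int n = 0;
    for (char *t = strtok_r(s, " ", &save); t != NULL;
         t = strtok_r(NULL, " ", &save))
        n++;
    return n;
}

/* Same outer loop using strtok_r(): each loop owns its own cursor,
 * so all fields are visited. */
int count_fields_safe(char *fields) {
    assert(fields != NULL);
    char *save = NULL;
    int n = 0;
    for (char *f = strtok_r(fields, ",", &save); f != NULL;
         f = strtok_r(NULL, ",", &save)) {
        n++;
        count_words_r(f);
    }
    return n;
}
```

Given a writable copy of "a b,c d,e f", count_fields_unsafe() visits 1 field and count_fields_safe() visits 3. The fix that works regardless of who else is calling is to avoid the shared hidden state, not to lock around it on one side only.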


It's also usually the case that in order to do anything useful, somewhere in
that library you have to call back into the Condor code, and then all bets are
off. Non-reentrant Condor utility functions are all over the place, and
there are many places that hold state between calls to the same function.
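
As an illustration of a function that holds state between calls, here is a small C sketch. Both functions are hypothetical and are not real Condor utilities; they only show the general pattern of a static return buffer next to a variant where the caller owns the storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical utility: formats a label into a static buffer.
 * Every call returns the same pointer and overwrites the previous
 * result, so it is neither reentrant nor thread safe. */
const char *port_label_static(int port) {
    static char buf[32];
    snprintf(buf, sizeof buf, "port-%d", port);
    return buf;
}

/* Reentrant variant: the caller supplies the buffer, so no state
 * survives between calls and concurrent callers do not collide. */
const char *port_label_r(int port, char *buf, size_t len) {
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "port-%d", port);
    return buf;
}
```

Calling port_label_static(1) and then port_label_static(2) leaves both returned pointers referring to "port-2". With port_label_r() and two separate buffers, each result stays intact.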

That's not a concern because the library will not call back to Condor. Also, the library would not call exit(), which Condor redefines. Are there other functions that Condor redefines I'm forgetting? open/close are #define'd, so they are not a concern.
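
The #define point can be sketched as follows. A macro is a textual substitution applied only to code compiled after the definition is seen, so code that was compiled without it keeps calling the original function. All names here are hypothetical; a stand-in function is used instead of the real close(), and "library code" is simulated by placing it before the macro in the same file.

```c
#include <assert.h>

/* Counts how many calls went through the wrapper. */
int wrapped_calls = 0;

/* Stand-in for the underlying function (not the real close()). */
int stub_close(int fd) {
    return fd >= 0 ? 0 : -1;
}

/* Simulated library code: compiled before the macro exists, so it
 * calls stub_close() directly. */
int library_code(int fd) {
    return stub_close(fd);
}

/* Wrapper the daemon wants its own code to use. */
int wrapped_stub_close(int fd) {
    wrapped_calls++;
    return stub_close(fd);
}

/* From here on, the preprocessor rewrites stub_close(...) calls. */
#define stub_close(fd) wrapped_stub_close(fd)

/* Simulated daemon code: compiled with the macro in effect, so the
 * call below goes through the wrapper. */
int daemon_code(int fd) {
    return stub_close(fd);
}
```

After library_code(3) the counter is still 0; after daemon_code(3) it is 1. A function replaced at the symbol level behaves differently, because every caller in the process resolves to the replacement.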


We've been burned time and time again by threads on Windows, even though we
were convinced that they don't interact with Condor code, and lo, it turns
out we still crash because of race conditions. It just doesn't seem worth it.

This is because of call-backs or Condor code itself being threaded, right?


What are you thinking about using?

It's a client messaging library, architected to manage its own threads, which would never call back to Condor.


Best,


matt