HTCondor Project List Archives


Re: [Condor-devel] condor and a threaded library

Date: Tue, 29 Jan 2008 11:27:06 -0600
From: Matthew Farrellee <matt@xxxxxxxxxxx>
Subject: Re: [Condor-devel] condor and a threaded library

Todd Tannenbaum wrote:

There's always the worry that we're calling some non-threadsafe C library
function that the library has properly protected but Condor hasn't. That
should be less and less of a worry as time goes on.
You mean that the threaded library code calls a non-reentrant C lib function, but protects itself in some sane way. However, Condor calls the same function, except without the library's protection?
Correct. Typical culprits include functions to get host and user information. Also, even if the library uses the "*_r()" thread safe versions of function X, it is not clear if the entire process needs to use the *_r functions, or if it is ok for condor to use the thread-unsafe functions combined with _r functions in the library.
Besides C library functions, ditto of the above for things like OpenSSL.
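
To make the "*_r()" point concrete, here is a minimal sketch of a user lookup done with the reentrant getpwuid_r() instead of getpwuid(). The helper name lookup_home_dir and the fixed buffer size are assumptions made for illustration; this is not Condor code. It shows only what the reentrant call looks like from the caller's side, and does not settle whether mixing *_r and non-*_r callers in one process is safe.

```c
#include <assert.h>
#include <pwd.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not from Condor): copy the home directory of `uid`
 * into caller-owned storage using the reentrant getpwuid_r().
 * Returns 0 on success, -1 if the uid is unknown, the lookup failed,
 * or `out` is too small. */
int lookup_home_dir(uid_t uid, char *out, size_t out_len)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    /* Assumption: 16 KiB is enough for typical entries; production code
     * would retry with a larger buffer when getpwuid_r() returns ERANGE. */
    char buf[16384];

    assert(out != NULL);

    /* Unlike getpwuid(), nothing here points into static library storage:
     * the struct and its strings live in `pwd` and `buf`. */
    if (getpwuid_r(uid, &pwd, buf, sizeof buf, &result) != 0 || result == NULL) {
        return -1;
    }
    if (strlen(pwd.pw_dir) >= out_len) {
        return -1;
    }
    strcpy(out, pwd.pw_dir);
    return 0;
}
```

A caller passes its own buffer, for example `char home[4096]; lookup_home_dir(getuid(), home, sizeof home);`, so two threads doing lookups at the same time do not share a result buffer.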
Signal handling could become a pain as well. Condor daemons use signals to communicate with each other --- combining signals and threads in one process may require work. Useful post-mortem debugging could be painful. (how well will the google core dump library continue to work, if at all? which thread context gets dumped?)


Noted.

It's also usually the case that in order to do anything useful, somewhere in that library you have to call back into the Condor code, and then all bets are
off. Non-reentrant Condor utility functions are all over the place, and
there are many places that hold state between calls to the same function.
That's not a concern because the library will not call back to Condor.
If it will not call back into Condor, why link it into the Condor daemons? Just to send outbound notifications? Depending upon what sort of notifications and their frequency, perhaps a better/simpler/safer/more modular design could be found....

Initially the concern is only for outbound notifications. However, there will be inbound notifications. For inbound the library can provide an fd for DC to select() on so that any execution of Condor code by the library would be coming from a Condor controlled thread of control. The library would not call into any Condor code from a thread it controls.
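
The fd-for-select() arrangement described above can be sketched as follows. This is a minimal illustration built on a pipe; the notify_channel type and its functions are hypothetical stand-ins, not the messaging library's API and not DaemonCore's. The library's thread only writes a byte; the event-loop thread does the select(), the read, and any Condor-side handling.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for the library's inbound notification channel. */
struct notify_channel {
    int read_fd;   /* handed to the event loop to select() on */
    int write_fd;  /* used only by the library's own thread */
};

/* Create the channel. Returns 0 on success, -1 on failure. */
int notify_channel_open(struct notify_channel *ch)
{
    int fds[2];

    assert(ch != NULL);
    if (pipe(fds) != 0) {
        return -1;
    }
    ch->read_fd = fds[0];
    ch->write_fd = fds[1];
    return 0;
}

/* Called from the library's thread: mark one inbound message as pending.
 * No Condor code runs here. Returns 0 on success, -1 on failure. */
int notify_channel_post(const struct notify_channel *ch)
{
    char token = 1;

    assert(ch != NULL);
    return write(ch->write_fd, &token, 1) == 1 ? 0 : -1;
}

/* Called from the event-loop thread: wait up to timeout_ms for a pending
 * message. Returns 1 if one was consumed, 0 on timeout, -1 on error. */
int notify_channel_poll(const struct notify_channel *ch, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    char token;
    int ready;

    assert(ch != NULL);
    FD_ZERO(&readable);
    FD_SET(ch->read_fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(ch->read_fd + 1, &readable, NULL, NULL, &tv);
    if (ready < 0) {
        return -1;
    }
    if (ready == 0) {
        return 0;
    }
    return read(ch->read_fd, &token, 1) == 1 ? 1 : -1;
}
```

In a real event loop the read end would simply be one more descriptor in the set the daemon already selects on, with the handler invoked from that same loop.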

Also, the library would not call exit(), which Condor redefines. Are there other functions that Condor redefines I'm forgetting? open/close are #define'd, so they are not a concern.
Sure, lots of them.    See the util lib.

Thanks. That's a list of #define'd and reimplemented functions. I'll see if I can wade through it and figure out which are which. I know exit() is reimplemented for sure because it is done in daemon_core.

We've been burned time and time again by threads on Windows, even though we were convinced that they don't interact with Condor code, and lo, it turns out we still crash because of race conditions. It just doesn't seem worth it.
This is because of call-backs or Condor code itself being threaded, right?
That and the items listed above and others not enumerated. Plus we don't necessarily want threading in Condor, except for very specific tasks such as overlap of CPU and I/O, where what is going on in the worker thread is small/simple to the point of nearly a single system call. We may be forced to also do OpenSSL functions in a thread pool at some point as well, but that doesn't make us happy...
See http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf, http://www.softpanorama.org/People/Ousterhout/Threads/ to understand some of our thinking. What we have thought about down the road for DaemonCore is a more hybrid thread/event model, something similar to http://research.microsoft.com/Farsite/USENIX2002.ps but simpler.
What are you thinking about using?
It's a client messaging library, architected to manage its own threads, which would never call back to Condor.
And it would be doing what?

As above, outbound notifications about state and later inbound notifications to change state.


Best,



matt

Follow-Ups:
- Re: [Condor-devel] condor and a threaded library
  - From: Ian Alderman

References:
- [Condor-devel] condor and a threaded library
  - From: Matthew Farrellee
- Re: [Condor-devel] condor and a threaded library
  - From: Nick LeRoy
- Re: [Condor-devel] condor and a threaded library
  - From: Matthew Farrellee
- Re: [Condor-devel] condor and a threaded library
  - From: Erik Paulson
- Re: [Condor-devel] condor and a threaded library
  - From: Matthew Farrellee
- Re: [Condor-devel] condor and a threaded library
  - From: Todd Tannenbaum
