Re: [DynInst_API:] deadlock in pcEventMuxer.C


Date: Tue, 03 Dec 2013 10:03:47 -0600
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] deadlock in pcEventMuxer.C
On 12/03/2013 05:31 AM, Marc Brünink wrote:
> Hi,
>
> I have 2 threads. One calls BPatch::pollForStatusChange and the other
> one is doing other things, e.g. calling ProcControlAPI::Process::stopProc().
>
> First, I am not sure whether this is OK.

It's really, really not safe, on a couple of levels, as you have seen. The process control interfaces (both in Dyninst and in ProcControl) are intended to be accessed from a single user thread, and it is generally unsafe to use the Dyninst and ProcControl interfaces simultaneously to manipulate the same (set of) process(es). This is because the Dyninst internals are themselves a user of the ProcControl interface, with their own internal synchronization expectations, and there is no way for a user to properly coordinate with Dyninst's access to ProcControl other than by going through the BPatch-level interface.
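To make the supported pattern concrete, a minimal sketch of a mutator that drives everything from one thread through BPatch only. The BPatch calls here (processCreate, stopExecution, continueExecution, isTerminated, waitForStatusChange) are real entry points; the mutatee path and the "inspect" step are placeholders.

// Minimal sketch: one user thread, BPatch interface only.
#include "BPatch.h"
#include "BPatch_process.h"

int main() {
    BPatch bpatch;
    const char *argv[] = { "/path/to/mutatee", NULL };  // placeholder path
    BPatch_process *proc = bpatch.processCreate(argv[0], argv);
    if (!proc) return 1;

    proc->continueExecution();

    // Need the process stopped? Use the BPatch-level call, on this same
    // thread, rather than reaching around Dyninst to Process::stopProc():
    proc->stopExecution();
    /* ... inspect or instrument here ... */
    proc->continueExecution();

    // Event handling is driven from this same thread as well:
    while (!proc->isTerminated())
        bpatch.waitForStatusChange();   // or pollForStatusChange() in a poll loop
    return 0;
}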

> Second, I get a deadlock in pcEventMuxer.C.
>
> The deadlock is triggered in the following way:
>
> 1. Thread #1 calls PCEventMailbox::dequeue.
> 2. While still in PCEventMailbox::dequeue, Thread #1 calls
> Process::getData().
> 3. Execution of getData is suspended because the corresponding MTLock is
> held by Thread #2.
> 4. Thread #2 calls int_process::waitAndHandleEvents.
> 5. There is a signal pending.
> 6. Thread #2 calls PCEventMuxer::signalCallback.
> 7. On return from PCEventMuxer::signalCallback, Thread #2 calls enqueue
> via DEFAULT_RETURN.
> 8. Unfortunately, Thread #1 still holds the lock for the queue while it
> also waits for the MTLock held by Thread #2.
> 9. Deadlock.
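For what it's worth, steps 1-9 boil down to a textbook two-lock inversion. A standalone sketch of just the cycle, with queueMutex standing in for the PCEventMailbox queue lock and procMutex for ProcControl's MTLock (both names illustrative):

#include <mutex>
#include <thread>

std::mutex queueMutex;  // stands in for the PCEventMailbox queue lock
std::mutex procMutex;   // stands in for ProcControl's MTLock

void userThread1() {                               // the dequeue path
    std::lock_guard<std::mutex> q(queueMutex);     // step 1: queue lock taken
    std::lock_guard<std::mutex> p(procMutex);      // step 3: blocks, MTLock is held
}

void userThread2() {                               // the stopProc path
    std::lock_guard<std::mutex> p(procMutex);      // step 4: MTLock taken
    std::lock_guard<std::mutex> q(queueMutex);     // step 8: blocks, queue lock is held
}

int main() {
    std::thread t1(userThread1), t2(userThread2);
    t1.join();  // under the interleaving above, neither join ever returns: deadlock
    t2.join();
}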

> As a fast fix, I release queueCond before the call to Process::getData()
> in PCEventMailbox::dequeue() and acquire it again once the call to
> getData() has returned. This seems to work, even though I admit I did not
> test it extensively. Comments?
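A self-contained sketch of that fix's shape, using a std::condition_variable in place of Dyninst's CondVar; all names below are illustrative stand-ins, not the actual pcEventMuxer.C code:

// Sketch of the fast fix: drop the queue lock around any call that can
// re-enter ProcControl, then reacquire it afterward.
#include <condition_variable>
#include <mutex>
#include <queue>

struct Event {};                            // stand-in for ProcControlAPI::Event
void blockingGetData(const Event &) { /* may block on ProcControl's MTLock */ }

class Mailbox {
    std::mutex lock;                        // plays the role of queueCond's mutex
    std::condition_variable cond;
    std::queue<Event> q;
public:
    void enqueue(Event ev) {                // called from ProcControl callbacks
        { std::lock_guard<std::mutex> g(lock); q.push(ev); }
        cond.notify_one();
    }
    Event dequeue() {
        std::unique_lock<std::mutex> g(lock);
        cond.wait(g, [this] { return !q.empty(); });
        Event ev = q.front();
        q.pop();

        g.unlock();            // the fix: release the queue lock...
        blockingGetData(ev);   // ...before the call that may take the MTLock...
        g.lock();              // ...then reacquire it for any remaining work

        return ev;
    }
};

Note that once the lock is dropped, another thread may touch the queue mid-dequeue, so the mailbox's atomicity assumptions change; that is part of why this is a point fix rather than full multi-thread safety.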

This may well work for the particular deadlock you've observed, but (per the above) actually ensuring that we're properly safe for multiple user threads is a much larger redesign problem. Said redesign would a) cause significant pain--every process control rewrite does--and b) not, AFAIK, provide value to the folks writing the checks (though any of them can correct me on that).

> Marc


> Thread #1 (blocked in Process::getData() under BPatch::pollForStatusChange):
>
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
> #1  0x00007f1c9e5b3080 in _L_lock_903 () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x00007f1c9e5b2f19 in __pthread_mutex_lock (mutex=0x1e088f8) at pthread_mutex_lock.c:82
> #3  0x00007f1c89e47c5d in Mutex::lock() () from /usr/lib/libpcontrol.so.8.1
> #4  0x00007f1c89e75009 in MTManager::startWork() () from /usr/lib/libpcontrol.so.8.1
> #5  0x00007f1c89e75171 in MTLock::MTLock() [clone .constprop.1102] () from /usr/lib/libpcontrol.so.8.1
> #6  0x00007f1c89e77f4c in Dyninst::ProcControlAPI::Process::getData() const () from /usr/lib/libpcontrol.so.8.1
> #7  0x00007f1c8a829f48 in PCEventMailbox::dequeue (this=0x7f1c8aadef98, block=<optimized out>) at ../../dyninstAPI/src/pcEventMuxer.C:492
> #8  0x00007f1c8a82a500 in PCEventMuxer::dequeue (this=<optimized out>, block=<optimized out>) at ../../dyninstAPI/src/pcEventMuxer.C:427
> #9  0x00007f1c8a82a55d in PCEventMuxer::handle_internal (this=0x7f1c8aadef80, proc=<optimized out>) at ../../dyninstAPI/src/pcEventMuxer.C:118
> #10 0x00007f1c8a82a7ab in PCEventMuxer::wait_internal (this=0x7f1c8aadef80, block=<optimized out>) at ../../dyninstAPI/src/pcEventMuxer.C:88
> #11 0x00007f1c8a744fbe in BPatch::pollForStatusChange (this=<optimized out>) at ../../dyninstAPI/src/BPatch.C:1302




> Thread #2 (blocked in PCEventMailbox::enqueue under Process::stopProc()):
>
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
> #1  0x00007f1c9e5b3065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x00007f1c9e5b2eba in __pthread_mutex_lock (mutex=0x1e020d8) at pthread_mutex_lock.c:61
> #3  0x00007f1c89e47c5d in Mutex::lock() () from /usr/lib/libpcontrol.so.8.1
> #4  0x00007f1c8a82a8a9 in PCEventMailbox::enqueue (this=0x7f1c8aadef98, ev=...) at ../../dyninstAPI/src/pcEventMuxer.C:442
> #5  0x00007f1c8a82adbb in PCEventMuxer::enqueue (this=<optimized out>, ev=...) at ../../dyninstAPI/src/pcEventMuxer.C:423
> #6  0x00007f1c8a82ca51 in PCEventMuxer::signalCallback (ev=...) at ../../dyninstAPI/src/pcEventMuxer.C:315
> #7  0x00007f1c89e5ce62 in HandleCallbacks::deliverCallback(boost::shared_ptr<Dyninst::ProcControlAPI::Event>, std::set<Dyninst::ProcControlAPI::Process::cb_ret_t (*)(boost::shared_ptr<Dyninst::ProcControlAPI::Event const>), std::less<Dyninst::ProcControlAPI::Process::cb_ret_t (*)(boost::shared_ptr<Dyninst::ProcControlAPI::Event const>)>, std::allocator<Dyninst::ProcControlAPI::Process::cb_ret_t (*)(boost::shared_ptr<Dyninst::ProcControlAPI::Event const>)> > const&) () from /usr/lib/libpcontrol.so.8.1
> #8  0x00007f1c89e5e21b in HandleCallbacks::handleEvent(boost::shared_ptr<Dyninst::ProcControlAPI::Event>) () from /usr/lib/libpcontrol.so.8.1
> #9  0x00007f1c89e63887 in HandlerPool::handleEvent(boost::shared_ptr<Dyninst::ProcControlAPI::Event>) () from /usr/lib/libpcontrol.so.8.1
> #10 0x00007f1c89e928fd in int_process::waitAndHandleEvents(bool) () from /usr/lib/libpcontrol.so.8.1
> #11 0x00007f1c89ed2210 in Dyninst::ProcControlAPI::ProcessSet::stopProcs() const () from /usr/lib/libpcontrol.so.8.1
> #12 0x00007f1c89e8519b in Dyninst::ProcControlAPI::Process::stopProc() () from /usr/lib/libpcontrol.so.8.1


--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx