Re: [DynInst_API:] the cause of uninstrumentable functions


Date: Thu, 12 Feb 2015 15:15:43 -0600
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] the cause of uninstrumentable functions
On 02/12/2015 12:05 PM, Victor van der Veen wrote:
I'm using the DyninstAPI to instrument a number of libc functions on my
x86-64 machine. This is working successfully most of the time, but some
functions are unfortunately rendered not-instrumentable by Dyninst
(isInstrumentable() returns false). I also noticed that calling
BPatch_function->findPoint(BPatch_exit) for these functions does not
return much either. I now assume that Dyninst is having problems
generating a CFG for these functions.

Yes, generally uninstrumentable means we believe that a function has
unresolved indirect control flow. This may be because there's a jump
table/indirect tail call/whatever that we're not resolving that's
properly part of the function, or it may be because we've incorrectly
decided that the function shares code with something else that has
unresolvable indirect control flow.
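
For readers of the archive, the two causes described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyCfg, its members, and the function names "plain", "jumpy" and "victim" are hypothetical and are not part of the Dyninst API or of Dyninst's parser. The model marks a function uninstrumentable as soon as any block attributed to it ends in an unresolved indirect branch, which is enough to show how a wrongly shared block can affect an otherwise clean function.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model only: all names here are hypothetical, not Dyninst types.
struct ToyCfg {
    // block id -> true if the block ends in an indirect branch
    // whose targets the parser could not resolve
    std::map<int, bool> unresolvedIndirect;
    // function name -> ids of the blocks the parser attributed to it
    std::map<std::string, std::set<int>> functionBlocks;

    // A function is instrumentable only if none of its blocks
    // contains unresolved indirect control flow.
    bool instrumentable(const std::string& fn) const {
        auto it = functionBlocks.find(fn);
        if (it == functionBlocks.end()) return false;
        for (int id : it->second) {
            auto b = unresolvedIndirect.find(id);
            if (b != unresolvedIndirect.end() && b->second) return false;
        }
        return true;
    }
};

// Builds a small example with both failure categories.
ToyCfg makeExample() {
    ToyCfg cfg;
    cfg.unresolvedIndirect = {{1, false}, {2, false}, {3, true}, {4, false}};
    cfg.functionBlocks["plain"]  = {1, 2};  // no indirect flow at all
    cfg.functionBlocks["jumpy"]  = {3};     // failure to resolve
    cfg.functionBlocks["victim"] = {4, 3};  // block 3 wrongly shared
    return cfg;
}

// Walks through the example and checks each case.
void checkExample() {
    ToyCfg cfg = makeExample();
    assert(cfg.instrumentable("plain"));
    assert(!cfg.instrumentable("jumpy"));
    assert(!cfg.instrumentable("victim"));
    // Undoing the false sharing makes "victim" instrumentable again.
    cfg.functionBlocks["victim"].erase(3);
    assert(cfg.instrumentable("victim"));
}
```

In this model, fixing the false-sharing case only needs a better block-to-function attribution, while the "jumpy" case needs the branch itself to be resolved.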

It could be the latter actually...

Appears not to be in at least the __strcmp_ssse3 case, for what that's worth; I've had a little bit of time to investigate this. There's just an indirect branch that we don't presently resolve.

Better resolution of these branches is an ongoing research topic in the group; I'll confirm whether the improvements we're working on deal with these functions.

I'm wondering if anyone has seen similar issues and can perhaps explain
why this is happening. Does Dyninst indeed have problems rendering a
CFG? Example libc functions that are causing issues are:

free (__cfree)
strncmp (__GI___strncmp_ssse3)
memcmp (__memcmp_sse4_1)
strcmp (__strcmp_ssse3)

As you can see, these are mostly optimized implementations. When looking
at the disassembly of libc (using objdump), I wonder whether Dyninst
perhaps has problems interpreting certain nop code. For example:

     83a75:   66 66 2e 0f 1f 84 00    data32 nopw %cs:0x0(%rax,%rax,1)
     83a7c:   00 00 00 00

Could it be the case that Dyninst is interpreting this as data, which
then causes a faulty CFG to be generated?

Pretty sure it's not the nops being interpreted as data, though I could
have sworn that a double-66 prefix is illegal...
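
For reference, the eleven bytes in the objdump listing above can be split into fields by hand. The sketch below is a minimal, hypothetical helper (decodeLongNop and NopLayout are names invented here, not Dyninst or objdump code). It only understands 0x66 and segment-override prefixes followed by the 0F 1F /0 long-NOP opcode with a ModRM operand, and it makes no claim about how any Dyninst version treats repeated prefixes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Field summary for one long-NOP encoding (illustrative only).
struct NopLayout {
    std::size_t operandSizePrefixes = 0;  // number of 0x66 bytes
    std::size_t segmentPrefixes = 0;      // number of segment overrides
    bool isLongNop = false;               // true for opcode 0F 1F /0
    std::size_t length = 0;               // total instruction length
};

static bool isSegmentPrefix(std::uint8_t b) {
    return b == 0x26 || b == 0x2e || b == 0x36 ||
           b == 0x3e || b == 0x64 || b == 0x65;
}

// Hypothetical helper: returns isLongNop == false for anything it
// does not recognize or for truncated input.
NopLayout decodeLongNop(const std::vector<std::uint8_t>& b) {
    NopLayout out;
    std::size_t i = 0;
    // Consume the legacy prefixes this sketch knows about.
    while (i < b.size()) {
        if (b[i] == 0x66) { ++out.operandSizePrefixes; ++i; }
        else if (isSegmentPrefix(b[i])) { ++out.segmentPrefixes; ++i; }
        else break;
    }
    // Need two opcode bytes plus a ModRM byte.
    if (i + 2 >= b.size() || b[i] != 0x0f || b[i + 1] != 0x1f) return out;
    i += 2;
    const std::uint8_t modrm = b[i++];
    const unsigned mod = modrm >> 6, reg = (modrm >> 3) & 7, rm = modrm & 7;
    if (reg != 0) return out;  // only the /0 form is handled here
    std::size_t disp = 0;
    if (mod == 1) disp = 1;
    else if (mod == 2) disp = 4;
    else if (mod == 0 && rm == 5) disp = 4;  // RIP-relative disp32
    if (mod != 3 && rm == 4) {               // a SIB byte follows
        if (i >= b.size()) return out;
        const std::uint8_t sib = b[i++];
        if (mod == 0 && (sib & 7) == 5) disp = 4;
    }
    if (i + disp > b.size()) return out;
    i += disp;
    assert(i <= b.size());
    out.isLongNop = true;
    out.length = i;
    return out;
}
```

Fed the bytes 66 66 2e 0f 1f 84 00 00 00 00 00, this reports two 0x66 prefixes, one segment override (0x2e, the %cs: in the listing), and a total length of 11 bytes. That matches objdump treating both listing lines as a single instruction; the same padding pattern is common in compiler-generated x86-64 code.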

I'll have a look.

I will have a more detailed look later to see if there are other ways
for me to still instrument these functions, but I figured that someone
here perhaps has an answer already.

I'm running Ubuntu 14.04 and Dyninst 8.1.2.

If memory serves, quite a few of these should be fixed in Dyninst 8.2.
There were some bugs that interfered with our ability to cleanly parse
quite a few optimized libc constructs in 8.1.2, both in the
false-sharing category and in the failure-to-resolve category.

There may still be some uninstrumentable functions in libc in Dyninst
8.2 but there should be considerably fewer of them. Let us know if you
find important ones.

Ah sorry, that was a typo. I'm at Dyninst 8.2.1 - not 8.1.2. So I guess
the above are quite important?

Well, it's important to be able to instrument everything that users care about, provided that a) it's possible at all and b) it provides the results they expect/want. I don't think there's anything tricky about str(n)cmp/memcmp/free that would make them not worth worrying about.

There will always be some percentage of functions that we mark uninstrumentable until/unless we change some of our fundamental abstractions/assumptions (e.g. PLT stubs, which are always functions and always uninstrumentable). But ultimately we do want to be able to instrument all "real" functions safely.

Cheers,
Victor


--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx