Re: [DynInst_API:] A question about dynInst's static instrumentation ability

Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

Date:	Mon, 24 Aug 2015 17:42:08 -0400
From:	Shuai Wang <wangshuai901@xxxxxxxxx>
Subject:	Re: [DynInst_API:] A question about dynInst's static instrumentation ability

Thank you, Bill. I looked into the details in the paper and at the instrumented output, and I suppose this is replica-based instrumentation. Anyway, thanks a lot.

On Mon, Aug 24, 2015 at 2:22 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:

On 08/24/2015 01:02 PM, Shuai Wang wrote:

Hello Bill,

Thank you for your response! I didn't know about this mechanism before,
and I am very interested in it!
May I ask how Dyninst decides when to apply this optimization, and when
not to?

The quick and oversimplified answer: if a function is known to have unresolved control flow, which would happen as a result of an indirect branch that we can't statically parse, we would not be able to relocate the function safely, as we wouldn't be able to tell whether existing basic blocks were split by control flow we didn't know about. Otherwise, it's safe provided that we appropriately translate all PC-sensitive instructions. I really do recommend reading Drew's paper; it covers this far more precisely than I can over email.
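
[Editorial sketch] The rule described above can be modeled roughly as follows. This is a toy model in plain Python, not Dyninst code: the `Block` and `Function` classes and the `can_relocate` helper are hypothetical names invented for illustration, and the real analysis in Dyninst is considerably more involved. It assumes a function is safe to relocate only when no block ends in an indirect branch whose targets could not be resolved statically.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    """A basic block in a toy control flow graph (hypothetical model)."""
    start: int
    # True if the block ends in an indirect branch (e.g. jmp *%rax).
    ends_in_indirect_branch: bool = False
    # Statically resolved targets of that branch; None means parsing failed.
    resolved_targets: Optional[List[int]] = None


@dataclass
class Function:
    name: str
    blocks: List[Block] = field(default_factory=list)


def has_unresolved_control_flow(func: Function) -> bool:
    """Return True if any indirect branch in func has unknown targets."""
    return any(
        b.ends_in_indirect_branch and b.resolved_targets is None
        for b in func.blocks
    )


def can_relocate(func: Function) -> bool:
    """Oversimplified rule: relocate only when all control flow is known."""
    return not has_unresolved_control_flow(func)


# A function whose indirect jump (say, a jump table) was parsed successfully.
parsed = Function("switch_ok", [
    Block(0x400000),
    Block(0x400010, ends_in_indirect_branch=True,
          resolved_targets=[0x400020, 0x400030]),
])

# A function with an indirect jump that could not be analyzed.
unparsed = Function("computed_goto", [
    Block(0x400100),
    Block(0x400110, ends_in_indirect_branch=True),
])

print(can_relocate(parsed))
print(can_relocate(unparsed))
```

The second condition Bill mentions, translating PC-sensitive instructions, is not modeled here; this sketch covers only the "do we know all the control flow" check.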

Can I turn this mechanism on or off through configuration?

No, you'd have to work with the Dyninst internals. Once we determined that block-level relocation was horrifically expensive due to instruction cache misses, we rewrote our relocation system to be purely function-oriented.

You might be able to recover an older relocation system from an older version of Dyninst, but I wouldn't recommend that for anything other than satisfying personal curiosity--the older versions are not robust against changes in compilers etc. that have happened since they were released.

Sorry if I am troubling you too much. Looking forward to your response!

Sincerely,
Shuai

On Mon, Aug 24, 2015 at 12:38 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:

  On 08/23/2015 12:22 AM, Shuai Wang wrote:

    Hello Xiaozhu,

    Thank you a lot for your response. I double-checked the gdb output,
    and I suppose only one piece of instrumentation code is indeed
    executed.

    In particular, even though the basic blocks are instrumented like
    this (please see the jmpq instructions):

    http://i.stack.imgur.com/Zl0ar.png

    in the gdb session only one "addq" instruction is actually
    inserted:

    http://i.stack.imgur.com/NHx7F.png

    Am I missing anything?

  You may want to take a look at the code coverage example, available
  here:

  http://www.paradyn.org/html/tools/codecoverage.html

  It's doing both function-level and block-level code coverage.

    BTW: How can you put all the instrumentation code and the original
    code together in one section? IMHO, as you don't have the relocation
    information in the disassembled output, you cannot directly "inline"
    instrumentation code into the original code. Could you please
    elaborate a little bit?

  This topic is covered at length in Drew Bernat's Anywhere, Anytime
  Binary Instrumentation paper:

  ftp://ftp.cs.wisc.edu/paradyn/papers/Bernat11AWAT.pdf

  The short version: if we parse the binary sufficiently accurately,
  and we are careful of what we know and what we don't know, we can
  relocate most code safely without compiler-level relocation
  information, and we can tell what's not safe to relocate. It's not
  easy, but it's not impossible either.
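
[Editorial sketch] One small piece of "relocating without relocation information" is recomputing PC-relative displacements when an instruction is copied to a new address. The arithmetic for an x86-64 `jmp rel32` can be sketched as below. The addresses are hypothetical values chosen for illustration, the helper names are invented for this sketch, and it assumes the branch target itself stays where it was; a real relocator would also have to account for targets that have moved.

```python
import struct

JMP_REL32_LEN = 5  # x86-64 "jmp rel32": opcode 0xE9 plus a 4-byte displacement


def branch_target(insn_addr: int, disp: int, insn_len: int = JMP_REL32_LEN) -> int:
    """Absolute target of a PC-relative branch: next-instruction address + disp."""
    return insn_addr + insn_len + disp


def relocate_disp(old_addr: int, old_disp: int, new_addr: int,
                  insn_len: int = JMP_REL32_LEN) -> int:
    """Displacement that keeps the same absolute target after moving the branch."""
    target = branch_target(old_addr, old_disp, insn_len)
    new_disp = target - (new_addr + insn_len)
    if not -2**31 <= new_disp < 2**31:
        raise ValueError("target out of rel32 range; a longer sequence is needed")
    return new_disp


def encode_jmp_rel32(disp: int) -> bytes:
    """Encode jmp rel32 as 0xE9 followed by a little-endian signed 32-bit value."""
    return b"\xe9" + struct.pack("<i", disp)


# Hypothetical addresses, chosen only for illustration.
old_addr, old_disp = 0x400500, 0x20   # original branch and its displacement
new_addr = 0x700280                   # where the copy of the branch is placed

new_disp = relocate_disp(old_addr, old_disp, new_addr)
print(hex(branch_target(old_addr, old_disp)))
print(hex(branch_target(new_addr, new_disp)))
print(encode_jmp_rel32(new_disp).hex())
```

Both printed targets are the same address, which is the property a relocated PC-relative instruction has to preserve.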

    Thank you a lot for your response.

    Sincerely,
    Shuai

    On Sun, Aug 23, 2015 at 1:05 AM, Xiaozhu Meng <mxz297@xxxxxxxxx> wrote:

       Hi Shuai,

       Since you instrumented every basic block of a function, Dyninst
       would relocate the whole original function to another section.
       The relocated function would contain both the original code and
       the instrumentation code. Therefore, executing all the
       instructions at the patched sections would actually execute both
       your instrumentation and the original code. One reason to not
       jump back immediately after instrumentation is that executing two
       extra jumps for each basic block would significantly slow down
       the execution.
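
[Editorial sketch] The cost argument above can be put in rough numbers. The following toy count is not a measurement of Dyninst: the function names and the workload figures are hypothetical, and it assumes that per-block patching costs one jump out and one jump back for every executed block, while whole-function relocation costs a single jump each time the original entry point is reached.

```python
def extra_jumps_block_trampolines(blocks_executed: int) -> int:
    """Per-block patching: each executed block jumps out to its patch and back."""
    return 2 * blocks_executed


def extra_jumps_function_relocation(function_entries: int) -> int:
    """Whole-function relocation: one jump from the original entry to the copy."""
    return function_entries


# Hypothetical workload: the function is entered once and a loop inside it
# executes 1000 basic blocks in total.
blocks_executed = 1000
function_entries = 1

print(extra_jumps_block_trampolines(blocks_executed))
print(extra_jumps_function_relocation(function_entries))
```

Under these assumptions the gap grows with the number of blocks executed, which matches the behavior observed in gdb: once control enters the relocated copy, it stays there.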

       Thanks

       --Xiaozhu

       On Sat, Aug 22, 2015 at 10:37 PM, Shuai Wang <wangshuai901@xxxxxxxxx> wrote:
       > Dear list,
       >
       > I basically want to instrument an ELF binary, adding some
       > instrumentation code to the beginning of every basic block. I use
       > DynInst version 8.2.1 on a 64-bit Linux platform. I am
       > instrumenting some unstripped binaries now, but I want to move on
       > to stripped binaries later.
       >
       > I found a very confusing situation in the instrumented output.
       > Could anyone educate me on that? Sorry if it is really a stupid
       > question. Let me elaborate on it here:
       >
       > 1. I insert one instruction at the beginning of every basic block.
       >
       > 2. After instrumentation, I use objdump to check the output, and I
       > am assured that each basic block's beginning instruction(s) have
       > been substituted with a "jmp" instruction to the patched section,
       > something like this:
       >       jmpq   700280 <main_dyninst>
       >
       > 3. I use gdb to follow the execution flow of the instrumented
       > output, and I observed that when execution hits the first jmpq
       > instruction (at the beginning of the main function, actually), it
       > is redirected to the patched section.
       >
       > 4. I observed the execution in the patched section, which includes
       > both the instrumentation code and the replaced instructions from
       > the instrumentation point of the original binary. However, to my
       > surprise, the execution flow isn't redirected back to the original
       > code section; it just executes all the instructions in the patched
       > section. As a result, even though I instrumented every basic
       > block, only the instrumentation code at the first basic block was
       > actually executed at runtime.
       >
       > I supposed that for static instrumentation, after the
       > instrumentation code and the replaced instructions in the patched
       > section have executed, the execution flow would be redirected back
       > by a jmp instruction to the original code section. Am I missing
       > anything here? Or do I have to configure some options in my code
       > for this type of functionality?
       >
       > Sorry for my disorganized description. Am I clear? If so, could
       > anyone give me some help? I would really appreciate it!
       >
       > Sincerely,
       > Shuai
       >
       > _______________________________________________
       > Dyninst-api mailing list
       > Dyninst-api@xxxxxxxxxxx
       > https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api
       >


  --
  --bw

  Bill Williams
  Paradyn Project
  bill@xxxxxxxxxxx

--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx
