On 05/23/2013 11:33 PM, Marc Brünink wrote:
Hi Bill,
thanks for your answer. I appreciate it very much.
I was confused because I expected that once an edge is marked interprocedural, all following edges are interprocedural as well. So I was puzzled to see the edge from <+7> to <+9> classified as interprocedural and the basic block at <+9> skipped, while the basic block at <+73> is still included in the function, even though it is reachable via <+9>, which had just been declared not to be part of the function.
But I forgot that basic blocks might be shared.
So Dyninst creates the following two functions (I suppose; not verified):
F1 F2
x  -  0x00007ffff7b01890 <+0>: cmpl $0x0,0x2d793d(%rip)
x  -  0x00007ffff7b01897 <+7>: jne 0x7ffff7b018a9 <read+25>
-  x  0x00007ffff7b01899 <+9>: mov $0x0,%eax
-  x  0x00007ffff7b0189e <+14>: syscall
-  x  0x00007ffff7b018a0 <+16>: cmp $0xfffffffffffff001,%rax
-  x  0x00007ffff7b018a6 <+22>: jae 0x7ffff7b018d9 <read+73>
-  x  0x00007ffff7b018a8 <+24>: retq
x  -  0x00007ffff7b018a9 <+25>: sub $0x8,%rsp
x  -  0x00007ffff7b018ad <+29>: callq 0x7ffff7b1c9f0
x  -  0x00007ffff7b018b2 <+34>: mov %rax,(%rsp)
x  -  0x00007ffff7b018b6 <+38>: mov $0x0,%eax
x  -  0x00007ffff7b018bb <+43>: syscall
x  -  0x00007ffff7b018bd <+45>: mov (%rsp),%rdi
x  -  0x00007ffff7b018c1 <+49>: mov %rax,%rdx
x  -  0x00007ffff7b018c4 <+52>: callq 0x7ffff7b1ca50
x  -  0x00007ffff7b018c9 <+57>: mov %rdx,%rax
x  -  0x00007ffff7b018cc <+60>: add $0x8,%rsp
x  -  0x00007ffff7b018d0 <+64>: cmp $0xfffffffffffff001,%rax
x  -  0x00007ffff7b018d6 <+70>: jae 0x7ffff7b018d9 <read+73>
x  x  0x00007ffff7b018d8 <+72>: retq
x  x  0x00007ffff7b018d9 <+73>: mov 0x2d1540(%rip),%rcx
x  x  0x00007ffff7b018e0 <+80>: xor %edx,%edx
x  x  0x00007ffff7b018e2 <+82>: sub %rax,%rdx
x  x  0x00007ffff7b018e5 <+85>: mov %edx,%fs:(%rcx)
x  x  0x00007ffff7b018e8 <+88>: or $0xffffffffffffffff,%rax
x  x  0x00007ffff7b018ec <+92>: jmp 0x7ffff7b018d8 <read+72>
But if Dyninst shares basic blocks, I fail to see why the block at <+9> cannot be shared as well, unless "having a single entry point" means that an entry basic block cannot be shared. Is there a technical reason why the entry basic block cannot be shared with another function? Or is it just that Dyninst first declares <+9> to be the entry point of a function and then fails to realise that it is actually a shared block?
That's precisely it; blocks can be shared but entry blocks cannot be
shared. I believe the below is a complete list of how we classify things
in parsing, though I may be missing a corner case or two.
* The entry point of the binary is a function entry point
* Anything with a function symbol pointing to it is a function entry point
* Anything reached by a call instruction that is *not* a getpc call of
some form is a function entry point; getpc calls are elided (as we need
to modify them when we move code)
* Any edge targeting a function entry point is interprocedural
* Any return edge is interprocedural
* Any edge that we believe is a tail call based on stack heuristics is
interprocedural
* A function, then, becomes the set of blocks dominated by an entry
block and reachable without using interprocedural edges
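As an illustration of the rules above, here is a minimal sketch in plain C++ (it does not use the Dyninst API). It models only one of the rules, namely that any edge targeting a function entry point is interprocedural, and then collects the blocks reachable from an entry without following such edges. Return edges have no successor in this model, and tail-call heuristics are not modeled at all. The names functionBlocks and readCfg are hypothetical, and the CFG is transcribed by hand from the GDB listing in this thread, with blocks identified by their offset from <read+0>.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <vector>

using Offset = int;                                 // block start, relative to <read+0>
using Cfg = std::map<Offset, std::vector<Offset>>;  // block -> successor blocks

// Blocks reachable from `entry` without following any edge whose target is a
// function entry point (such edges are treated as interprocedural).
std::set<Offset> functionBlocks(const Cfg& cfg, const std::set<Offset>& entries,
                                Offset entry) {
    assert(entries.count(entry) == 1);  // we only start from known entry points
    std::set<Offset> seen{entry};
    std::queue<Offset> work;
    work.push(entry);
    while (!work.empty()) {
        Offset block = work.front();
        work.pop();
        auto it = cfg.find(block);
        if (it == cfg.end()) continue;  // return block: no successors in this model
        for (Offset succ : it->second) {
            if (entries.count(succ)) continue;  // interprocedural edge, do not follow
            if (seen.insert(succ).second) work.push(succ);
        }
    }
    return seen;
}

// Hand-transcribed CFG of read(): taken-branch target first, then fallthrough.
// Calls are modeled only by their fallthrough edge.
Cfg readCfg() {
    return {
        {0, {25, 9}},    // jne <+25>, else fall through to <+9>
        {9, {73, 24}},   // jae <+73>, else fall through to retq at <+24>
        {25, {34}},      // callq, then continue at <+34>
        {34, {57}},      // callq, then continue at <+57>
        {57, {73, 72}},  // jae <+73>, else fall through to retq at <+72>
        {73, {72}},      // jmp <+72>
    };
}
```

With entry points at offsets 0 (the read symbol) and 9 (the call target found in another function), starting from 0 yields the six blocks at offsets 0, 25, 34, 57, 72 and 73, and starting from 9 yields 9, 24, 72 and 73. The blocks at 72 and 73 end up in both sets, while the block at 9 belongs only to the function it is the entry of.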
--bw
(I cc'ed the list again, because I think this might be worth archiving; compared to the previous msg which just contained a large tar)
Marc
On May 24, 2013, at 12:02 AM, Bill Williams wrote:
On 05/21/2013 08:38 PM, Marc Brünink wrote:
Output attached. If you need anything else, just let me know.
BTW: setting DYNINST_DEBUG_PARSING=1 leads to a bus error in the mutatee.
#0 0x00007f9594e94ed9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f9593b0cc10 in t_kill (pid=7054, sig=7) at ../src/RTlinux.c:94
#2 0x00007f9593b0d0e3 in DYNINSTbreakPoint () at ../src/RTlinux.c:116
#3 0x00007f9593b0e92b in DYNINST_instExitEntry (arg1=0x0) at ../src/RTcommon.c:399
#4 0x00007f9593da48b8 in DYNINSTstaticHeap_16M_anyHeap_1 () from /usr/lib/libdyninstAPI_RT.so
#5 0x00007f9594db59f8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x0000000000000000 in ?? ()
Okay, I can explain this at least in part, and the parsing is not a bug
but it's not intuitive either.
We found a call from another function targeting 7f17226f7899 (the zero
eax/syscall block). That call causes us to treat that block and the
following return block as its own micro-function (since it's reached by
a call instruction), and all edges from read to that function as
interprocedural. This is a direct consequence of our "functions have
single entry points" abstraction, which has very nice properties for
both analysis and instrumentation, but it can produce confusing results
(as you see here).
If you're just using Dyninst for binary analysis, you may want to open
your binaries in rewriting mode (openBinary rather than attachProcess).
If you're going to work with a running process, in order to exit
cleanly, you'll want something like the following after you're done with
analysis:
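    // keep servicing the mutatee until it exits
    do {
        process->continueExecution();
        bpatch->waitForStatusChange();
    } while (!process->isTerminated());
to continue the process with Dyninst still attached to it, or
    // detach; passing true lets the mutatee continue running
    process->detach(true);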
to detach and let it exit cleanly. Otherwise, the mutator won't be
present to handle various bits of instrumentation that we insert into
the mutatee by default (e.g. for exit callbacks) and the mutatee can
crash (as some of that instrumentation includes traps). If you still see
mutatee crashes under DYNINST_DEBUG_PARSING when you're cleaning up
properly, let me know and I'll see if I can get a fix under the wire for
8.1.2.
--bw
Marc
On 21/05/2013 23:19, Bill Williams wrote:
Marc--
That looks like a bug to me. Can you set the environment variable
DYNINST_DEBUG_PARSING to 1, run your test, and send me the output that
produces?
Thanks.
--bw
Bill Williams
Paradyn Project
bill@xxxxxxxxxxx
On 05/21/2013 07:12 AM, Marc Brünink wrote:
Hi,
I just started using Dyninst and have a small question regarding basic
blocks.
I have a micro test program that opens a file and reads some data from
it. I am having issues with the basic blocks of the read function.
Basically I'm missing 2 basic blocks.
Using function.getCFG()->getAllBasicBlocks(bbs) I get the following
basic blocks:
Basic Block (7f17226f7890 to 7f17226f7899) (entry: 1) (exit: 0):
7f17226f7890 cmp [RIP + 2d793d], 0
7f17226f7897 jnz 10 + RIP + 2
Basic Block (7f17226f78a9 to 7f17226f78b2) (entry: 0) (exit: 0):
7f17226f78a9 sub RSP, 8
7f17226f78ad call 1b13e + RIP + 5
Basic Block (7f17226f78b2 to 7f17226f78c9) (entry: 0) (exit: 0):
7f17226f78b2 mov [ESP], RAX
7f17226f78b6 mov RAX, 0
7f17226f78bb syscall RCX
7f17226f78bd mov RDI, [ESP]
7f17226f78c1 mov RDX, RAX
7f17226f78c4 call 1b187 + RIP + 5
Basic Block (7f17226f78c9 to 7f17226f78d8) (entry: 0) (exit: 0):
7f17226f78c9 mov RAX, RDX
7f17226f78cc add RSP, 8
7f17226f78d0 cmp RAX, fffff001
7f17226f78d6 jnb/jae/j 1 + RIP + 2
Basic Block (7f17226f78d8 to 7f17226f78d9) (entry: 0) (exit: 1):
7f17226f78d8 ret near [RSP]
Basic Block (7f17226f78d9 to 7f17226f78ee) (entry: 0) (exit: 0):
7f17226f78d9 mov RCX, [RIP + 2d1540]
7f17226f78e0 xor RDX, RDX
7f17226f78e2 sub RDX, RAX
7f17226f78e5 mov [RCX], RDX
7f17226f78e8 or RAX, ff
7f17226f78ec jmp ffffffffffffffea + RIP + 2
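A note on reading the branch operands in the listing above: they appear to be printed as displacement + RIP + instruction length, with RIP standing for the address of the branch instruction itself. This is an interpretation inferred from the numbers in this thread, not something taken from the Dyninst documentation. A small sketch of the arithmetic, using a hypothetical helper named branchTarget:

```cpp
#include <cassert>
#include <cstdint>

// Target of a relative branch printed as "<disp> + RIP + <len>": the address of
// the next instruction (insnAddr + insnLen) plus the displacement. The
// displacement is a two's-complement value, so unsigned 64-bit wraparound
// handles backward jumps such as ffffffffffffffea (that is, -0x16).
std::uint64_t branchTarget(std::uint64_t insnAddr, std::uint64_t disp,
                           std::uint64_t insnLen) {
    assert(insnLen >= 1 && insnLen <= 15);  // x86 instructions are 1 to 15 bytes
    return insnAddr + insnLen + disp;
}
```

For example, the jnz at 7f17226f7897 with displacement 0x10 and length 2 resolves to 7f17226f78a9, and the final jmp at 7f17226f78ec with displacement ffffffffffffffea resolves to 7f17226f78d8, the return block.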
Using GDB I get this:
0x00007ffff7b01890 <+0>: cmpl $0x0,0x2d793d(%rip) # 0x7ffff7dd91d4
=> 0x00007ffff7b01897 <+7>: jne 0x7ffff7b018a9 <read+25>
0x00007ffff7b01899 <+9>: mov $0x0,%eax
0x00007ffff7b0189e <+14>: syscall
0x00007ffff7b018a0 <+16>: cmp $0xfffffffffffff001,%rax
0x00007ffff7b018a6 <+22>: jae 0x7ffff7b018d9 <read+73>
0x00007ffff7b018a8 <+24>: retq
0x00007ffff7b018a9 <+25>: sub $0x8,%rsp
0x00007ffff7b018ad <+29>: callq 0x7ffff7b1c9f0
0x00007ffff7b018b2 <+34>: mov %rax,(%rsp)
0x00007ffff7b018b6 <+38>: mov $0x0,%eax
0x00007ffff7b018bb <+43>: syscall
0x00007ffff7b018bd <+45>: mov (%rsp),%rdi
0x00007ffff7b018c1 <+49>: mov %rax,%rdx
0x00007ffff7b018c4 <+52>: callq 0x7ffff7b1ca50
0x00007ffff7b018c9 <+57>: mov %rdx,%rax
0x00007ffff7b018cc <+60>: add $0x8,%rsp
0x00007ffff7b018d0 <+64>: cmp $0xfffffffffffff001,%rax
0x00007ffff7b018d6 <+70>: jae 0x7ffff7b018d9 <read+73>
0x00007ffff7b018d8 <+72>: retq
0x00007ffff7b018d9 <+73>: mov 0x2d1540(%rip),%rcx # 0x7ffff7dd2e20
0x00007ffff7b018e0 <+80>: xor %edx,%edx
0x00007ffff7b018e2 <+82>: sub %rax,%rdx
0x00007ffff7b018e5 <+85>: mov %edx,%fs:(%rcx)
0x00007ffff7b018e8 <+88>: or $0xffffffffffffffff,%rax
0x00007ffff7b018ec <+92>: jmp 0x7ffff7b018d8 <read+72>
So I am basically missing the 2 blocks starting at 0x00007ffff7b01899
and 0x00007ffff7b018a8.
The edge between 0x00007ffff7b01890 and 0x00007ffff7b01899 is classified
as an interprocedural tail call (why?). Shouldn't the block still be part
of the function?
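The comparison above can be reproduced mechanically. Because the two listings come from runs with libc loaded at different addresses, the sketch below first rebases each block start to an offset from the entry of read and then takes the set difference. It is plain C++ with hypothetical helper names (toOffsets, missingBlocks, dyninstStarts, gdbStarts). The GDB block starts are derived by hand from the disassembly (branch targets plus the instructions following a branch, call or return), which is an assumption about where block boundaries fall.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Rebase absolute block start addresses to offsets from the function entry, so
// that listings from differently loaded copies of the library can be compared.
std::set<std::uint64_t> toOffsets(const std::vector<std::uint64_t>& starts,
                                  std::uint64_t functionEntry) {
    std::set<std::uint64_t> offsets;
    for (std::uint64_t addr : starts) {
        assert(addr >= functionEntry);  // every block lies at or after the entry
        offsets.insert(addr - functionEntry);
    }
    return offsets;
}

// Offsets present in `reference` but absent from `observed`.
std::set<std::uint64_t> missingBlocks(const std::set<std::uint64_t>& reference,
                                      const std::set<std::uint64_t>& observed) {
    std::set<std::uint64_t> missing;
    for (std::uint64_t off : reference)
        if (observed.count(off) == 0) missing.insert(off);
    return missing;
}

// Block starts as printed by the Dyninst basic block dump in this message.
std::vector<std::uint64_t> dyninstStarts() {
    return {0x7f17226f7890, 0x7f17226f78a9, 0x7f17226f78b2,
            0x7f17226f78c9, 0x7f17226f78d8, 0x7f17226f78d9};
}

// Block starts derived by hand from the GDB disassembly in this message.
std::vector<std::uint64_t> gdbStarts() {
    return {0x7ffff7b01890, 0x7ffff7b01899, 0x7ffff7b018a8, 0x7ffff7b018a9,
            0x7ffff7b018b2, 0x7ffff7b018c9, 0x7ffff7b018d8, 0x7ffff7b018d9};
}
```

Rebasing the Dyninst starts against 0x7f17226f7890 and the GDB starts against 0x7ffff7b01890, then calling missingBlocks with the GDB set as the reference, leaves offsets 9 and 24, which are the two blocks at 0x00007ffff7b01899 and 0x00007ffff7b018a8.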
Marc
_______________________________________________
Dyninst-api mailing list
Dyninst-api@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api