Hi Bill,
thanks for your answer. I appreciate it very much.
I was actually confused because I expected that once an edge is marked interprocedural, all following edges are interprocedural as well. So I was a bit puzzled when I saw that the edge from <+7> to <+9> was classified as interprocedural and the basic block at <+9> was skipped, while the basic block at <+73> is still included in the function even though it is reachable via the basic block at <+9>, which had just been declared not to be part of the function.
But I forgot that basic blocks might be shared.
So Dyninst creates the following two functions (I suppose; not verified):
F1 F2
x     0x00007ffff7b01890 <+0>: cmpl $0x0,0x2d793d(%rip)
x     0x00007ffff7b01897 <+7>: jne 0x7ffff7b018a9 <read+25>
   x  0x00007ffff7b01899 <+9>: mov $0x0,%eax
   x  0x00007ffff7b0189e <+14>: syscall
   x  0x00007ffff7b018a0 <+16>: cmp $0xfffffffffffff001,%rax
   x  0x00007ffff7b018a6 <+22>: jae 0x7ffff7b018d9 <read+73>
   x  0x00007ffff7b018a8 <+24>: retq
x     0x00007ffff7b018a9 <+25>: sub $0x8,%rsp
x     0x00007ffff7b018ad <+29>: callq 0x7ffff7b1c9f0
x     0x00007ffff7b018b2 <+34>: mov %rax,(%rsp)
x     0x00007ffff7b018b6 <+38>: mov $0x0,%eax
x     0x00007ffff7b018bb <+43>: syscall
x     0x00007ffff7b018bd <+45>: mov (%rsp),%rdi
x     0x00007ffff7b018c1 <+49>: mov %rax,%rdx
x     0x00007ffff7b018c4 <+52>: callq 0x7ffff7b1ca50
x     0x00007ffff7b018c9 <+57>: mov %rdx,%rax
x     0x00007ffff7b018cc <+60>: add $0x8,%rsp
x     0x00007ffff7b018d0 <+64>: cmp $0xfffffffffffff001,%rax
x     0x00007ffff7b018d6 <+70>: jae 0x7ffff7b018d9 <read+73>
x  x  0x00007ffff7b018d8 <+72>: retq
x  x  0x00007ffff7b018d9 <+73>: mov 0x2d1540(%rip),%rcx
x  x  0x00007ffff7b018e0 <+80>: xor %edx,%edx
x  x  0x00007ffff7b018e2 <+82>: sub %rax,%rdx
x  x  0x00007ffff7b018e5 <+85>: mov %edx,%fs:(%rcx)
x  x  0x00007ffff7b018e8 <+88>: or $0xffffffffffffffff,%rax
x  x  0x00007ffff7b018ec <+92>: jmp 0x7ffff7b018d8 <read+72>
But if Dyninst shares basic blocks, I fail to see why the block at <+9> cannot be shared as well, unless "having a single entry point" means that an entry basic block cannot be shared. Is there a technical reason why the entry basic block cannot be shared with another function? Or is it just that Dyninst first declares <+9> to be the entry point of a function and then fails to realise that it is actually a shared block?
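To make the single-entry rule concrete, here is a minimal toy model in C++. It is not Dyninst code and does not reproduce Dyninst's actual parsing algorithm; it only encodes the rule as Bill describes it (every call target starts its own function, and any edge into another function's entry block is treated as interprocedural). The block offsets come from the read listing above; `Cfg`, `readCfg` and `functionBlocks` are hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy CFG: key = offset of a block's first instruction within read(),
// value = offsets of its successor blocks. Calls are modelled only by
// their fallthrough edge. Hypothetical types, not part of Dyninst.
using Cfg = std::map<unsigned, std::vector<unsigned>>;

// The eight basic blocks of read(), taken from the GDB listing.
Cfg readCfg() {
    return {
        {0,  {9, 25}},   // <+0>..<+7>:   jne <+25>, else fall through to <+9>
        {9,  {24, 73}},  // <+9>..<+22>:  jae <+73>, else fall through to <+24>
        {24, {}},        // <+24>:        retq
        {25, {34}},      // <+25>..<+29>: callq, returns to <+34>
        {34, {57}},      // <+34>..<+52>: callq, returns to <+57>
        {57, {72, 73}},  // <+57>..<+70>: jae <+73>, else fall through to <+72>
        {72, {}},        // <+72>:        retq
        {73, {72}},      // <+73>..<+92>: jmp <+72>
    };
}

// Collect the blocks reachable from `entry` without following an edge into
// the entry block of a different function. Such edges are the ones this
// model calls interprocedural.
std::set<unsigned> functionBlocks(const Cfg& cfg, unsigned entry,
                                  const std::set<unsigned>& entries) {
    assert(cfg.count(entry) == 1);
    std::set<unsigned> seen;
    std::vector<unsigned> work{entry};
    while (!work.empty()) {
        unsigned block = work.back();
        work.pop_back();
        if (!seen.insert(block).second) continue;  // already visited
        for (unsigned succ : cfg.at(block)) {
            // Another function's entry: cut the edge here.
            if (succ != entry && entries.count(succ) != 0) continue;
            work.push_back(succ);
        }
    }
    return seen;
}
```

With entries at <+0> and <+9>, the walk from <+0> collects the blocks at offsets 0, 25, 34, 57, 72 and 73, and the walk from <+9> collects 9, 24, 72 and 73. In this model the blocks at <+72> and <+73> end up shared, while <+9> cannot be shared, because every edge that reaches it from another function is cut by definition.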
(I cc'ed the list again because I think this might be worth archiving, unlike the previous message, which just contained a large tar file.)
Marc
On May 24, 2013, at 12:02 AM, Bill Williams wrote:
> On 05/21/2013 08:38 PM, Marc Brünink wrote:
>> Output attached. If you need anything else, just let me know.
>>
>> BTW: setting DYNINST_DEBUG_PARSING=1 leads to a bus error in the mutatee.
>>
>> #0 0x00007f9594e94ed9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6
>> #1 0x00007f9593b0cc10 in t_kill (pid=7054, sig=7) at ../src/RTlinux.c:94
>> #2 0x00007f9593b0d0e3 in DYNINSTbreakPoint () at ../src/RTlinux.c:116
>> #3 0x00007f9593b0e92b in DYNINST_instExitEntry (arg1=0x0) at ../src/RTcommon.c:399
>> #4 0x00007f9593da48b8 in DYNINSTstaticHeap_16M_anyHeap_1 () from /usr/lib/libdyninstAPI_RT.so
>> #5 0x00007f9594db59f8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>> #6 0x0000000000000000 in ?? ()
>>
> Okay, I can explain this at least in part, and the parsing is not a bug
> but it's not intuitive either.
>
> We found a call from another function targeting 7f17226f7899 (the zero
> eax/syscall block). That call causes us to treat that block and the
> following return block as its own micro-function (since it's reached by
> a call instruction), and all edges from read to that function as
> interprocedural. This is a direct consequence of our "functions have
> single entry points" abstraction, which has very nice properties for
> both analysis and instrumentation, but it can produce confusing results
> (as you see here).
>
> If you're just using Dyninst for binary analysis, you may want to open
> your binaries in rewriting mode (openBinary rather than processAttach).
> If you're going to work with a running process, in order to exit
> cleanly, you'll want something like the following after you're done with
> analysis:
>
> do {
>     process->continueExecution();
>     bpatch->waitForStatusChange();
> } while (!process->isTerminated());
>
> to continue the process with Dyninst still attached to it, or
>
> process->detach(true);
>
> to detach and let it exit cleanly. Otherwise, the mutator won't be
> present to handle various bits of instrumentation that we insert into
> the mutatee by default (e.g. for exit callbacks) and the mutatee can
> crash (as some of that instrumentation includes traps). If you still see
> mutatee crashes under DYNINST_DEBUG_PARSING when you're cleaning up
> properly, let me know and I'll see if I can get a fix under the wire for
> 8.1.2.
>
> --bw
>
>>
>> Marc
>>
>>
>> On 21/05/2013 23:19, Bill Williams wrote:
>>> Marc--
>>>
>>> That looks like a bug to me. Can you set the environment variable
>>> DYNINST_DEBUG_PARSING to 1, run your test, and send me the output that
>>> produces?
>>>
>>> Thanks.
>>>
>>> --bw
>>>
>>> Bill Williams
>>> Paradyn Project
>>> bill@xxxxxxxxxxx
>>>
>>> On 05/21/2013 07:12 AM, Marc Brünink wrote:
>>>> Hi,
>>>>
>>>> I just started using Dyninst and have a small question regarding basic
>>>> blocks.
>>>>
>>>> I have a micro test program that opens a file and reads some data from
>>>> it. I am having issues with the basic blocks of the read function.
>>>> Basically I'm missing 2 basic blocks.
>>>>
>>>> Using function.getCFG()->getAllBasicBlocks(bbs) I get the following
>>>> basic blocks:
>>>>
>>>> Basic Block (7f17226f7890 to 7f17226f7899) (entry: 1) (exit: 0):
>>>> 7f17226f7890 cmp [RIP + 2d793d], 0
>>>> 7f17226f7897 jnz 10 + RIP + 2
>>>> Basic Block (7f17226f78a9 to 7f17226f78b2) (entry: 0) (exit: 0):
>>>> 7f17226f78a9 sub RSP, 8
>>>> 7f17226f78ad call 1b13e + RIP + 5
>>>> Basic Block (7f17226f78b2 to 7f17226f78c9) (entry: 0) (exit: 0):
>>>> 7f17226f78b2 mov [ESP], RAX
>>>> 7f17226f78b6 mov RAX, 0
>>>> 7f17226f78bb syscall RCX
>>>> 7f17226f78bd mov RDI, [ESP]
>>>> 7f17226f78c1 mov RDX, RAX
>>>> 7f17226f78c4 call 1b187 + RIP + 5
>>>> Basic Block (7f17226f78c9 to 7f17226f78d8) (entry: 0) (exit: 0):
>>>> 7f17226f78c9 mov RAX, RDX
>>>> 7f17226f78cc add RSP, 8
>>>> 7f17226f78d0 cmp RAX, fffff001
>>>> 7f17226f78d6 jnb/jae/j 1 + RIP + 2
>>>> Basic Block (7f17226f78d8 to 7f17226f78d9) (entry: 0) (exit: 1):
>>>> 7f17226f78d8 ret near [RSP]
>>>> Basic Block (7f17226f78d9 to 7f17226f78ee) (entry: 0) (exit: 0):
>>>> 7f17226f78d9 mov RCX, [RIP + 2d1540]
>>>> 7f17226f78e0 xor RDX, RDX
>>>> 7f17226f78e2 sub RDX, RAX
>>>> 7f17226f78e5 mov [RCX], RDX
>>>> 7f17226f78e8 or RAX, ff
>>>> 7f17226f78ec jmp ffffffffffffffea + RIP + 2
>>>>
>>>>
>>>> Using GDB I get this:
>>>>
>>>> 0x00007ffff7b01890 <+0>: cmpl $0x0,0x2d793d(%rip) # 0x7ffff7dd91d4
>>>> => 0x00007ffff7b01897 <+7>: jne 0x7ffff7b018a9 <read+25>
>>>> 0x00007ffff7b01899 <+9>: mov $0x0,%eax
>>>> 0x00007ffff7b0189e <+14>: syscall
>>>> 0x00007ffff7b018a0 <+16>: cmp $0xfffffffffffff001,%rax
>>>> 0x00007ffff7b018a6 <+22>: jae 0x7ffff7b018d9 <read+73>
>>>> 0x00007ffff7b018a8 <+24>: retq
>>>> 0x00007ffff7b018a9 <+25>: sub $0x8,%rsp
>>>> 0x00007ffff7b018ad <+29>: callq 0x7ffff7b1c9f0
>>>> 0x00007ffff7b018b2 <+34>: mov %rax,(%rsp)
>>>> 0x00007ffff7b018b6 <+38>: mov $0x0,%eax
>>>> 0x00007ffff7b018bb <+43>: syscall
>>>> 0x00007ffff7b018bd <+45>: mov (%rsp),%rdi
>>>> 0x00007ffff7b018c1 <+49>: mov %rax,%rdx
>>>> 0x00007ffff7b018c4 <+52>: callq 0x7ffff7b1ca50
>>>> 0x00007ffff7b018c9 <+57>: mov %rdx,%rax
>>>> 0x00007ffff7b018cc <+60>: add $0x8,%rsp
>>>> 0x00007ffff7b018d0 <+64>: cmp $0xfffffffffffff001,%rax
>>>> 0x00007ffff7b018d6 <+70>: jae 0x7ffff7b018d9 <read+73>
>>>> 0x00007ffff7b018d8 <+72>: retq
>>>> 0x00007ffff7b018d9 <+73>: mov 0x2d1540(%rip),%rcx # 0x7ffff7dd2e20
>>>> 0x00007ffff7b018e0 <+80>: xor %edx,%edx
>>>> 0x00007ffff7b018e2 <+82>: sub %rax,%rdx
>>>> 0x00007ffff7b018e5 <+85>: mov %edx,%fs:(%rcx)
>>>> 0x00007ffff7b018e8 <+88>: or $0xffffffffffffffff,%rax
>>>> 0x00007ffff7b018ec <+92>: jmp 0x7ffff7b018d8 <read+72>
>>>>
>>>>
>>>> So I am basically missing the 2 blocks starting at 0x00007ffff7b01899
>>>> and 0x00007ffff7b018a8.
>>>>
>>>> The edge between 0x00007ffff7b01890 and 0x00007ffff7b01899 is classified
>>>> as an interprocedural tail call (why?). Shouldn't the block still be part
>>>> of the function?
>>>>
>>>> Marc
>>>> _______________________________________________
>>>> Dyninst-api mailing list
>>>> Dyninst-api@xxxxxxxxxxx
>>>> https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api
>>>
>>>
>>>
>>
>
>
> --
> --bw
>
> Bill Williams
> Paradyn Project
> bill@xxxxxxxxxxx