Re: [DynInst_API:] where to find the code for handling switch() statements?


Date: Wed, 23 Aug 2017 09:36:42 +0200
From: Thomas Dullien <thomasdullien@xxxxxxxxxx>
Subject: Re: [DynInst_API:] where to find the code for handling switch() statements?
Hey there,

ok, I identified one of my mistakes that led to poor performance on switch()es -- I had marked the code section
as writeable, and the switch handling heuristics then declare all switch jumptables invalid. Fixing this fixed
my issues with the switch code above :)
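
The effect described here can be modeled in a few lines. This is only a conceptual sketch, not Dyninst code: `Region` and `table_is_trustworthy` are hypothetical names, and the rule itself (trust a jump table only if its bytes lie in a non-writable region, since writable memory could change at run time) is an assumption inferred from the behaviour reported in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical description of one memory region handed to the parser."""
    start: int
    size: int
    writable: bool

    def contains(self, addr: int, length: int) -> bool:
        # True if [addr, addr + length) lies entirely inside this region.
        return self.start <= addr and addr + length <= self.start + self.size


def table_is_trustworthy(regions, table_addr, entry_count, entry_size=4):
    """Accept a jump table only if all of it lies in one non-writable region."""
    length = entry_count * entry_size
    for region in regions:
        if region.contains(table_addr, length):
            return not region.writable
    return False  # the table is not backed by any known region
```

Under this model, flagging the region that holds the table as writable is enough to make every table inside it look invalid, which matches the symptom above.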

Cheers,
Thomas

On Tue, Aug 22, 2017 at 4:06 PM, Thomas Dullien <thomasdullien@xxxxxxxxxx> wrote:
Hey there,

regarding 64-bit PE binaries: I am providing the data to Dyninst myself, so anything that works "disassembly-wise" should work here, too. In the
end, to Dyninst, the code is just a blob of assembly.

Thanks a lot for the hints regarding the env var and the code, digging into it now :-)

Cheers,
Thomas

On Tue, Aug 22, 2017 at 3:33 PM, Xiaozhu Meng <mxz297@xxxxxxxxx> wrote:
Hi Thomas,

While Dyninst fully supports 64-bit ELF binaries, I don't think Dyninst currently works with 64-bit PE binaries. I need to ask others how much effort would be needed if you really want to analyze 64-bit PE binaries.

As for your 32-bit code example, the jump table construct looks very primitive, so I am a little surprised that Dyninst currently fails to analyze it.

To debug this, you can first set the environment variable "DYNINST_DEBUG_PARSING" to 1 and then run your program again. This will dump the complete debugging log. In terms of the code, you want to start with parseAPI/src/IndirectAnalyzer.C, which performs the analysis of the jump tables. It relies on two major pieces: parseAPI/src/JumpTableFormatPred.C, which contains the code to determine the jump table locations, jump table index variables, and other format elements, and parseAPI/src/JumpTableIndexPred.C, which tries to determine the value bounds of the index variables.
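
A minimal sketch of running a tool with that variable set and keeping the log in a file. The helper names are hypothetical, and it is an assumption that the log is written to stdout or stderr of the process that links against Dyninst; both streams are captured to be safe.

```python
import os
import subprocess


def parsing_debug_env(base_env=None):
    """Return a copy of the environment with DYNINST_DEBUG_PARSING set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["DYNINST_DEBUG_PARSING"] = "1"
    return env


def run_with_parsing_log(command, log_path):
    """Run `command` (a list of arguments) and write its output to log_path."""
    with open(log_path, "w") as log:
        completed = subprocess.run(
            command,
            env=parsing_debug_env(),
            stdout=log,
            stderr=subprocess.STDOUT,  # keep both streams in one file
        )
    return completed.returncode
```

For example, `run_with_parsing_log(["./my_parser", "input.bin"], "parse.log")`, where `./my_parser` stands in for whatever program drives ParseAPI.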

In your case, I am guessing that the problem is in JumpTableFormatPred.C.

If you find it difficult to debug this on your own and it is possible to share the problematic binary with me, I can take a look at it.

Thanks,

--Xiaozhu

On Tue, Aug 22, 2017 at 7:50 AM, Thomas Dullien <thomasdullien@xxxxxxxxxx> wrote:
Hey there,

I gave the fork a try, but it does not seem to have handled the switch I encounter either. The construct looks
as follows:

.text:5A6E59FA         push    ebp
.text:5A6E59FB         mov     ebp, esp
.text:5A6E59FD         sub     esp, 18h
.text:5A6E5A00         imul    eax, [ebp+arg_4], 28h
.text:5A6E5A04         push    ebx
.text:5A6E5A05         mov     ebx, [ebp+arg_0]
.text:5A6E5A08         push    esi
.text:5A6E5A09         mov     esi, ecx
.text:5A6E5A0B         mov     [ebp+var_8], 17D7840h
.text:5A6E5A12         add     eax, ebx
.text:5A6E5A14         mov     [ebp+var_14], esi
.text:5A6E5A17         mov     [ebp+var_C], ebx
.text:5A6E5A1A         mov     [ebp+var_18], eax
.text:5A6E5A1D         push    edi
.text:5A6E5A1E         cmp     ebx, eax
.text:5A6E5A20         jnb     loc_5A6E608A
.text:5A6E5A26         lea     eax, [ebx+8]
.text:5A6E5A29         mov     ecx, esi
.text:5A6E5A2B         push    eax
.text:5A6E5A2C         call    (..)
.text:5A6E5A31         mov     edi, eax
.text:5A6E5A33         lea     eax, [ebx+18h]
.text:5A6E5A36         push    eax
.text:5A6E5A37         call    (...)
.text:5A6E5A3C         mov     ecx, eax
.text:5A6E5A3E         mov     eax, [ebx]
.text:5A6E5A40         cmp     eax, 36h        ; switch 55 cases
.text:5A6E5A43         ja      loc_5A6E6095    ; jumptable 5A6E5A49 default case
.text:5A6E5A49         jmp     ds:off_5A6E609A[eax*4] ; switch jump
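
For reference, a rough sketch of what resolving this construct amounts to: the `cmp eax, 36h` / `ja` pair bounds the index to 0..0x36, and each of the 55 table entries is an absolute 32-bit little-endian address. This is not the Dyninst implementation; the function name and the `image` / `image_base` parameters (raw bytes plus the address of the first byte) are assumptions made for illustration.

```python
import struct


def read_absolute_jump_table(image, image_base, table_addr, max_index):
    """Return the max_index + 1 absolute 32-bit targets stored at table_addr."""
    count = max_index + 1  # indices 0..max_index survive the bounds check
    offset = table_addr - image_base
    if offset < 0 or offset + 4 * count > len(image):
        raise ValueError("jump table lies outside the supplied image bytes")
    # Each entry is one little-endian 32-bit code address.
    return list(struct.unpack_from("<%dI" % count, image, offset))
```

With the listing above, the call would be `read_absolute_jump_table(image, image_base, 0x5A6E609A, 0x36)`.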

Any advice on where in the dyninst codebase I should go digging for the switch handling code?

Cheers,
Thomas

On Tue, Aug 22, 2017 at 1:26 PM, Thomas Dullien <thomasdullien@xxxxxxxxxx> wrote:
Hey there,

an example from 32-bit code where the default switch handling fails:

.text:00412990         sub     esp, 50h
.text:00412993         mov     eax, ___security_cookie
.text:00412998         xor     eax, esp
.text:0041299A         mov     [esp+50h+var_4], eax
.text:0041299E         mov     edx, [esp+50h+arg_0]
.text:004129A2         push    ebx
.text:004129A3         mov     ebx, ecx
.text:004129A5         lea     eax, [edx-1]
.text:004129A8         cmp     eax, 6          ; switch 7 cases
.text:004129AB         ja      loc_412F7E      ; jumptable 004129B4 default case
.text:004129B1         push    ebp
.text:004129B2         push    esi
.text:004129B3         push    edi
.text:004129B4         jmp     ds:off_412F90[eax*4] ; switch jump
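
The index computation in this listing can be modeled as follows: `lea eax, [edx-1]` rebases the switch value so that case 1 maps to table index 0, and because `ja` is an unsigned comparison, a single check rejects both values above 7 and the value 0 (which wraps to 0xFFFFFFFF). This is a sketch with a hypothetical function name, not code taken from Dyninst.

```python
def switch_table_index(case_value, bias=1, max_index=6):
    """Model `lea eax, [edx-1]; cmp eax, 6; ja default` on a 32-bit register.

    Returns the jump table index, or None when the default case is taken.
    """
    index = (case_value - bias) & 0xFFFFFFFF  # 32-bit wraparound
    # `ja` compares unsigned, so a wrapped value is also out of range.
    return index if index <= max_index else None
```

An analysis has to recover both the bias and the bound to know that the table at off_412F90 holds exactly 7 entries.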

Enough of this for the moment, though :-)) -- I will check your branch now :-)

Cheers,
Thomas

On Tue, Aug 22, 2017 at 1:24 PM, Thomas Dullien <thomasdullien@xxxxxxxxxx> wrote:
Hey there,

I am back at work on this :-).

A few questions:
 - Is your fork a fork of Dyninst 9?
 - Are there any things I need to be aware of when building it?

The particular scenario I am dealing with right now is the following construct (x86_64 disassembly of
Visual Studio compiled code).

.text:000000014004D970         mov     [rsp+arg_8], edx
.text:000000014004D974         mov     [rsp+arg_0], rcx
.text:000000014004D979         push    rdi
.text:000000014004D97A         sub     rsp, 220h
.text:000000014004D981         mov     rdi, rsp
.text:000000014004D984         mov     ecx, 88h
.text:000000014004D989         mov     eax, 0CCCCCCCCh
.text:000000014004D98E         rep stosd
.text:000000014004D990         mov     rcx, [rsp+228h+arg_0]
.text:000000014004D998         mov     rax, cs:__security_cookie
.text:000000014004D99F         xor     rax, rsp
.text:000000014004D9A2         mov     [rsp+228h+var_18], rax
.text:000000014004D9AA         mov     eax, [rsp+228h+arg_8]
.text:000000014004D9B1         mov     [rsp+228h+var_80], eax
.text:000000014004D9B8         mov     eax, [rsp+228h+var_80]
.text:000000014004D9BF         dec     eax
.text:000000014004D9C1         mov     [rsp+228h+var_80], eax
.text:000000014004D9C8         cmp     [rsp+228h+var_80], 5 ; switch 6 cases
.text:000000014004D9D0         ja      loc_14004EA48   ; jumptable 000000014004D9EF default case
.text:000000014004D9D6         movsxd  rax, [rsp+228h+var_80]
.text:000000014004D9DE         lea     rcx, cs:140000000h
.text:000000014004D9E5         mov     eax, ds:(off_14004EA70 - 140000000h)[rcx+rax*4]
.text:000000014004D9EC         add     rax, rcx
.text:000000014004D9EF         jmp     rax             ; switch jump
.text:000000014004D9F1 ; ---------------------------------------------------------------------------
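
Unlike the 32-bit case, the table here does not hold absolute addresses: each entry is a 32-bit offset that is loaded with `mov eax, ...` (zero-extending into rax) and then added to the base in rcx. A rough sketch of that resolution, with a hypothetical function name and with `image` assumed to be the raw bytes mapped at `image_base`:

```python
import struct


def read_relative_jump_table(image, image_base, table_addr, max_index):
    """Return targets for a table of 32-bit offsets relative to image_base."""
    count = max_index + 1  # `cmp ..., 5` / `ja` leaves indices 0..5
    offset = table_addr - image_base
    if offset < 0 or offset + 4 * count > len(image):
        raise ValueError("jump table lies outside the supplied image bytes")
    rvas = struct.unpack_from("<%dI" % count, image, offset)
    # `add rax, rcx`: zero-extended entry plus the base, modulo 2**64.
    return [(image_base + rva) & 0xFFFFFFFFFFFFFFFF for rva in rvas]
```

For the listing above this would be `read_relative_jump_table(image, 0x140000000, 0x14004EA70, 5)`.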

Cheers,
Thomas

On Tue, Jun 13, 2017 at 4:35 PM, Thomas Dullien <thomasdullien@xxxxxxxxxx> wrote:
Hey there,

excellent, thanks for your quick response :-) I will give your fork a try in the next 2-3 days -- I am currently
at a conference and hence won't have time to try it today :-)

Cheers,
Thomas

On Tue, Jun 13, 2017 at 10:30 AM, Xiaozhu Meng <mxz297@xxxxxxxxx> wrote:
Hi Thomas,

I am working on an improved jump table analysis. Its prototype is available in my Dyninst fork (https://github.com/mxz297/dyninst/tree/jump_table_multi_slices). This improved version should be merged back into mainstream Dyninst in the near future. Could you try my version to see whether it solves your problem? If the problem remains, could you provide me with the problematic binary so that I can further improve my code?

Thanks,

--Xiaozhu

On Tue, Jun 13, 2017 at 7:25 AM, Thomas Dullien <thomasdullien@xxxxxxxxxx> wrote:
Hey all,

I am using DynInst for a small project that helps search for similar
and noticed that most switch statements that it encounters are not
handled properly (i.e. the control flow reconstruction fails to resolve
the switch targets).

Where in the source code should I go looking for the relevant code?
I'd love to have a look around to see if it can be improved.

Cheers,
Thomas

_______________________________________________
Dyninst-api mailing list
Dyninst-api@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api








