Re: [DynInst_API:] Dyninst for embedded cross-arch


Date: Wed, 13 Jul 2016 11:50:46 -0500
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] Dyninst for embedded cross-arch
On 07/13/2016 05:39 AM, Rafael Stahl wrote:
Hi all,

Dyninst has caught our interest at the Institute for Electronic Design Automation at the Technical University of Munich because of its many features, of which symbolic execution and static function analysis are the most interesting to us.

Our project is about source-binary mapping for timing analysis of embedded software in the field of design automation. For this we need basic blocks and a control flow graph. These are currently generated by hand, an approach that struggles with complicated code constructs, indirect jumps, and optimized code in general.

One change we will have to make is to allow binaries to be loaded even when they are not compiled for the same architecture as the host (always x86). I already poked around in the code a bit and it seems to be possible with the existing abstractions. The main features of Dyninst like dynamic analysis and instrumentation/rewriting will obviously not work, but those are of no interest to us anyway. Is there anything that would make the cross-arch analysis task difficult?

Cross-architecture is actually pretty easy. The problem, such as it is, is one of connecting the architecture information from the binary to ParseAPI and from there back down to InstructionAPI, rather than having ParseAPI assume that it's analyzing same-arch binaries. Cross-file-format is harder, and as you note below, analyzing Linux on Windows is comparatively easy. Analyzing Windows on Linux is harder, unless there's been an implementation of a cross-platform version of dbghelp that I haven't heard about.
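
To illustrate the first step, taking the target architecture from the binary itself rather than assuming the host's, here is a minimal Python sketch. It is not Dyninst code: the helper name `elf_target` and the name table are made up for illustration. Only the ELF header layout and the `e_machine` values come from the ELF specification.

```python
import struct

# e_machine values from the ELF specification (a small subset).
EM_NAMES = {3: "x86", 40: "arm", 62: "x86_64", 92: "openrisc", 183: "aarch64"}


def elf_target(header: bytes):
    """Return (arch, bits, byteorder) described by an ELF header.

    Hypothetical helper for illustration; it reads only e_ident and
    e_machine and never consults the host architecture.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])            # EI_CLASS
    order = {1: "little", 2: "big"}.get(header[5])  # EI_DATA
    if bits is None or order is None:
        raise ValueError("invalid ELF identification bytes")
    fmt = "<H" if order == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)  # e_machine
    return EM_NAMES.get(machine, f"unknown({machine})"), bits, order


# Usage: a hand-built header for a 32-bit little-endian ARM executable.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
arm_header = ident + struct.pack("<HH", 2, 40)  # e_type=ET_EXEC, e_machine=EM_ARM
print(elf_target(arm_header))
```

A value obtained this way is what would have to be passed along to the parser and the instruction decoder in place of a compile-time host architecture.
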
Then of course the architectures of interest need to be implemented. Currently these are ARM (32-bit) and OpenRISC. As far as I can tell, this would require at least the register file, helpers to define register semantics, and the InstructionDecoder. Apart from that, there is some smaller or optional work such as jump table recognition. Did I miss something important here?

I think that pretty well covers it; the jump table recognition also depends on instruction semantics that we currently share with the ROSE tool, and those may or may not already be implemented (I believe 32-bit ARM is). Are you currently looking at file formats other than ELF and PE? Those would also need implementation or cross-checking for bugs--we technically handle XCOFF, but that hasn't been tested since we dropped AIX support years ago.
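
As a small illustration of telling the two formats mentioned above apart without relying on the host platform, here is a hedged Python sketch that checks magic bytes only. The helper name `sniff_format` is hypothetical and this is not how Dyninst does it; the signatures themselves (`\x7fELF`, and `MZ` with the `PE\0\0` offset stored at 0x3C) are from the ELF and PE specifications.

```python
import struct


def sniff_format(data: bytes) -> str:
    """Guess the container format from magic bytes (hypothetical helper)."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    # PE files start with a DOS stub ("MZ"); the 4-byte little-endian
    # value at offset 0x3C points to the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"


# Usage: a minimal fake PE image, just enough to carry both signatures.
fake_pe = bytearray(0x44)
fake_pe[0:2] = b"MZ"
struct.pack_into("<I", fake_pe, 0x3C, 0x40)
fake_pe[0x40:0x44] = b"PE\x00\x00"
print(sniff_format(bytes(fake_pe)))
```

Recognizing the container is the easy part; reading symbols and debug information out of it on a foreign platform is where the real work lies.
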
We would also need to target Windows, but in your README you write that only the rewriter is not implemented on Windows. On the other hand, our cross-arch scenario would add dependencies on libiberty, libelf and libdwarf. MinGW seems to come with libiberty, and the others seem to compile on Windows. Can you think of anything else problematic here?

I'd want to ensure that cross-arch is a configurable option, rather than a hard requirement. If you're just doing analysis I'd build directly on top of ParseAPI/DataflowAPI (both in the parseAPI shared object, because of circular dependencies otherwise).
How do you overall estimate the feasibility of our changes and the use of Dyninst for bare-metal embedded applications? Any tips where to be careful or locations to start would be greatly appreciated.

I think the above should cover most of the highlights, including good starting points. Many of the things you want to do have been on my "when time allows" list for quite a while, and I'm excited to see someone picking them up. When you get to instruction decoding, Sunny Shah and I can share our experiences with building InstructionDecoders and the associated tables; Sunny's done the ARMv8 64-bit work.

Feel free to post your work in progress as issues and pull requests on our main GitHub repo: https://github.com/dyninst/dyninst. Each individual piece of making Dyninst more cross-platform capable (and each new platform that we add analysis capabilities for) is a valuable thing; you don't need to hand us a giant patch for a finished product all at once.

--bw

Regards
Rafael Stahl