Re: [DynInst_API:] [SymtabAPI] Question regarding object files


Date: Tue, 30 Jun 2015 12:21:16 -0500
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] [SymtabAPI] Question regarding object files
On 06/30/2015 11:41 AM, Tony Zhang wrote:
Hi Bill,

Thanks for the prompt response.

We have a few follow-up questions that we hope you can help answer:

1. Why the two different APIs for this? Is there a significant
performance benefit to opening the object file from memory vs disk?

No. If I recall correctly, this was first implemented to handle self-unpacking code, where there is no disk representation but we might like to pretend that there's an "object file" anyway. About 95% of the time (closer to 99% if you don't already know you want the memory interface) the disk interface is strictly superior; the memory interface will lose a ton of information that lives on disk but doesn't get mapped in when a binary is loaded.

2. We’re not quite sure what you meant by “mapped region in a currently
running process” - is this something that’s part of the ELF format? If
you could give us a brief explanation or some resources we could use to
understand this, that would be great.

No, it's not part of ELF. What I mean is that if you're looking at an existing process (yours, or someone else's through shared memory) and read /proc/<pid>/maps (/proc/self/maps for your own process) to find where a given object file was loaded, you could point Symtab at that region of memory and treat it as an object file. Generally not terribly useful, but you can if you want to.
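For illustration, here is a minimal C++ sketch of the "find where an object file was loaded" step on Linux. It only parses the text format of /proc/<pid>/maps and does not call SymtabAPI; the names MappedRegion, parseMapsLine and regionsFor are made up for this example. Keep in mind that a loaded image is laid out by segment rather than as the on-disk file, so a region found this way will generally be missing information that the disk file has.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One row of /proc/<pid>/maps (Linux). The struct and its field names are
// hypothetical, chosen for this sketch only.
struct MappedRegion {
    std::uintptr_t start = 0;  // first address of the mapping
    std::uintptr_t end = 0;    // one past the last address
    std::string perms;         // e.g. "r-xp"
    std::string path;          // backing file; empty for anonymous mappings
};

// Parse a line such as
//   "00400000-0040b000 r-xp 00000000 08:01 131  /bin/cat"
// Returns false if the line does not have the expected shape.
bool parseMapsLine(const std::string &line, MappedRegion &out) {
    std::istringstream in(line);
    std::string range, offset, dev, inode;
    if (!(in >> range >> out.perms >> offset >> dev >> inode)) return false;

    const std::size_t dash = range.find('-');
    if (dash == std::string::npos) return false;
    try {
        out.start = static_cast<std::uintptr_t>(
            std::stoull(range.substr(0, dash), nullptr, 16));
        out.end = static_cast<std::uintptr_t>(
            std::stoull(range.substr(dash + 1), nullptr, 16));
    } catch (const std::exception &) {
        return false;  // address fields were not hexadecimal numbers
    }

    // The path column is optional; anonymous mappings leave it blank.
    out.path.clear();
    std::getline(in >> std::ws, out.path);
    return out.start < out.end;
}

// Collect every mapping backed by objectPath from a maps-formatted stream.
std::vector<MappedRegion> regionsFor(std::istream &maps,
                                     const std::string &objectPath) {
    // An empty path would match every anonymous mapping, which is almost
    // certainly a caller error.
    assert(!objectPath.empty());
    std::vector<MappedRegion> found;
    std::string line;
    while (std::getline(maps, line)) {
        MappedRegion r;
        if (parseMapsLine(line, r) && r.path == objectPath) found.push_back(r);
    }
    return found;
}
```

To inspect your own process you would open std::ifstream maps("/proc/self/maps") (this needs <fstream>) and pass it to regionsFor along with the full path of the library or executable you care about.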

3. In the file open API signature : static bool openFile(Symtab *&obj,
char *mem_image, size_t size,  std::string name) - we’re not quite sure
what “mem_image” is supposed to be. The doc says that “mem_image
represents the pointer to the Object file in memory to be parsed.” -
where does this information come from? The first version of the API has
“string filename” to be loaded from disk, but we’re not quite sure where
mem_image is obtained from.

Take the buffer represented by mem_image and size, assign it a name, and treat it as if you found it on disk. This always comes from some external knowledge that mem_image points to something you want to treat as an object file--you could, for instance, use this to examine a file that's come in over a network before writing it to disk. Or you could use it for dynamically unpacked/JITed code.
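As a rough sketch of what "external knowledge" can look like in practice, the C++ below pulls bytes from an arbitrary stream into a buffer and does a cheap sanity check for the ELF magic number before anything is handed to a parser. The helper names slurp and looksLikeElf are invented for this example, and the SymtabAPI call appears only as a comment: its signature is the one quoted in the question, while the header and namespace shown there are assumptions to check against the Symtab manual.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <iterator>
#include <vector>

// Read everything from a stream (a socket wrapper, a pipe, a file, ...)
// into a contiguous buffer. Hypothetical helper for this sketch.
std::vector<char> slurp(std::istream &in) {
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Cheap check that a buffer starts with the ELF magic bytes 0x7f 'E' 'L' 'F'.
// This says nothing about whether the rest of the image is well formed.
bool looksLikeElf(const char *mem_image, std::size_t size) {
    assert(mem_image != nullptr || size == 0);
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return size >= sizeof magic &&
           std::memcmp(mem_image, magic, sizeof magic) == 0;
}

// With Dyninst installed, the buffer could then be handed to the in-memory
// interface. Signature as quoted in this thread; header and namespace are
// assumptions:
//
//   #include "Symtab.h"
//   Dyninst::SymtabAPI::Symtab *obj = nullptr;
//   std::vector<char> buf = slurp(someStream);
//   if (looksLikeElf(buf.data(), buf.size()) &&
//       Dyninst::SymtabAPI::Symtab::openFile(obj, buf.data(), buf.size(),
//                                            "received-object")) {
//       // obj can now be queried like a Symtab opened from disk
//   }
```

The name passed as the last argument ("received-object" here) is just a label you choose for the buffer; the buffer must stay alive for as long as you expect the parser to read from it.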

4. Could you point us to a few example programs that use symtabAPI
(similar to the rich set of dyninstAPI examples)?

We don't have a ton of Symtab examples around (though there are many projects that use Symtab for various purposes); I think there are a couple in the Symtab manual. The Symtab test cases (in the separate test suite repository) and many of the other Dyninst components (including Dyninst itself) provide lots of examples of use, though. If you've got specific interfaces you want to know about, ask and I can probably point you in the right direction.

Thank you again,
Tony and Srihari.


On Jun 30, 2015, at 12:14 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:

Somewhat more precisely, on disk = object file; in memory = already
mmapped for whatever reason. This could be a mapped region in a
currently running process, or because your tool already mapped in the
object file for other reasons, or because you're pointing Symtab at
dynamically generated ELF-formatted stuff before it hits disk.


--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx