On 06/30/2015 11:41 AM, Tony Zhang wrote:
Hi Bill,
Thanks for the prompt response.
We have a few follow-up questions, if you could please help answer them:
1. Why the two different APIs for this? Is there a significant
performance benefit to opening the object file from memory vs disk?
No. If I recall correctly, this was first implemented to handle
self-unpacking code, where there is no disk representation but we might
like to pretend that there's an "object file" anyway. About 95% of the
time (closer to 99% if you don't already know you want the memory
interface) the disk interface is strictly superior; the memory interface
will lose a ton of information that lives on disk but doesn't get mapped
in when a binary is loaded.
2. We’re not quite sure what you meant by “mapped region in a currently
running process” - is this something that’s part of the ELF format? If
you could give us a brief explanation or some resources we could use to
understand this, that would be great.
No, it's not part of ELF. What I mean is that if you're looking at an
existing process (yours, or someone else's through shared memory) and
read /proc/<pid>/maps to find where a given object file was loaded, you
could point Symtab at that region of memory and treat it as an object
file. Generally not terribly useful, but you can if you want to.
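For illustration, here is a rough sketch of the "find where it was loaded" step on Linux. It only parses the text format of /proc/<pid>/maps and does not touch Symtab. The names MappedRegion, parseMapsLine and regionsFor are made up for this example and are not part of any Dyninst interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One line of /proc/<pid>/maps: "start-end perms offset dev inode [path]".
struct MappedRegion {
    std::uintptr_t start = 0;
    std::uintptr_t end = 0;
    std::string perms;
    std::uint64_t offset = 0;   // offset into the backing file
    std::string path;           // empty for anonymous mappings
    std::size_t size() const { assert(end >= start); return end - start; }
};

// Parse a single maps line; returns false if the line is malformed.
bool parseMapsLine(const std::string &line, MappedRegion &r) {
    std::istringstream in(line);
    std::string range, dev;
    std::uint64_t inode = 0;
    if (!(in >> range >> r.perms >> std::hex >> r.offset >> dev >> std::dec >> inode))
        return false;
    const std::string::size_type dash = range.find('-');
    if (dash == std::string::npos) return false;
    try {
        r.start = std::stoull(range.substr(0, dash), nullptr, 16);
        r.end = std::stoull(range.substr(dash + 1), nullptr, 16);
    } catch (const std::exception &) {
        return false;
    }
    r.path.clear();
    in >> std::ws;              // skip the padding before the optional path
    std::getline(in, r.path);   // leaves path empty when there is none
    return true;
}

// Collect every mapping backed by the given object file.
std::vector<MappedRegion> regionsFor(std::istream &maps, const std::string &objectPath) {
    std::vector<MappedRegion> out;
    std::string line;
    while (std::getline(maps, line)) {
        MappedRegion r;
        if (parseMapsLine(line, r) && r.path == objectPath) out.push_back(r);
    }
    return out;
}
```

For your own process you would feed this a std::ifstream opened on /proc/self/maps. An object file normally shows up as several regions (one per loaded segment), and the one with offset 0 is typically where the ELF header sits. Reading another process's memory takes more than this and is not covered here.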
3. In the file open API signature: static bool openFile(Symtab *&obj,
char *mem_image, size_t size, std::string name) - we’re not quite sure
what “mem_image” is supposed to be. The doc says that “mem_image
represents the pointer to the Object file in memory to be parsed.” -
where does this information come from? The first version of the API has
“string filename” to be loaded from disk, but we’re not quite sure where
mem_image is obtained from.
Take the buffer represented by mem_image and size, assign it a name, and
treat it as if you found it on disk. This always comes from some
external knowledge that mem_image points to something you want to treat
as an object file--you could, for instance, use this to examine a file
that's come in over a network before writing it to disk. Or for
dynamically unpacked/JITed code.
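As a minimal sketch of where mem_image and size might come from: the helpers below fill a buffer with bytes and sanity-check the ELF magic before anything is handed to Symtab. readImage and looksLikeElf are hypothetical names for this example. The file read just stands in for whatever actually produced the bytes (a socket, an unpacker, a JIT). The hand-off in the trailing comment uses the openFile signature quoted in the question and assumes the usual Dyninst::SymtabAPI namespace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Fill `image` with the raw bytes of `path`. Stand-in for any byte source.
bool readImage(const std::string &path, std::vector<char> &image) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    image.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return true;
}

// Cheap check that a buffer starts with the ELF magic bytes 0x7f 'E' 'L' 'F'.
bool looksLikeElf(const char *mem_image, std::size_t size) {
    assert(mem_image != nullptr || size == 0);
    static const char kMagic[4] = {0x7f, 'E', 'L', 'F'};
    return size >= sizeof kMagic &&
           std::memcmp(mem_image, kMagic, sizeof kMagic) == 0;
}

// With Dyninst available, the hand-off would presumably look like this
// (untested here; "received.so" is just a label chosen by the caller):
//
//   Dyninst::SymtabAPI::Symtab *obj = nullptr;
//   if (looksLikeElf(image.data(), image.size()))
//       Dyninst::SymtabAPI::Symtab::openFile(obj, image.data(),
//                                            image.size(), "received.so");
```

The buffer has to stay alive for as long as you use the resulting Symtab object, unless you have checked that Symtab copies it. The thread does not say either way.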
4. Could you point us to a few example programs that use symtabAPI
(similar to the rich set of dyninstAPI examples)?
We don't have a ton of Symtab examples around (though there are many
projects that use Symtab for various purposes); I think there are a
couple in the Symtab manual. The Symtab test cases (in the separate test
suite repository) and many of the other Dyninst components (including
Dyninst itself) provide lots of examples of use, though. If you've got
specific interfaces you want to know about, ask and I can probably point
you in the right direction.
Thank you again,
Tony and Srihari.
On Jun 30, 2015, at 12:14 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:
Somewhat more precisely, on disk = object file; in memory = already
mmapped for whatever reason. This could be a mapped region in a
currently running process, or because your tool already mapped in the
object file for other reasons, or because you're pointing Symtab at
dynamically generated ELF-formatted stuff before it hits disk.
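To make the "your tool already mapped in the object file" case concrete, here is a small POSIX-only sketch of a read-only file mapping whose pointer and length could serve as mem_image and size. FileMapping is a hypothetical helper for this example, not a Dyninst class. Whether Symtab is happy with a read-only mapping is an assumption this thread does not settle.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// RAII wrapper around a private, read-only mapping of a whole file.
class FileMapping {
public:
    explicit FileMapping(const char *path) {
        const int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st;
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void *p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                             PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<char *>(p);
                size_ = static_cast<std::size_t>(st.st_size);
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }
    ~FileMapping() {
        if (data_ != nullptr) ::munmap(data_, size_);
    }
    FileMapping(const FileMapping &) = delete;
    FileMapping &operator=(const FileMapping &) = delete;

    bool ok() const { return data_ != nullptr; }
    char *data() const { return data_; }
    std::size_t size() const { assert(ok()); return size_; }

private:
    char *data_ = nullptr;
    std::size_t size_ = 0;
};
```

A tool that already holds such a mapping could pass data() and size() to the memory variant of openFile instead of having Symtab open the file a second time. If nothing else needs the mapping, the disk interface is the simpler choice.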
--
--bw
Bill Williams
Paradyn Project
bill@xxxxxxxxxxx