Dear list,
I am writing to ask how to use DynInst to recognize function
entry points (memory addresses) in stripped binaries.
I have installed the 32-bit build of DynInst 9.10, and I use
a DynInst script that iterates over all the functions in each
module. The following calls dump the function entry point
addresses from stripped binaries:
.......
vector<BPatch_module *> *modules = appImage->getModules();
......
vector<BPatch_function *> *funcs = (*module_iter)->getProcedures();
vector<BPatch_function *>::iterator func_iter;
for (func_iter = funcs->begin(); func_iter != funcs->end(); ++func_iter) {
    char functionName[1024];
    (*func_iter)->getName(functionName, 1024);
    cout << "-- Function : " << functionName << " --" << endl;
......
I extract the function entry point addresses from the
function names.
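
To make that step concrete, here is a minimal sketch of
pulling an address out of a placeholder name. It assumes that
a function without a symbol is reported under a fixed prefix
followed by its entry address in hexadecimal, for example
"targ80483f0". Both the helper parseEntryFromName and the
"targ" prefix are illustrative assumptions; the prefix should
be adjusted to whatever the dumped names actually look like.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: parses names of the form <prefix><hex digits>,
// e.g. "targ80483f0", and stores the address in `out`.
// Returns false (leaving `out` untouched) if the name has another shape.
bool parseEntryFromName(const std::string &name,
                        const std::string &prefix,
                        std::uint64_t &out) {
    // The name must start with the prefix and have at least one digit.
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0) {
        return false;
    }
    const std::string digits = name.substr(prefix.size());
    if (digits.size() > 16) {
        return false;  // more than 16 hex digits would overflow 64 bits
    }
    std::uint64_t value = 0;
    for (std::string::size_type i = 0; i < digits.size(); ++i) {
        const char c = digits[i];
        unsigned d;
        if (c >= '0' && c <= '9')      d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
        else return false;             // not a hexadecimal digit
        value = (value << 4) | d;
    }
    out = value;
    return true;
}
```

Called as parseEntryFromName(functionName, "targ", addr), it
rejects ordinary symbol names such as "main", so names that do
not encode an address are skipped rather than misread.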
I tested some Coreutils binaries compiled with LLVM at the O2
optimization level, and the precision and recall are generally
very good. Precision: 0.99; Recall: 0.91
According to this paper (Section 6.2), on average DynInst
achieves over 0.97 precision and 0.93 recall on 32-bit ELF
binaries. That is consistent with my test. Still, I am not
sure whether I did everything correctly.
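
For reference, a minimal sketch of how precision and recall
can be computed from two sets of entry addresses. It assumes
precision = (correctly reported entries) / (all reported
entries) and recall = (correctly reported entries) / (all true
entries), with the ground truth taken from the unstripped
binary. The names Score and scoreEntries are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical result type for the comparison below.
struct Score {
    double precision;  // true positives / reported entries
    double recall;     // true positives / ground-truth entries
};

// Compares the reported entry addresses against the ground truth.
// An empty set yields 0.0 for the corresponding ratio.
Score scoreEntries(const std::set<std::uint64_t> &reported,
                   const std::set<std::uint64_t> &truth) {
    std::size_t truePositives = 0;
    for (std::set<std::uint64_t>::const_iterator it = reported.begin();
         it != reported.end(); ++it) {
        if (truth.count(*it) != 0) {
            ++truePositives;
        }
    }
    Score s;
    s.precision = reported.empty()
        ? 0.0
        : static_cast<double>(truePositives) / reported.size();
    s.recall = truth.empty()
        ? 0.0
        : static_cast<double>(truePositives) / truth.size();
    return s;
}
```

Using sets means an address reported twice is counted only
once, so duplicate names do not inflate the precision.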
So here are my questions:
1. It seems that DynInst uses a machine learning method to
recognize functions, which would require a training step
before recognition. I did not do any training (although the
results are pretty good). Is there anything in particular I
have to do before using DynInst?