Re: [DynInst_API:] getType() for Expression/InstructionAPI?


Date: Sat, 24 Nov 2012 19:59:19 -0600
From: Andrew Bernat <bernat@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] getType() for Expression/InstructionAPI?
You can get all of that, though some bits aren't entirely obvious. 

Mnemonic: 
  insn->getOperation().getID(); returns an entryID that maps (roughly) to the mnemonic. 
Number of operands:
  insn->getOperands(...); like you have below. Note that we represent all operands, including implicit ones, so e.g. push will have more operands than are normally shown. 
Type of operand: 
  You'll want to use a visitor class on the Expression for this one. If you're not familiar with the paradigm, a visitor is a small class with a method, "visit", that is overloaded for each type of node in a tree (in this case, BinaryFunction, Dereference, Immediate, and RegisterAST). InstructionAPI then applies a provided instance of the visitor class to the Expression. So you'd want something like:

// Records which kinds of nodes appear anywhere in an operand's expression tree.
struct OperandType : public Dyninst::InstructionAPI::Visitor {
    OperandType() : memRef(false), imm(false), reg(false) {}
    virtual void visit(Dyninst::InstructionAPI::BinaryFunction *) {}  // arithmetic node: nothing to record
    virtual void visit(Dyninst::InstructionAPI::Dereference *) { memRef = true; }
    virtual void visit(Dyninst::InstructionAPI::Immediate *) { imm = true; }
    virtual void visit(Dyninst::InstructionAPI::RegisterAST *) { reg = true; }
    bool memRef, imm, reg;
};

Given an Expression expr (as below), you'd then do:
  OperandType o;
  expr->apply(&o);  // apply() walks the tree and calls o.visit() on each node
  if (o.memRef) cerr << "Memory reference" << endl;
  if (o.imm) cerr << "Immediate used" << endl;
  if (o.reg) cerr << "Register used" << endl;

Note that this is probably not exactly what you want, since all three flags can end up set for a single operand (e.g., a dereference of (eax + 4)). We traverse the tree bottom up, so you could have the Dereference handler reset the register and immediate booleans to false; it depends on what you want to return. 
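
As a rough sketch of the "dereference resets the other flags" idea: the node classes below are simplified stand-ins written only for illustration, not the real Dyninst types, and the names OperandKind, OperandKindVisitor, and classify() are hypothetical. The only property assumed from InstructionAPI is the bottom-up order described above (children are visited before their parent). The rule for an operand that has both a register and an immediate but no dereference is an arbitrary choice made here.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins for the four node types (NOT the Dyninst classes).
struct BinaryFunction; struct Dereference; struct Immediate; struct RegisterAST;

struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(BinaryFunction*) = 0;
    virtual void visit(Dereference*) = 0;
    virtual void visit(Immediate*) = 0;
    virtual void visit(RegisterAST*) = 0;
};

struct Expr {
    virtual ~Expr() = default;
    virtual void apply(Visitor* v) = 0;  // visits children first, then this node
};
using ExprPtr = std::shared_ptr<Expr>;

struct Immediate : Expr {
    long value;
    explicit Immediate(long v) : value(v) {}
    void apply(Visitor* v) override { v->visit(this); }
};
struct RegisterAST : Expr {
    std::string name;
    explicit RegisterAST(std::string n) : name(std::move(n)) {}
    void apply(Visitor* v) override { v->visit(this); }
};
struct BinaryFunction : Expr {
    ExprPtr lhs, rhs;
    BinaryFunction(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    void apply(Visitor* v) override { lhs->apply(v); rhs->apply(v); v->visit(this); }
};
struct Dereference : Expr {
    ExprPtr addr;
    explicit Dereference(ExprPtr a) : addr(std::move(a)) {}
    void apply(Visitor* v) override { addr->apply(v); v->visit(this); }
};

enum class OperandKind { Unknown, Reg, Imm, Mem };

struct OperandKindVisitor : Visitor {
    bool memRef = false, imm = false, reg = false;
    void visit(BinaryFunction*) override {}
    // Reached after the address subtree, so it can discard the flags set there.
    void visit(Dereference*) override { memRef = true; imm = false; reg = false; }
    void visit(Immediate*) override { imm = true; }
    void visit(RegisterAST*) override { reg = true; }

    OperandKind kind() const {
        if (memRef) return OperandKind::Mem;
        if (reg) return OperandKind::Reg;  // assumption: register wins over immediate
        if (imm) return OperandKind::Imm;
        return OperandKind::Unknown;
    }
};

OperandKind classify(const ExprPtr& e) {
    OperandKindVisitor v;
    e->apply(&v);
    return v.kind();
}

// Builds the "dereference of (eax + 4)" example and checks how it is classified.
void selfTest() {
    ExprPtr eax = std::make_shared<RegisterAST>("eax");
    ExprPtr four = std::make_shared<Immediate>(4);
    ExprPtr mem = std::make_shared<Dereference>(
        std::make_shared<BinaryFunction>(eax, four));
    assert(classify(mem) == OperandKind::Mem);
    assert(classify(eax) == OperandKind::Reg);
    assert(classify(four) == OperandKind::Imm);
}
```

With the real library, only the visitor part would carry over; the tree itself comes from the operand's getValue().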

Finally, size of operand. Each Expression has a size() method that should give you what you want. It's in bytes. 

Drew    

On Nov 24, 2012, at 6:46 AM, Wei Ming Khoo <weimzz@xxxxxxxxx> wrote:

Hi,

I'm trying to extract 4 pieces of information from a disassembly (assume 32bit x86 for now): mnemonic, number of operands, operand type (register, memory reference, immediate value), and operation width (8/16/32 bit).

I got as far as

Instruction::Ptr insn;
...
Operation operation = insn->getOperation();
std::vector<Operand> operands;
insn->getOperands(operands);

for (size_t i=0; i<operands.size(); i++){
            Expression::Ptr expr;
            expr = insn->getOperand(i).getValue();
            std::vector<Expression::Ptr> children;
            expr->getChildren(children);
            // %zu matches the size_t returned by size()
            printf("%s %zu ", expr->format().c_str(), children.size() );
}

Here's where I got stuck. How do I distinguish between a register, a memory reference and an immediate? An ugly hack would be to look at the expr->format().c_str() string to decide, but I was wondering if there is a more elegant way to do this?

Regards,
--wm
_______________________________________________
Dyninst-api mailing list
Dyninst-api@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api



