Re: [DynInst_API:] DynInst Overhead


Date: Mon, 21 Jul 2014 15:55:30 -0400
From: Buddhika Chamith Kahawitage Don <budkahaw@xxxxxxxxxxxx>
Subject: Re: [DynInst_API:] DynInst Overhead
My earlier mail is being held for moderator approval. In the meantime, here is a small snippet from the output; I hope it is enough.


createRelocSpringboards for 400dd6
Looking for addr b7fb96 in function _init
getRelocAddrs for orig addr 400dd6 /w/ block start 400dd6
getRelocAddrs for orig addr 400dd6 /w/ block start 400dd6
Adding branch: 400dd6 -> 400ddb
    Inserting taken space 400dd6 -> 400ddb /w/ range 0
Generated springboard branch 400dd1->b7fafe
Conflict called for 400dd1->400dd6
    looking for 400dd1
        Found 400dd1 -> 400dd6 /w/ state 1e
    No conflict, we're good
createRelocSpringboards for 400dd1
Looking for addr b7fafe in function _init
getRelocAddrs for orig addr 400dd1 /w/ block start 400dd1
getRelocAddrs for orig addr 400dd1 /w/ block start 400dd1
Adding branch: 400dd1 -> 400dd6
    Inserting taken space 400dd1 -> 400dd6 /w/ range 0
Installing 15980 springboards!
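
A full springboard log is long (this run installs 15980 springboards), so it can help to tally it before reading it line by line. The following is a minimal C++ sketch that counts a few line types seen in the snippet above. The matched prefixes are taken only from that snippet and are assumptions about the log format, not a documented Dyninst interface; the names SpringboardSummary and summarizeSpringboardLog are hypothetical. The tallies alone do not say whether a springboard was implemented as a branch or a trap.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Tallies for a DYNINST_DEBUG_SPRINGBOARD log. The prefixes matched below are
// copied from the pasted snippet; they are assumptions, not a documented format.
struct SpringboardSummary {
    std::size_t generated = 0;       // "Generated springboard branch ..." lines
    std::size_t conflictChecks = 0;  // "Conflict called for ..." lines
    std::size_t noConflict = 0;      // "No conflict ..." lines
    std::size_t installed = 0;       // N from the last "Installing N springboards!"
};

// True if `s` begins with `prefix`.
static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Reads the log line by line and counts the line types listed above.
SpringboardSummary summarizeSpringboardLog(std::istream& in) {
    const std::string installing = "Installing ";
    SpringboardSummary out;
    std::string line;
    while (std::getline(in, line)) {
        // Drop leading spaces and tabs so indented sub-lines are matched too.
        const std::size_t first = line.find_first_not_of(" \t");
        if (first == std::string::npos) continue;  // blank line
        const std::string body = line.substr(first);

        if (startsWith(body, "Generated springboard branch")) {
            ++out.generated;
        } else if (startsWith(body, "Conflict called for")) {
            ++out.conflictChecks;
        } else if (startsWith(body, "No conflict")) {
            ++out.noConflict;
        } else if (startsWith(body, installing)) {
            // Parse the number that follows "Installing ".
            std::istringstream fields(body.substr(installing.size()));
            std::size_t n = 0;
            if (fields >> n) out.installed = n;
        }
    }
    return out;
}
```

To use it, open the saved rewriter output with std::ifstream and pass the stream to summarizeSpringboardLog, then print the four counters.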



On Mon, Jul 21, 2014 at 3:41 PM, Buddhika Chamith Kahawitage Don <budkahaw@xxxxxxxxxxxx> wrote:
Please find the output in attached file.

Regards
Bud


On Mon, Jul 21, 2014 at 3:13 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:
On 07/21/2014 01:59 PM, Buddhika Chamith Kahawitage Don wrote:
Please find my responses inline.

On Mon, Jul 21, 2014 at 1:48 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:

  On 07/21/2014 11:52 AM, Matthew LeGendre wrote:


    Presumably you're running the CodeCoverage tool in two steps: 1)
    Rewriting the binary 2) Running the rewritten binary. All of the
    analysis/rewriting overheads are in step 1, and the instrumentation
    overhead can be measured just by timing step 2.


That's true.


    If you're getting 50x overhead on just step 2 then something's very
    wrong. I've got my own codeCoverage tool (which I unfortunately can't
    share yet) and I only see 10% overhead.
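
To compare figures like "50x" and "10%" on the same scale, the arithmetic can be sketched in C++ as below. This assumes "50x" means the rewritten binary takes about 50 times as long as the original in step 2, and "10%" means it takes about 1.1 times as long. The function names are hypothetical and the timings are whatever wall-clock measurements you take yourself.

```cpp
#include <stdexcept>

// Slowdown factor of the rewritten binary relative to the original:
// instrumented run time divided by baseline run time.
double slowdownFactor(double baselineSeconds, double instrumentedSeconds) {
    if (baselineSeconds <= 0.0) {
        throw std::invalid_argument("baseline time must be positive");
    }
    return instrumentedSeconds / baselineSeconds;
}

// The same measurement expressed as percent overhead:
// a factor of 1.1 is 10% overhead, a factor of 50 is 4900% overhead.
double overheadPercent(double baselineSeconds, double instrumentedSeconds) {
    return (slowdownFactor(baselineSeconds, instrumentedSeconds) - 1.0) * 100.0;
}
```

For example, a baseline of 2 seconds and a rewritten run of 100 seconds gives a slowdown factor of 50, which on this reading is the case reported in this thread.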

  Hrm. If this is with a prebuilt, statically linked binary and not
  with a build from source against current Dyninst, we may also be
  hitting traps in an inner loop. That's more the right order of
  magnitude than trampguards would be--trampguards would be in the
  1.5-5x sort of neighborhood off the top of my head.


In fact that was the use case I had in my mind. But I was just checking
the static rewriting case first up since it was readily available with
code-coverage tool.


Sorry, I meant a statically linked version of the CodeCoverage tool; apologies for the confusion.


  The source for CodeCoverage (which you can build against the latest
  Dyninst and be reasonably sure of *not* hitting traps in almost all
  of SPEC) is in our tools.git repository. I know we've fixed some
  performance regressions that turned up between the AWAT paper and 8.1.2.


I am using dyninst 8.1.2 which I built from source.

Then yes, it's probably trap overhead, and 8.2 should fix it--I believe h264 was on the list of benchmarks that had a performance regression that we've fixed for the current release.

If you set DYNINST_DEBUG_SPRINGBOARD=1 in your environment and send me the output of the rewriting pass with that enabled, I'll be able to confirm the cause (and status) of this problem.



    Just an educated guess--I frequently see big overheads associated with
    trampoline guards. Dyninst should have realized trampoline guards are
    unnecessary for codeCoverage and not emitted them. But if something went
    wrong you can manually turn them off by putting a call to:

      bpatch.setTrampRecursive(true);



Tried it without any success :(


    Near the top of codeCoverage.C's main() function. If that makes a
    difference then let the list know. That implies there's a bug that
    should be investigated.


Any ideas on how to debug this?

Thanks
Bud


--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx

