Re: [DynInst_API:] DynInst Overhead


Date: Mon, 21 Jul 2014 16:04:59 -0400
From: Buddhika Chamith Kahawitage Don <budkahaw@xxxxxxxxxxxx>
Subject: Re: [DynInst_API:] DynInst Overhead

Where can I get the sources of 8.2? I didn't see any links for 8.2 in the site.

Regards
Bud

Sent from my mobile.

On Jul 21, 2014 4:01 PM, "Bill Williams" <bill@xxxxxxxxxxx> wrote:
I just cleared the message with the full log to the list, but yes, there are some traps being installed (grep -C2 "ret conflict" to see where they're going, but there are a nontrivial number of them). The only SPEC benchmark that we can instrument cleanly (so not omnetpp or povray) that still suffers from serious trap overhead on the 8.2 branch, AFAIK, is gcc--and that's on the order of 50%, not 50x.
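A rough Python equivalent of that grep, for anyone who wants to count the hits as well as see their context. This is only a sketch: the function names are made up for illustration, and the only thing taken from the thread is the "ret conflict" search string.

```python
def grep_context(lines, needle, context=2):
    """Return (line_number, line) pairs for each line containing `needle`,
    plus `context` lines on either side, roughly like `grep -C2`."""
    keep = set()
    for i, line in enumerate(lines):
        if needle in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    # Line numbers are 1-based, as grep -n would print them.
    return [(i + 1, lines[i]) for i in sorted(keep)]


def count_matches(lines, needle):
    """Count how many lines mention `needle` (e.g. "ret conflict")."""
    return sum(1 for line in lines if needle in line)
```

With the rewriter output saved to a file, `lines = open(path).read().splitlines()` followed by `count_matches(lines, "ret conflict")` gives a quick sense of how many trap sites there are.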

On 07/21/2014 02:55 PM, Buddhika Chamith Kahawitage Don wrote:
My earlier mail is being held for moderator approval. Anyway, let me
just paste a small snippet from the output. Hopefully that is enough.


  createRelocSpringboards for 400dd6
  Looking for addr b7fb96 in function _init
  getRelocAddrs for orig addr 400dd6 /w/ block start 400dd6
  getRelocAddrs for orig addr 400dd6 /w/ block start 400dd6
  Adding branch: 400dd6 -> 400ddb
     Inserting taken space 400dd6 -> 400ddb /w/ range 0
  Generated springboard branch 400dd1->b7fafe
  Conflict called for 400dd1->400dd6
     looking for 400dd1
       Found 400dd1 -> 400dd6 /w/ state 1e
     No conflict, we're good
  createRelocSpringboards for 400dd1
  Looking for addr b7fafe in function _init
  getRelocAddrs for orig addr 400dd1 /w/ block start 400dd1
  getRelocAddrs for orig addr 400dd1 /w/ block start 400dd1
  Adding branch: 400dd1 -> 400dd6
     Inserting taken space 400dd1 -> 400dd6 /w/ range 0
  Installing 15980 springboards!
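A minimal sketch for summarizing output like the snippet above, assuming the "Generated springboard branch" and "Installing N springboards!" lines always have exactly this shape and that the addresses are hexadecimal without a 0x prefix. Both are guesses from this one excerpt, not a documented log format, and the function name is hypothetical.

```python
import re

# Patterns guessed from the pasted excerpt; the real format may vary.
BRANCH_RE = re.compile(r"Generated springboard branch ([0-9a-f]+)->([0-9a-f]+)")
INSTALL_RE = re.compile(r"Installing (\d+) springboards!")


def summarize_springboard_log(text):
    """Collect springboard branches and the installed-springboard total."""
    branches = [
        (int(src, 16), int(dst, 16))  # addresses assumed to be hex
        for src, dst in BRANCH_RE.findall(text)
    ]
    installed = sum(int(n) for n in INSTALL_RE.findall(text))
    return {"branches": branches, "installed": installed}
```

Run over the full log, this gives the number of springboard branches generated and the installed total in one pass.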




On Mon, Jul 21, 2014 at 3:41 PM, Buddhika Chamith Kahawitage Don
<budkahaw@xxxxxxxxxxxx> wrote:

  Please find the output in attached file.

  Regards
  Bud


  On Mon, Jul 21, 2014 at 3:13 PM, Bill Williams <bill@xxxxxxxxxxx>
  wrote:

    On 07/21/2014 01:59 PM, Buddhika Chamith Kahawitage Don wrote:

      Please find my responses inline.

      On Mon, Jul 21, 2014 at 1:48 PM, Bill Williams
      <bill@xxxxxxxxxxx> wrote:

        On 07/21/2014 11:52 AM, Matthew LeGendre wrote:


          Presumably you're running the CodeCoverage tool in two steps:
          1) Rewriting the binary 2) Running the rewritten binary. All of
          the analysis/rewriting overheads are in step 1, and the
          instrumentation overhead can be measured just by timing step 2.


      That's true.
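A minimal sketch of timing only step 2, assuming both the original and the rewritten binary can be run non-interactively with the same arguments. The helper names are hypothetical, and best-of-N wall-clock time is just one reasonable way to measure.

```python
import subprocess
import time


def time_command(argv, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard program output so terminal I/O does not skew the timing.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def overhead_ratio(original_argv, rewritten_argv, repeats=3):
    """Instrumented run time divided by uninstrumented run time."""
    base = time_command(original_argv, repeats)
    instrumented = time_command(rewritten_argv, repeats)
    return instrumented / base
```

A ratio near 1.1 would correspond to the 10% figure mentioned in this thread, and a ratio near 50 to the 50x case.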


          If you're getting 50x overhead on just step 2 then something's
          very wrong. I've got my own codeCoverage tool (which I
          unfortunately can't share yet) and I only see 10% overhead.

        Hrm. If this is with a prebuilt, statically linked binary and not
        with a build from source against current Dyninst, we may also be
        hitting traps in an inner loop. That's more the right order of
        magnitude than trampguards would be--trampguards would be in the
        1.5-5x sort of neighborhood off the top of my head.


      In fact that was the use case I had in mind. But I was just
      checking the static rewriting case first since it was readily
      available with the code-coverage tool.


    Sorry, I meant a statically linked version of the CodeCoverage
    tool; apologies for the confusion.


        The source for CodeCoverage (which you can build against the
        latest Dyninst and be reasonably sure of *not* hitting traps in
        almost all of SPEC) is in our tools.git repository. I know we've
        fixed some performance regressions that turned up between the
        AWAT paper and 8.1.2.


      I am using dyninst 8.1.2 which I built from source.

    Then yes, it's probably trap overhead, and 8.2 should fix it--I
    believe h264 was on the list of benchmarks that had a
    performance regression that we've fixed for the current release.

    If you set DYNINST_DEBUG_SPRINGBOARD=1 in your environment and
    send me the output of the rewriting pass with that enabled, I'll
    be able to confirm the cause (and status) of this problem.
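One way to capture that output to a file, sketched in Python. The environment variable name comes from the message above; the rewriter command line is whatever you normally run for the rewriting pass, and capturing both stdout and stderr is an assumption, since the thread does not say which stream the debug output uses.

```python
import os
import subprocess


def run_with_springboard_debug(argv, log_path):
    """Run `argv` with DYNINST_DEBUG_SPRINGBOARD=1 and save its output."""
    env = dict(os.environ, DYNINST_DEBUG_SPRINGBOARD="1")
    with open(log_path, "w") as log:
        # Merge stderr into stdout so the debug output is caught either way.
        proc = subprocess.run(argv, env=env,
                              stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode
```

The resulting log file can then be attached to a reply or searched locally.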



          Just an educated guess--I frequently see big overheads
          associated with trampoline guards. Dyninst should have realized
          trampoline guards are unnecessary for codeCoverage and not
          emitted them. But if something went wrong you can manually turn
          them off by putting a call to:

            bpatch.setTrampRecursive(true);



      Tried it without any success :(


          Near the top of codeCoverage.C's main() function. If that makes
          a difference then let the list know. That implies there's a bug
          that should be investigated.


      Any ideas on how to debug this?

      Thanks
      Bud



    --
    --bw

    Bill Williams
    Paradyn Project
    bill@xxxxxxxxxxx





--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx