Re: [DynInst_API:] DynInst Overhead


Date: Mon, 21 Jul 2014 15:01:02 -0500
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] DynInst Overhead
I just cleared the message with the full log to the list, but yes, there are some traps being installed (grep -C2 "ret conflict" in the log to see where they're going; there are a nontrivial number of them). The only SPEC benchmark that we can instrument cleanly (so not omnetpp or povray) that still suffers from serious trap overhead on the 8.2 branch, AFAIK, is gcc--and that's on the order of 50%, not 50x.

On 07/21/2014 02:55 PM, Buddhika Chamith Kahawitage Don wrote:
My earlier mail is being held for moderator approval. Anyway, let me
just paste a small snippet from the output here; hope that's enough.


    createRelocSpringboards for 400dd6
    Looking for addr b7fb96 in function _init
    getRelocAddrs for orig addr 400dd6 /w/ block start 400dd6
    getRelocAddrs for orig addr 400dd6 /w/ block start 400dd6
    Adding branch: 400dd6 -> 400ddb
          Inserting taken space 400dd6 -> 400ddb /w/ range 0
    Generated springboard branch 400dd1->b7fafe
    Conflict called for 400dd1->400dd6
          looking for 400dd1
              Found 400dd1 -> 400dd6 /w/ state 1e
          No conflict, we're good
    createRelocSpringboards for 400dd1
    Looking for addr b7fafe in function _init
    getRelocAddrs for orig addr 400dd1 /w/ block start 400dd1
    getRelocAddrs for orig addr 400dd1 /w/ block start 400dd1
    Adding branch: 400dd1 -> 400dd6
          Inserting taken space 400dd1 -> 400dd6 /w/ range 0
    Installing 15980 springboards!




On Mon, Jul 21, 2014 at 3:41 PM, Buddhika Chamith Kahawitage Don
<budkahaw@xxxxxxxxxxxx> wrote:

    Please find the output in attached file.

    Regards
    Bud


    On Mon, Jul 21, 2014 at 3:13 PM, Bill Williams <bill@xxxxxxxxxxx> wrote:

        On 07/21/2014 01:59 PM, Buddhika Chamith Kahawitage Don wrote:

            Please find my responses inline.

            On Mon, Jul 21, 2014 at 1:48 PM, Bill Williams
            <bill@xxxxxxxxxxx> wrote:

                 On 07/21/2014 11:52 AM, Matthew LeGendre wrote:


                     Presumably you're running the CodeCoverage tool in two
                     steps: 1) rewriting the binary, 2) running the rewritten
                     binary. All of the analysis/rewriting overheads are in
                     step 1, and the instrumentation overhead can be measured
                     just by timing step 2.


            That's true.
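
            (For context, step 1 here is the stock codeCoverage rewriter.
            Conceptually it boils down to a Dyninst binary-rewriting pass
            along these lines -- just a rough sketch with placeholder file
            names, not the actual codeCoverage.C code:

                #include <vector>
                #include "BPatch.h"
                #include "BPatch_binaryEdit.h"
                #include "BPatch_image.h"
                #include "BPatch_function.h"
                #include "BPatch_point.h"
                #include "BPatch_snippet.h"

                BPatch bpatch;

                int main() {
                    // Step 1: open the on-disk binary for rewriting
                    // (no process is launched here).
                    BPatch_binaryEdit *app = bpatch.openBinary("a.out", true);
                    BPatch_image *image = app->getImage();

                    // Mark a per-function flag at every function entry.
                    std::vector<BPatch_function *> *funcs =
                        image->getProcedures();
                    for (BPatch_function *f : *funcs) {
                        std::vector<BPatch_point *> *entries =
                            f->findPoint(BPatch_entry);
                        if (!entries || entries->empty()) continue;
                        BPatch_variableExpr *hit =
                            app->malloc(*image->findType("int"));
                        BPatch_arithExpr mark(BPatch_assign, *hit,
                                              BPatch_constExpr(1));
                        app->insertSnippet(mark, *entries);
                    }

                    // Write out the instrumented binary; step 2 is simply
                    // running and timing it.
                    app->writeFile("a.out.cov");
                    return 0;
                }

            Step 2 is then just timing the rewritten a.out.cov on its own.)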


                     If you're getting 50x overhead on just step 2 then
                     something's very wrong. I've got my own codeCoverage tool
                     (which I unfortunately can't share yet) and I only see
                     10% overhead.

                 Hrm. If this is with a prebuilt, statically linked binary and
                 not with a build from source against current Dyninst, we may
                 also be hitting traps in an inner loop. That's more the right
                 order of magnitude than trampguards would be--trampguards
                 would be in the 1.5-5x sort of neighborhood, off the top of
                 my head.


            In fact that was the use case I had in mind, but I was just
            checking the static rewriting case first since it was readily
            available with the CodeCoverage tool.


        Sorry, I meant a statically linked version of the CodeCoverage
        tool; apologies for the confusion.


                 The source for CodeCoverage (which you can build against the
                 latest Dyninst and be reasonably sure of *not* hitting traps
                 in almost all of SPEC) is in our tools.git repository. I know
                 we've fixed some performance regressions that turned up
                 between the AWAT paper and 8.1.2.


            I am using Dyninst 8.1.2, which I built from source.

        Then yes, it's probably trap overhead, and 8.2 should fix it--I
        believe h264 was on the list of benchmarks that had a
        performance regression that we've fixed for the current release.

        If you set DYNINST_DEBUG_SPRINGBOARD=1 in your environment and
        send me the output of the rewriting pass with that enabled, I'll
        be able to confirm the cause (and status) of this problem.



                     Just an educated guess--I frequently see big overheads
                     associated with trampoline guards. Dyninst should have
                     realized trampoline guards are unnecessary for
                     codeCoverage and not emitted them. But if something went
                     wrong, you can manually turn them off by putting a call
                     to:

                         bpatch.setTrampRecursive(true);



            Tried it without any success :(


                     Near the top of codeCoverage.C's main() function. If that
                     makes a difference then let the list know. That implies
                     there's a bug that should be investigated.


            Any ideas on how to debug this?
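
            (For reference, this is roughly where I added the call -- a
            trimmed-down sketch, not the full codeCoverage.C main():

                #include "BPatch.h"

                BPatch bpatch;

                int main(int argc, char *argv[]) {
                    // Per the suggestion above: coverage snippets don't
                    // recurse, so trampoline guards should be unnecessary.
                    bpatch.setTrampRecursive(true);

                    // ... existing codeCoverage argument handling and
                    // rewriting follow here, unchanged ...
                    return 0;
                }

            It made no noticeable difference to the rewritten binary's
            runtime.)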

            Thanks
            Bud



        --
        --bw

        Bill Williams
        Paradyn Project
        bill@xxxxxxxxxxx





--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx