Re: [DynInst_API:] Not getting sources lines question: getSourceLines API


Date: Thu, 01 Oct 2015 14:21:14 -0500
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] Not getting sources lines question: getSourceLines API
On 10/01/2015 01:38 PM, Jim Galarowicz wrote:
Hi all,

I'm still unable to get source line information with Dyninst 9.0.x. I need it to find the head node of a loop so that we can give users the loop's starting line number.
The simple Dyninst interface for that (getLoopHead) was removed in 9.0.x.

I'm computing an address from the basic blocks in the loop and passing that to the module class getSourceLines function:
bool getSourceLines( unsigned long addr, std::vector<BPatch_statement> & lines )

Department of potentially dumb questions:

1) Have you already grabbed all of the pre-9.0.4 commits (which include at least one line info fix)?
2) If you move the line info query outside of the 9.0.x ifdef and try to get information for the 8.2 loop header, does this work (obviously, with Dyninst 8.2)? I.e., did we regress, or is this consistent behavior for this method and this binary?
3) Is O|SS getting line information elsewhere (call sites, function entry, whatever) successfully from Dyninst? Unsuccessfully?

Suggestions/observations:

1) For irreducible loops, does it make sense to report each potential entry point to the user rather than picking one to identify it?
2) If you must pick one entry point, do you reasonably expect that "lowest line number" is any more accurate for an irreducible loop than "lowest address"? (Which is to say they'll both be wrong a nontrivial amount of the time, I expect.)
3) Do you actually want to treat irreducible loops the same way that you treat single-entry loops? My intuition is that they'll tend not to be amenable to the same sorts of optimizations or represent the same sorts of source-level concepts, but that's not backed by anything quantitative.
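
One way to act on suggestion 1 is to keep every entry block's start address instead of collapsing the loop to a single head. The sketch below is only an illustration: LoopEntryReport and reportLoopEntries are hypothetical names, not Dyninst types, and the input is assumed to be the start addresses collected from the blocks that getLoopEntries() returns.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record a tool might keep per loop; not a Dyninst type.
struct LoopEntryReport {
    bool singleEntry;                       // true when exactly one distinct entry exists
    std::vector<unsigned long> entryAddrs;  // distinct entry start addresses, ascending
};

// Build a report from entry-block start addresses (assumed to have been
// gathered by the caller, e.g. via getStartAddress() on each entry block).
LoopEntryReport reportLoopEntries(std::vector<unsigned long> entryStartAddrs)
{
    // Sort and drop duplicates so each entry point is listed once.
    std::sort(entryStartAddrs.begin(), entryStartAddrs.end());
    entryStartAddrs.erase(
        std::unique(entryStartAddrs.begin(), entryStartAddrs.end()),
        entryStartAddrs.end());

    LoopEntryReport report;
    report.singleEntry = (entryStartAddrs.size() == 1);
    report.entryAddrs = entryStartAddrs;
    return report;
}
```

A tool could then show one line per entry for multi-entry loops, and treat the single address as the head only when singleEntry is true.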

--bw

I first tried passing the basic block's starting address as the address for the above call.
Then I tried subtracting the module base address from the starting address and passing the result.

Neither address produced any line info.
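
When it is unclear which address form a lookup expects, a small probe that tries both and records which one succeeded can narrow the problem down. This is a minimal sketch under stated assumptions: LineHit, LineLookup, and probeAddressForms are hypothetical stand-ins, with the callback wrapping whatever real query is in use (for example module->getSourceLines). It says nothing about which form Dyninst itself expects.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one result row; the real API fills BPatch_statement objects.
struct LineHit {
    std::string file;
    int line;
};

// Stand-in for a lookup such as module->getSourceLines(addr, lines):
// appends hits for addr and returns true when at least one was found.
typedef std::function<bool(unsigned long, std::vector<LineHit>&)> LineLookup;

enum AddressForm { FORM_NONE, FORM_ABSOLUTE, FORM_MODULE_RELATIVE };

// Query with the absolute address first, then with the module-relative
// offset, and report which form (if any) produced line information.
AddressForm probeAddressForms(const LineLookup& lookup,
                              unsigned long absoluteAddr,
                              unsigned long moduleBase,
                              std::vector<LineHit>& hits)
{
    hits.clear();
    if (lookup(absoluteAddr, hits) && !hits.empty())
        return FORM_ABSOLUTE;

    hits.clear();
    // Only subtract when it cannot underflow.
    if (absoluteAddr >= moduleBase &&
        lookup(absoluteAddr - moduleBase, hits) && !hits.empty())
        return FORM_MODULE_RELATIVE;

    hits.clear();
    return FORM_NONE;
}
```

If the probe returns FORM_NONE for every block, the address form is less likely to be the culprit than the line table being queried.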

So, the big picture is: 

   BPatch_Vector<BPatch_module*>* modules = image.getModules();

   // for each module in modules ...
   BPatch_Vector<BPatch_function*> functions;
   module->findFunctionByAddress(
       (void*)((module_base + address).getValue()), functions, false, false
       );

   // for each function in functions ...
   BPatch_flowGraph* cfg = function->getCFG();

   BPatch_Vector<BPatch_basicBlockLoop*> loops;
   cfg->getLoops(loops);

   // loop through all the loops
   for (unsigned int l = 0; l < loops.size(); ++l) {
      BPatch_basicBlockLoop* loop = loops[l];

      std::vector<BPatch_basicBlock*> entries;
      loop->getLoopEntries(entries);
      for (bbe = entries.begin(); bbe != entries.end(); ++bbe) {

         unsigned long module_base = (uint64_t)module->getBaseAddr();
         unsigned long bbstartAddr = (*bbe)->getStartAddress() - module_base;

         bool linesFound = module->getSourceLines(bbstartAddr, filesAndlines);

         /* never find any lines */
      }
   }


The full function is below. Am I using the correct API calls to get the line information? Does anyone have a suggestion?

Thanks,
Jim G

/** Get the loops containing the specified address. */
std::vector<LoopInfo> getLoopsAt(const Address& address, BPatch_image& image)
{
    std::vector<LoopInfo> retval;
   
    // Iterate over each module within the specified image
   
    BPatch_Vector<BPatch_module*>* modules = image.getModules();
   
    if (modules == NULL)
    {
        return retval;
    }
   
    for (unsigned int m = 0; m < modules->size(); ++m)
    {
        BPatch_module* module = (*modules)[m];
       
        if (module == NULL)
        {
            continue;
        }
       
        Address module_base = (uint64_t)module->getBaseAddr();

        // Find the function(s) containing the specified address

        BPatch_Vector<BPatch_function*> functions;
        module->findFunctionByAddress(
            (void*)((module_base + address).getValue()), functions, false, false
            );
       
        for (unsigned int f = 0; f < functions.size(); ++f)
        {
            BPatch_function* function = functions[f];
           
            if (function == NULL)
            {
                continue;
            }

            // Find the loops containing the specified address
           
            BPatch_flowGraph* cfg = function->getCFG();
           
            if (cfg == NULL)
            {
                continue;
            }
           
            BPatch_Vector<BPatch_basicBlockLoop*> loops;
            cfg->getLoops(loops);
           
            for (unsigned int l = 0; l < loops.size(); ++l)
            {
                BPatch_basicBlockLoop* loop = loops[l];
               
                if ((loop == NULL) || !loop->containsAddressInclusive(
                        (module_base + address).getValue()
                        ))
                {
                    continue;
                }
               
                // A loop containing this address has been found! Rejoice!
                // And, of course, obtain the loop's head address and basic
                // block address ranges...
               

                #if DyninstAPI_VERSION_MAJOR >= 9

                   // Need to use the new dyninst API for finding the head of the loop
                   // One possibility - from wdh - might be to call getLoopEntries() to
                   // get the basic block of each entry. Then, for each of these, take
                   // the first address of that basic block and query the source
                   // file/line containing that address. Assuming that all line numbers are within
                   // a single source file, the minimum line number is probably reasonably the
                   // loop definition. And the first address in that basic block would be the one
                   // to use for “addr_head” in the Open|SS database.

                   BPatch_basicBlock* head;
                   std::vector<BPatch_basicBlock*> entries;

                   loop->getLoopEntries(entries);

                   std::cerr << " entries.size()=" << entries.size() << std::endl;

                   // bbe: Loop through the basic block entries
                   std::vector<BPatch_basicBlock*>::iterator bbe;

                   // filesAndlines: Return value file names and line numbers from getSourceLines
                   std::vector<BPatch_statement > filesAndlines ;
                 
                   // Loop through the loop's entry basic blocks and get the starting address of each block.
                   // Then use that address to get the filename and line number for that address.
                   // We are looking for the minimum line number among the entry blocks to use
                   // as the loop head basic block.

                   // Give the loop head an initial value; guard against an empty entry list,
                   // since entries[0] on an empty vector is undefined behavior.
                   head = entries.empty() ? NULL : entries[0];
                   for (bbe = entries.begin(); bbe != entries.end(); ++bbe) {
                    
                     unsigned long module_base = (uint64_t)module->getBaseAddr();
                     unsigned long bbstartAddr =  (*bbe)->getStartAddress() - module_base;
                    
                     bool linesFound = module->getSourceLines( bbstartAddr, filesAndlines);

                     if (linesFound) {
                        std::vector<BPatch_statement>::iterator lf_dx;
                        for (lf_dx = filesAndlines.begin(); lf_dx != filesAndlines.end(); ++lf_dx) {
                            std::cerr << "fileName=" << (*lf_dx).fileName() << " lineNumber=" << (*lf_dx).lineNumber() << std::endl;
                        }
                        
                      } // linesFound
                   
                   } // entries
                #else
                   BPatch_basicBlock* head = loop->getLoopHead();
                #endif
               
                if (head == NULL)
                {
                    continue;
                }

                // Use the loop head basic block to create the necessary loop information to return
                LoopInfo info(Address(head->getStartAddress()) - module_base);

                BPatch_Vector<BPatch_basicBlock*> blocks;
                loop->getLoopBasicBlocks(blocks);
               
                for (unsigned int i = 0; i < blocks.size(); ++i)
                {
                    BPatch_basicBlock* block = blocks[i];

                    if (block != NULL)
                    {
                        info.dm_ranges.push_back(
                            AddressRange(
                                Address(block->getStartAddress()) - module_base,
                                Address(block->getEndAddress()) - module_base
                                )
                            );
                    }
                }
               
                retval.push_back(info);
               
            } // l
        } // f
    } // m

    return retval;
}
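
The selection step described in the comments (take the entry with the minimum line number) could be sketched separately from Dyninst, as below. This is an illustration under assumptions, not the Open|SS implementation: LineNumberLookup and pickHeadEntry are hypothetical names, and the callback stands in for a real query such as module->getSourceLines. It returns -1 for an empty entry list and uses a fresh result vector for each query, so hits from one block are not carried over to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Stand-in for a real line query: appends the line numbers mapped to addr
// and returns true when any were found. Hypothetical signature.
typedef std::function<bool(unsigned long, std::vector<int>&)> LineNumberLookup;

// Returns the index of the entry whose smallest line number is lowest.
// Entries without line info rank last; ties (including the case where no
// entry has line info) fall back to the lowest start address.
// Returns -1 when there are no entries.
int pickHeadEntry(const std::vector<unsigned long>& entryStartAddrs,
                  const LineNumberLookup& lookup)
{
    const int kNoLine = std::numeric_limits<int>::max();
    int best = -1;
    int bestLine = kNoLine;

    for (std::size_t i = 0; i < entryStartAddrs.size(); ++i) {
        std::vector<int> lines;  // fresh per query, so results never accumulate
        int minLine = kNoLine;
        if (lookup(entryStartAddrs[i], lines)) {
            for (std::size_t k = 0; k < lines.size(); ++k)
                if (lines[k] < minLine) minLine = lines[k];
        }

        bool better =
            best < 0 ||
            minLine < bestLine ||
            (minLine == bestLine &&
             entryStartAddrs[i] < entryStartAddrs[best]);
        if (better) {
            best = static_cast<int>(i);
            bestLine = minLine;
        }
    }
    return best;
}
```

For a loop with several entries this is still a heuristic: the lowest line number is not guaranteed to be the loop definition.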

On 09/11/2015 02:42 PM, Jim Galarowicz wrote:


Hi all,

I'm trying to use the new loop API to find the head basic block for loops in the applications we are doing performance analysis on.
The old API had a function for that:
BPatch_basicBlock* head = loop->getLoopHead();
but now, we need to figure it out on our own.

I must be doing something wrong because I never get any source lines returned from the
bool linesFound = module->getSourceLines( bbstartAddr, filesAndlines);
call below. The new code is inside the #if DyninstAPI_VERSION_MAJOR >= 9 block below.

Does anyone see anything obvious that I'm doing wrong? 

I checked for the presence of DWARF information with dwarfdump, and statement (line) information appears to be present.

This is the function from DyninstSymbols.cxx in our OpenSpeedShop code that I'm working on to get loop info:

/** Get the loops containing the specified address. */
std::vector<LoopInfo> getLoopsAt(const Address& address, BPatch_image& image)
{
    std::vector<LoopInfo> retval;
    
    // Iterate over each module within the specified image
    
    BPatch_Vector<BPatch_module*>* modules = image.getModules();
    
    if (modules == NULL)
    {
        return retval;
    }
    
    for (unsigned int m = 0; m < modules->size(); ++m)
    {
        BPatch_module* module = (*modules)[m];
        
        if (module == NULL)
        {
            continue;
        }

        
        Address module_base = (uint64_t)module->getBaseAddr();

        // Find the function(s) containing the specified address

        BPatch_Vector<BPatch_function*> functions;
        module->findFunctionByAddress(
            (void*)((module_base + address).getValue()), functions, false, false
            );
        
        for (unsigned int f = 0; f < functions.size(); ++f)
        {
            BPatch_function* function = functions[f];
            
            if (function == NULL)
            {
                continue;
            }

            // Find the loops containing the specified address
            
            BPatch_flowGraph* cfg = function->getCFG();
            
            if (cfg == NULL)
            {
                continue;
            }
            
            BPatch_Vector<BPatch_basicBlockLoop*> loops;
            cfg->getLoops(loops);
            
            for (unsigned int l = 0; l < loops.size(); ++l)
            {
                BPatch_basicBlockLoop* loop = loops[l];
                
                if ((loop == NULL) || !loop->containsAddressInclusive(
                        (module_base + address).getValue()
                        ))
                {
                    continue;
                }
                
                // A loop containing this address has been found! Rejoice!
                // And, of course, obtain the loop's head address and basic
                // block address ranges...

                

                #if DyninstAPI_VERSION_MAJOR >= 9

                   // Need to use the new dyninst API for finding the head of the loop
                   // One possibility - from wdh - might be to call getLoopEntries() to
                   // get the basic block of each entry. Then, for each of these, take
                   // the first address of that basic block and query the source
                   // file/line containing that address. Assuming that all line numbers are within
                   // a single source file, the minimum line number is probably reasonably the
                   // loop definition. And the first address in that basic block would be the one
                   // to use for “addr_head” in the Open|SS database.

                   BPatch_basicBlock* head;
                   std::vector<BPatch_basicBlock*> entries;
                   loop->getLoopEntries(entries);
                  
                   std::cerr << " entries.size()=" << entries.size() << std::endl;

                   // bbe: Loop through the basic block entries
                   std::vector<BPatch_basicBlock*>::iterator bbe;

                   std::cerr << "entries.begin() != entries.end()=" << (entries.begin() != entries.end()) << std::endl;

                   // filesAndlines: Return value file names and line numbers from getSourceLines
                   std::vector<BPatch_statement > filesAndlines ;
                  
                   // Loop through the loop's entry basic blocks and get the starting address of each block.
                   // Then use that address to get the filename and line number for that address.
                   // We are looking for the minimum line number among the entry blocks to use
                   // as the loop head basic block.

                   // Give the loop head an initial value; guard against an empty entry list,
                   // since entries[0] on an empty vector is undefined behavior.
                   head = entries.empty() ? NULL : entries[0];
                   for (bbe = entries.begin(); bbe != entries.end(); ++bbe) {
                     
                     
                     unsigned long bbstartAddr = (*bbe)->getStartAddress();

                     std::cerr << "module=" << module << " bbe=" << (*bbe) << " bbstartAddr=" << std::hex << bbstartAddr << std::endl;

                     bool linesFound = module->getSourceLines( bbstartAddr, filesAndlines);

                     std::cerr << "linesFound=" << linesFound << std::endl;
                     std::cerr << " filesAlines.size()=" << filesAndlines.size() << std::endl;

                     if (linesFound) {
                        std::cerr << " filesAlines.size()=" << filesAndlines.size() << std::endl;
                        std::vector<BPatch_statement>::iterator lf_dx;
                        for (lf_dx = filesAndlines.begin(); lf_dx != filesAndlines.end(); ++lf_dx) {
                            std::cerr << "fileName=" << (*lf_dx).fileName() << " lineNumber=" << (*lf_dx).lineNumber() << std::endl;
                        }
                         
                      } // linesFound
                    
                   } // entries
                #else
                   BPatch_basicBlock* head = loop->getLoopHead();
                #endif
                
                if (head == NULL)
                {
                    continue;
                }

                // Use the loop head basic block to create the necessary loop information to return
                LoopInfo info(Address(head->getStartAddress()) - module_base);

                BPatch_Vector<BPatch_basicBlock*> blocks;
                loop->getLoopBasicBlocks(blocks);
                
                for (unsigned int i = 0; i < blocks.size(); ++i)
                {
                    BPatch_basicBlock* block = blocks[i];

                    if (block != NULL)
                    {
                        info.dm_ranges.push_back(
                            AddressRange(
                                Address(block->getStartAddress()) - module_base,
                                Address(block->getEndAddress()) - module_base
                                )
                            );
                    }
                }
                
                retval.push_back(info);
                
            } // l
        } // f
    } // m

    return retval;
}

I never get any lines returned from getSourceLines even though I can see them in the dwarfdump output.
I attached the full dwarfdump output and the smg2000 run with the debug output (snippet below).

[openss]: Converting raw data from /opt/shared/offline-oss into temp file X.0.openss

Processing raw data for smg2000 ...
Processing processes and threads ...
Processing performance data ...
Processing symbols ...
Resolving symbols for /home/jeg/DEMOS/demos/mpi/openmpi-1.8.2/smg2000/test/smg2000
 entries.size()=1
entries.begin() != entries.end()=1
module=0x1bd6810 bbe=0x200f980 bbstartAddr=4020d3
linesFound=0
 filesAlines.size()=0
 entries.size()=1
entries.begin() != entries.end()=1
module=0x1bd6810 bbe=0x20111e0 bbstartAddr=40215f
linesFound=0
 filesAlines.size()=0
 entries.size()=1
entries.begin() != entries.end()=1
module=0x1bd6810 bbe=0x2012b70 bbstartAddr=4021f3
linesFound=0
 filesAlines.size()=0
 entries.size()=1
entries.begin() != entries.end()=1
module=0x1bd6810 bbe=0x20464a0 bbstartAddr=4036a9
linesFound=0
...
...






