[DynInst_API:] segfault in symtab::openFile and ELF wrapper


Date: Mon, 07 Nov 2016 15:05:39 -0600
From: "Mark W. Krentel" <krentel@xxxxxxxx>
Subject: [DynInst_API:] segfault in symtab::openFile and ELF wrapper
I'm seeing a very bad bug in recent dyninst master.  "Bad" means that
I get a segfault in Symtab::openFile().

For me, all I need in a test program is basically:

   Symtab * the_symtab = NULL;
   Symtab::openFile(the_symtab, argv[1]);

I get a segfault and stack trace like this:

#0 0x00007f8988bbf683 in Dyninst::Elf_X_Data::d_buf (this=this@entry=0x7fffde3bd530) at /home/krentel/struct/BUILD-a79ba00b9/symtabAPI/dyninst/elf/src/Elf_X.C:832

#1 0x00007f898907b879 in Dyninst::SymtabAPI::Object::loaded_elf (this=this@entry=0x1c81130, txtaddr=txtaddr@entry=@0x7fffde3bd740: 0, dataddr=dataddr@entry=@0x7fffde3bd748: 0, bssscnp=@0x7fffde3bd708: 0x0, symscnp=@0x7fffde3bd710: 0x0, strscnp=@0x7fffde3bd718: 0x0, stabscnp=@0x7fffde3bd720: 0x0, stabstrscnp=@0x7fffde3bd728: 0x0, stabs_indxcnp=@0x7fffde3bd730: 0x0, stabstrs_indxcnp=@0x7fffde3bd738: 0x0, rel_plt_scnp=@0x7fffde3bd750: 0x0, plt_scnp=@0x7fffde3bd758: 0x0, got_scnp=@0x7fffde3bd760: 0x0, dynsym_scnp=@0x7fffde3bd768: 0x0, dynstr_scnp=@0x7fffde3bd770: 0x0, dynamic_scnp=@0x7fffde3bd778: 0x0, eh_frame=@0x7fffde3bd780: 0x0, gcc_except=@0x7fffde3bd788: 0x0, interp_scnp=@0x7fffde3bd790: 0x0, opd_scnp=@0x7fffde3bd798: 0x0) at /home/krentel/struct/BUILD-a79ba00b9/symtabAPI/dyninst/symtabAPI/src/Object-elf.C:655

#2 0x00007f8989080993 in Dyninst::SymtabAPI::Object::load_object (this=this@entry=0x1c81130, alloc_syms=alloc_syms@entry=true) at /home/krentel/struct/BUILD-a79ba00b9/symtabAPI/dyninst/symtabAPI/src/Object-elf.C:1525

#3 0x00007f8989081cec in Dyninst::SymtabAPI::Object::Object (this=0x1c81130, mf_=0x1c810e0, err_func=<optimized out>,
    alloc_syms=<optimized out>, st=<optimized out>)
at /home/krentel/struct/BUILD-a79ba00b9/symtabAPI/dyninst/symtabAPI/src/Object-elf.C:2928

#4 0x00007f898903d1c2 in Dyninst::SymtabAPI::Symtab::Symtab (this=0x1c7fec0, filename="", defensive_bin=<optimized out>, err=@0x7fffde3bd95e: false) at /home/krentel/struct/BUILD-a79ba00b9/symtabAPI/dyninst/symtabAPI/src/Symtab.C:1270

#5 0x00007f898903d667 in Dyninst::SymtabAPI::Symtab::openFile (obj=@0x6021e0: 0x0, filename="parse",
    def_binary=def_binary@entry=Dyninst::SymtabAPI::Symtab::NotDefensive)
at /home/krentel/struct/BUILD-a79ba00b9/symtabAPI/dyninst/symtabAPI/src/Symtab.C:2102

#6 0x0000000000400e25 in main (argc=<optimized out>, argv=<optimized out>) at parse.cpp:143


For me, this happens on pretty much any binary.  Run it on itself, run
it on /bin/ls, anything.

One possible theory is that since it fails inside the Dyninst ELF
wrapper, maybe we're using two different versions of libelf.  I'm
still using the old libelf-0.8.13.  But if you've moved on to the
newer elfutils and never test on libelf, that could be the issue.

I looked back through git log --graph.  This is what I found.

It segfaults in the current dyninst master (where the stack trace
is from).

commit a79ba00b9cd3f9b690bd73100c196ad201971a7f
Author: Peter Foley <pefoley2@xxxxxxxxxxx>
Date:   Sun Nov 6 11:35:19 2016 -0500

    fix AddressRange forward declarations

Following down various forks in the graph of ancestors, it also
segfaults in:

commit 74d5bea71e17116559870653620de7cb3775b0c3
Author: Bill Williams <bill@xxxxxxxxxxx>
Date:   Wed Oct 19 15:10:23 2016 -0500

    Code cleanup: we don't need to handle the cases where there's no
    module range info.

Earlier than that, the build fails for several revs.  The earliest
rev where the build fails is:

commit 14f24436129ec7daf2e86dc006c808bdf1fb33cf
Author: Bill Williams <bill@xxxxxxxxxxx>
Date:   Fri Jul 29 14:31:24 2016 -0500

    Cache module DIEs and build ranges as interval trees.

And in the next earlier rev, it works, where "works" means that
Symtab::openFile() doesn't segfault.

commit c52f9e410d040170e9e711ccacad7e61b6975a03
Author: Bill Williams <bill@xxxxxxxxxxx>
Date:   Wed Jul 13 13:44:45 2016 -0500

    Warning cleanup.


So, I'm unable to find a specific rev where it goes from working to
broken because this is covered up by the build failures.
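
To illustrate why the build failures hide the culprit, here is a minimal
sketch in plain C++ (standard library only, no Dyninst).  The names Rev,
SuspectRange and suspect_range are hypothetical, made up for this
illustration.  It assumes a linear history ordered oldest first and a
single transition from working to segfaulting; the real history has
forks, so this only shows the idea.

```cpp
#include <cstddef>
#include <vector>

// Outcome of testing one revision (hypothetical labels for this sketch).
enum class Rev { Works, Segfaults, BuildFails };

// Result of the search: the regression lies in (last_good, first_bad].
struct SuspectRange {
    bool found;
    std::size_t last_good;   // index of the newest rev known to work
    std::size_t first_bad;   // index of the oldest later rev known to segfault

    // Number of revs that could have introduced the regression.
    std::size_t candidates() const { return found ? first_bad - last_good : 0; }
};

// Scan a linear history (oldest first).  Revs that fail to build give no
// information, so they only widen the gap between last_good and first_bad.
SuspectRange suspect_range(const std::vector<Rev>& history) {
    SuspectRange r = {false, 0, 0};
    bool have_good = false;
    for (std::size_t i = 0; i < history.size(); ++i) {
        if (history[i] == Rev::Works) {
            have_good = true;
            r.last_good = i;
        } else if (history[i] == Rev::Segfaults && have_good) {
            r.found = true;
            r.first_bad = i;
            return r;
        }
        // Rev::BuildFails: cannot be classified, keep scanning.
    }
    return r;
}
```

With a history such as Works, BuildFails, BuildFails, Segfaults, the
function reports last_good = 0 and first_bad = 3, so three revs remain
as candidates instead of one.  As far as I know, git bisect skip ends up
in the same place: when the untestable revs sit next to the transition,
it can only report a set of possible first bad commits.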

Unfortunately, I find this often happens in git repositories.  You get
one piece of poison in some rev in some other repository that isn't
sufficiently tested, and that infects everything all the way into
where it's merged into master.  Blech.

Anyway, the first questions are: (1) do you see the same bug?
(2) is it from a difference in libelf?

If not, then it looks like a serious bug, although I'm hoping it's
small and easily fixed.

Thanks,

--Mark
