Using Daniel's gfs_model.x binary, I confirmed
- (bad) that hpcstruct in HPCToolkit version 2024.01.1, based on Dyninst 13.0.0, fails with this binary
- (good) that the Dyninst problem with analyzing DWARF subrange information from Fortran applications has been fixed in Dyninst master.
Unfortunately, Dyninst master is not usable with the HPCToolkit 2024.01.1 release. However, the updated version of Dyninst is usable with HPCToolkit's develop branch. Unfortunately, the spack recipe for deploying our develop branch seems to be missing a few library paths that don't get baked in by spack. I will report back to this list when we have fixed HPCToolkit's spack recipe so you can use our develop branch.
Best,
John

--
John Mellor-Crummey
Professor
Dept of Computer Science
Rice University
email: johnmc@xxxxxxxx
phone: 713-348-5179

On May 12, 2025, at 10:26 AM, Daniel Kokron - NOAA Affiliate <daniel.kokron@xxxxxxxx> wrote:
Ahhhh, that explains the following and how to get around it. Thank you.
WARNING: Skipping DWARF for gfs_model.x, over threshold (377978416 > 104857600)

On Mon, May 12, 2025 at 10:13 AM John Mellor-Crummey <johnmc@xxxxxxxx> wrote:

Daniel,
One more thing:
While we work on resolving the issue with hpcstruct, you should be able to run hpcprof on your measurement data even if hpcstruct failed to analyze this binary. hpcprof includes the ability to read DWARF (using a different library that shouldn't crash).
When you run hpcprof, you should use
hpcprof --dwarf-max-size=unlimited <measurement directory>
Best,
John
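
The exchange above can be turned into a small helper. This is only a sketch: the threshold value 104857600 is copied from the warning quoted in this thread, and treating it as the limit in effect for your hpcprof build is an assumption. The function names are hypothetical. The helper only prints the hpcprof command line; it does not run hpcprof.

```shell
#!/bin/sh
# Threshold copied from the warning quoted in this thread (100 MiB).
# Assumption: this is the limit in effect for the hpcprof build in use.
DWARF_THRESHOLD=104857600

# exceeds_dwarf_threshold SIZE_IN_BYTES
# Succeeds (exit status 0) when SIZE_IN_BYTES is above the threshold.
exceeds_dwarf_threshold() {
    [ "$1" -gt "$DWARF_THRESHOLD" ]
}

# hpcprof_command SIZE_IN_BYTES MEASUREMENT_DIR
# Prints the hpcprof invocation to use; it does not execute anything.
hpcprof_command() {
    if exceeds_dwarf_threshold "$1"; then
        echo "hpcprof --dwarf-max-size=unlimited $2"
    else
        echo "hpcprof $2"
    fi
}
```

A possible use is `hpcprof_command "$(wc -c < gfs_model.x)" <measurement directory>`, which for the 377978416-byte binary discussed here prints the `--dwarf-max-size=unlimited` form.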
Got permission to share the executable. Link sent. I'll ask about providing the executable.

On Fri, May 9, 2025 at 1:52 PM John Mellor-Crummey <johnmc@xxxxxxxx> wrote:

Hi Daniel,
Thanks for the callstack.
The problem seems to be exactly the same one recently encountered by Doug Pase for a Fortran program at Sandia. This is a problem inside the type processing by the Dyninst software written by our collaborators.
Can you share a binary with us to facilitate debugging? The Sandia binary is export controlled and only accessible inside their firewall. Having a non-export controlled binary for debugging would make our lives easier.
Best,
John
The application is compiled with Intel ifort. HPCToolkit and its dependencies are compiled with gcc-13.2.1. I attached the spec for HPCToolkit.
(gdb) run --nocache /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
Starting program: /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/hpctoolkit-2024.01.1-a3im66mlumyu3hbzmeuor3kj3l553yau/bin/hpcstruct --nocache /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
Missing separate debuginfos, use: zypper install glibc-debuginfo-2.31-150300.63.1.x86_64
Missing separate debuginfo for /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libstdc++.so.6
Try: zypper install -C "debuginfo(build-id)=c74eca671e2dd0f063706372d103f8acef88f1e3"
Missing separate debuginfo for /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgomp.so.1
Try: zypper install -C "debuginfo(build-id)=54684492738e640bcd600e830cee025dd8771a20"
Missing separate debuginfo for /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgcc_s.so.1
Try: zypper install -C "debuginfo(build-id)=12f775ec4aeb94b749897b1b65638f18b61d1b1f"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
begin sequential analysis of CPU binary /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x (size = 377978672, threads = 1)
hpcstruct: /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/boost-1.87.0-2cldxfpwec5rbbhxutja5lwcgzh6fbhc/include/boost/smart_ptr/shared_ptr.hpp:550: typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = Dyninst::SymtabAPI::typeSubrange; typename boost::detail::sp_member_access<T>::type = Dyninst::SymtabAPI::typeSubrange*]: Assertion `px != 0' failed.

Program received signal SIGABRT, Aborted.
0x0000155553e2fd2b in raise () from /lib64/libc.so.6
(gdb) where
#0 0x0000155553e2fd2b in raise () from /lib64/libc.so.6
#1 0x0000155553e313e5 in abort () from /lib64/libc.so.6
#2 0x0000155553e27c6a in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000155553e27cf2 in __assert_fail () from /lib64/libc.so.6
#4 0x0000155554d65127 in boost::enable_if<boost::integral_constant<bool, !((bool)boost::is_same<Dyninst::SymtabAPI::Type, Dyninst::SymtabAPI::typeSubrange>::value)>, boost::shared_ptr<Dyninst::SymtabAPI::Type> >::type Dyninst::SymtabAPI::typeCollection::addOrUpdateType<Dyninst::SymtabAPI::typeSubrange>(boost::shared_ptr<Dyninst::SymtabAPI::typeSubrange>) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#5 0x0000155554d547e6 in Dyninst::SymtabAPI::DwarfWalker::parseSubrange() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#6 0x0000155554d5a0a8 in Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#7 0x0000155554d5b235 in Dyninst::SymtabAPI::DwarfWalker::findAnyType(Dwarf_Attribute, bool, boost::shared_ptr<Dyninst::SymtabAPI::Type>&) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#8 0x0000155554d5b732 in Dyninst::SymtabAPI::DwarfWalker::findType(boost::shared_ptr<Dyninst::SymtabAPI::Type>&, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#9 0x0000155554d5497b in Dyninst::SymtabAPI::DwarfWalker::parseArray() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#10 0x0000155554d59fb8 in Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#11 0x0000155554d5a53b in Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#12 0x0000155554d5bab0 in Dyninst::SymtabAPI::DwarfWalker::parseModule(Dwarf_Die, Dyninst::SymtabAPI::Module*&) [clone .constprop.0] [clone .isra.0] () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#13 0x0000155554d5c15c in Dyninst::SymtabAPI::DwarfWalker::parse() [clone ._omp_fn.0] () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#14 0x000015555403b306 in GOMP_parallel () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgomp.so.1
#15 0x0000155554d5d2ed in Dyninst::SymtabAPI::DwarfWalker::parse() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#16 0x0000155554d0a0c1 in Dyninst::SymtabAPI::Object::parseTypeInfo() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#17 0x0000155554cd48a7 in Dyninst::SymtabAPI::Symtab::parseTypes() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#18 0x0000155553fef5d7 in __pthread_once_slow () from /lib64/libpthread.so.0
#19 0x0000155554ccc8b4 in Dyninst::SymtabAPI::Symtab::parseTypesNow() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
#20 0x00000000004418c8 in Inline::openSymtab (elfFile=elfFile@entry=0x8d94b0) at Struct-Inline.cpp:132
#21 0x000000000043cb31 in BAnal::Struct::makeStructure (filename=..., outFile=outFile@entry=0x830ab0, gapsFile=gapsFile@entry=0x0, gaps_filenm=..., search_path=..., structOpts=...) at Struct.cpp:770
#22 0x000000000042cb42 in doSingleBinary (args=..., sb=sb@entry=0x7ffffffd8740) at /usr/include/c++/13/bits/basic_string.tcc:238
#23 0x0000000000412cfd in realmain (argc=<optimized out>, argv=<optimized out>) at main.cpp:209
#24 0x000000000041220a in main (argc=<optimized out>, argv=<optimized out>) at main.cpp:137

On Fri, May 9, 2025 at 11:16 AM John Mellor-Crummey <johnmc@xxxxxxxx> wrote:

Hi Daniel,
You should be able to run hpcstruct under gdb and then run it directly on the offending binary as follows:
gdb `which hpcstruct`
run --nocache /path/to/gfs_model
Then, you can send us a call path. By any chance is this a Fortran code compiled with gfortran? We are presently looking into a complaint about that from Sandia.
Best,
John
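
For anyone who prefers to capture the call path without an interactive session, the same two steps can be sketched as one non-interactive gdb invocation. This is not part of the advice above, only a possible variant: it relies on gdb's standard `-batch`, `-ex`, and `--args` options, and the function name `batch_backtrace` and the `DRY_RUN` variable are hypothetical.

```shell
#!/bin/sh
# batch_backtrace PROGRAM [ARGS...]
# Runs PROGRAM under gdb non-interactively; when the program stops
# (for example on SIGABRT), gdb executes "bt" to print the call path.
# With DRY_RUN=1 the gdb command line is printed instead of executed.
batch_backtrace() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo gdb -batch -ex run -ex bt --args "$@"
    else
        gdb -batch -ex run -ex bt --args "$@"
    fi
}
```

For example, `batch_backtrace "$(command -v hpcstruct)" --nocache /path/to/gfs_model > callstack.txt 2>&1` would leave the call path in a file (here the hypothetical `callstack.txt`) that can be attached to a reply.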
I am encountering the following error while running hpcstruct. I cannot find the core file in any of the usual places. I have also tried running hpcstruct under gdb without getting very far.
Wondering what my debugging options are?
begin concurrent analysis of CPU binary gfs_model. (size = 377978416, threads = 1)
/bin/sh: line 32: 63480 Aborted (core dumped) /spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/hpctoolkit-2024.01.1-a3im66mlumyu3hbzmeuor3kj3l553yau/bin/hpcstruct. --nocache -j 1 -o $struct_name -M $meas_dir /Baseline_6Hr_WithWW3Restarts_Trace.16774.rawdata/cpubins/model.x > $warn_name 2>&1
Dan
_______________________________________________
HPCToolkit-forum mailing list
HPCToolkit-forum@xxxxxxxxxxxxxxxx
https://mailman.rice.edu/mailman/listinfo/hpctoolkit-forum