[DynInst_API:] [dyninst/dyninst] 06bced: parseAPI: make speculative getGapRange O(log F) vi...


Date: Thu, 11 Jun 2026 18:55:03 -0700
From: bbiiggppiigg <noreply@xxxxxxxxxx>
Subject: [DynInst_API:] [dyninst/dyninst] 06bced: parseAPI: make speculative getGapRange O(log F) vi...
  Branch: refs/heads/bbiiggppiigg/faster-getgaprange
  Home:   https://github.com/dyninst/dyninst
  Commit: 06bced4b6ac142c3da36e3d4340252674c27213e
      https://github.com/dyninst/dyninst/commit/06bced4b6ac142c3da36e3d4340252674c27213e
  Author: wuxx1279 <bbiiggppiigg@xxxxxxxxx>
  Date:   2026-06-11 (Thu, 11 Jun 2026)

  Changed paths:
    M parseAPI/src/Parser-speculative.C

  Log Message:
  -----------
  parseAPI: make speculative getGapRange O(log F) via funcsByRange

Parser::getGapRange rebuilt a sorted set of *every* function's extents on
every call to locate the gap around curAddr. It is called once per gap
discovered during speculative parsing, so the speculative pass was
~O(gaps x functions). On a large, symbol-rich binary (e.g. a ~400 MB shared
library with hundreds of thousands of functions) this dominated rewrite
time -- the per-gap rebuild ran for hours while the parallel
recursive-descent parse that precedes it finished in seconds.

Query region_data::funcsByRange (an IBSTree_fast<FuncExtent> that already
indexes every function's extents by address) instead:
  - gapStart: stab curAddr (find) and advance past any covering extent;
  - gapEnd:   successor(curAddr) gives the next extent start (mirrors the
              old upper_bound logic and the blocksByRange.successor idiom in
              region_data::get_next_block).
This is O(log F) per call.

funcsByRange is filled only by finalize_ranges() (draining funcs_to_ranges),
which the parser calls lazily from findFuncs/findBlocks. finalize() fills
funcs_to_ranges but does not flush it, so at the start of the gap loop the
tree can be empty; and each gap-discovered function (parse_at downgrades
_parse_state, re-running finalize()) adds to funcs_to_ranges without
flushing. The old code read sorted_funcs, which parse_at keeps current, so
to preserve that behavior getGapRange now flushes any pending finalized
functions into the tree before querying it. Without the flush every
already-parsed function would be invisible (treated as one giant gap and
re-scanned byte-by-byte), and just-discovered gap functions could spawn
overlapping functions inside themselves.

No functional change to which gaps are scanned; behavior matches the prior
sorted_funcs-based implementation, but the speculative pass on a 256 MB
.text region drops from hours to ~1 second.

Co-Authored-By: Claude Opus 4.8 <noreply@xxxxxxxxxxxxx>



To unsubscribe from these emails, change your notification settings at https://github.com/dyninst/dyninst/settings/notifications
[← Prev in Thread] Current Thread [Next in Thread→]
  • [DynInst_API:] [dyninst/dyninst] 06bced: parseAPI: make speculative getGapRange O(log F) vi..., bbiiggppiigg <=