[DynInst_API:] [dyninst/dyninst] 633f11: Preserve caller-saved GPRs clobbered by an inserte...


Date: Mon, 08 Jun 2026 17:04:03 -0700
From: bbiiggppiigg <noreply@xxxxxxxxxx>
Subject: [DynInst_API:] [dyninst/dyninst] 633f11: Preserve caller-saved GPRs clobbered by an inserte...
  Branch: refs/heads/bbiiggppiigg/handle-isra-x86
  Home:   https://github.com/dyninst/dyninst
  Commit: 633f1132e3b065235976095e4031e130adaf504d
      https://github.com/dyninst/dyninst/commit/633f1132e3b065235976095e4031e130adaf504d
  Author: wuxx1279 <bbiiggppiigg@xxxxxxxxx>
  Date:   2026-06-08 (Mon, 08 Jun 2026)

  Changed paths:
    M dyninstAPI/src/emit-x86.C

  Log Message:
  -----------
  Preserve caller-saved GPRs clobbered by an inserted instrumentation call

The base trampoline's register guard decided which registers to save using
intra-procedural liveness (shouldSaveReg in emit-x86.C). A caller-saved GPR
marked "dead" at the instrumentation point was skipped -- correct for a
standard-ABI function, since a caller never keeps a caller-saved scratch
register live across a call.

That assumption is wrong for GCC local clones (.isra/.constprop/...), which
co-allocate scratch registers across the call between a clone and its callers:
the caller legitimately keeps e.g. %r11 live across the call because the clone
promises not to touch it. That contract is invisible from the callee, so Dyninst
skipped saving %r11 -- and the inserted snippet's call (here a coverage reporter
that calls printf) then clobbered it, corrupting the caller's value.
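
As a rough illustration of how a clone can be recognized from its symbol name,
the sketch below checks for the two suffixes named above. It is a hypothetical
helper (looksLikeGccClone is not a Dyninst or SymtabAPI function), the suffix
list is deliberately incomplete, and it is not a description of how
Symbol::isClone is actually implemented.

```cpp
#include <array>
#include <string_view>

// Hypothetical helper, NOT the SymtabAPI implementation: reports whether a
// symbol name carries one of the GCC clone suffixes mentioned in this message.
bool looksLikeGccClone(std::string_view name) {
    // Illustrative, incomplete list; GCC emits other clone suffixes as well.
    static constexpr std::array<std::string_view, 2> kSuffixes = {
        ".isra.", ".constprop."};
    for (std::string_view suffix : kSuffixes) {
        // An Itanium-mangled C++ name contains no '.', so a match can only
        // come from a suffix appended after mangling.
        if (name.find(suffix) != std::string_view::npos) return true;
    }
    return false;
}
```

With this helper, a name such as foo.isra.0 is reported as a clone and a plain
foo is not.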

Observed on an instrumented PyTorch libtorch_python.so:
pybind11::detail::string_caster<std::string,false>::load keeps &local in %r11
across a call to std::string::operator=.isra.0; the instrumented operator='s
guard saved only {rsi,rdi}, printf clobbered %r11, and the following
_M_dispose dereferenced 0xffffffff -> SIGSEGV during `import torch`.

Fix: in shouldSaveReg, when the instrumented function is a clone
(SymtabAPI Symbol::isClone -- mangled name carries a GCC clone suffix), a
caller-saved GPR that the inserted snippet clobbers is saved even when
intra-procedural liveness marks it dead. The check is gated on isClone (and
checked first, as it is false for almost all functions) so ordinary functions
keep the liveness optimization -- they cannot have a caller holding a
caller-saved register live across the call. Callee-saved registers are
unaffected (the inserted call preserves them by ABI). No restore-side change is
needed: markSavedRegister() marks the register as spilled, the restore loop pops
every spilled register, and the num_to_save counting loop uses the same
predicate, so saves and pops stay balanced.
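
The decision described above can be sketched as follows. This is a simplified
model under stated assumptions, not the code in emit-x86.C: the Reg struct, its
fields, and the functions shouldSave, countToSave, and emitSaves are all
hypothetical stand-ins, and the pre-existing liveness rule is reduced to a
single flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a register descriptor; not a Dyninst type.
struct Reg {
    const char* name;
    bool calleeSaved;         // preserved across calls by the ABI
    bool liveAtPoint;         // intra-procedural liveness at the instrumentation point
    bool clobberedBySnippet;  // the inserted snippet (or a call it makes) may overwrite it
};

// Sketch of the save predicate. The clone gate is tested first because it is
// false for almost all functions, so ordinary functions fall straight through
// to the liveness-based rule (assumed here to be just "save if live").
bool shouldSave(const Reg& r, bool functionIsClone) {
    if (functionIsClone && !r.calleeSaved && r.clobberedBySnippet) return true;
    return r.liveAtPoint;
}

// Counting pass: uses the same predicate as the save pass below.
std::size_t countToSave(const std::vector<Reg>& regs, bool functionIsClone) {
    std::size_t n = 0;
    for (const Reg& r : regs) {
        if (shouldSave(r, functionIsClone)) ++n;
    }
    return n;
}

// Save pass: records which registers would be pushed. Because both passes
// share one predicate, the count and the pushes cannot drift apart.
std::vector<const char*> emitSaves(const std::vector<Reg>& regs, bool functionIsClone) {
    std::vector<const char*> pushed;
    for (const Reg& r : regs) {
        if (shouldSave(r, functionIsClone)) pushed.push_back(r.name);
    }
    assert(pushed.size() == countToSave(regs, functionIsClone));
    return pushed;
}
```

In this model, a dead caller-saved register that the snippet clobbers is skipped
for an ordinary function and saved for a clone, while a dead callee-saved
register is skipped in both cases.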

Verified on x86-64: the operator=.isra.0 guard now pushes the full caller-saved
set {rax,r10,r11,r8,r9,rcx,rdx,rsi,rdi}, and instrumented `import torch` exits 0
with results identical to the uninstrumented baseline. (Note: writing an
instrumented libtorch_python.so also requires the separate insertion-point /
program-header emit fix; the run above used a build that included it.)
The AArch64 and PPC base trampolines may need an analogous change.

Co-Authored-By: Claude Opus 4.8 <noreply@xxxxxxxxxxxxx>


