Re: [DynInst_API:] Removed createInstPointAtAddr


Date: Thu, 26 Apr 2012 14:51:52 -0700
From: Josh Stone <jistone@xxxxxxxxxx>
Subject: Re: [DynInst_API:] Removed createInstPointAtAddr
On 04/25/2012 07:24 PM, Andrew Bernat wrote:
> On Apr 25, 2012, at 9:09 PM, Josh Stone wrote:
> 
>> Sure - a slight background is that my test program is probing the
>> statically defined tracepoints (SDT) in libraries.  If you have a
>> current Fedora machine handy, try "readelf -n /lib64/libc-2.14.90.so" to
>> see an example of what's defined in .note.stapsdt.
> 
> Unfortunately, I don't have one of those. We're all some weird version
> of Redhat. I'll take your word for it, though :) 

Heh, depends on how weird, I suppose.  I do see .note.stapsdt sections
in RHEL5 and RHEL6 in a few libraries, like /lib64/ld-linux-x86-64.so.2,
but those versions of readelf can't parse it.

On Fedora, you get something like this:

> $ readelf -n /lib64/libc-2.14.90.so 
...
> Notes at offset 0x001b286c with length 0x00000294:
>   Owner                 Data size	Description
>   stapsdt              0x0000003a	NT_STAPSDT (SystemTap probe descriptors)
>     Provider: libc
>     Name: setjmp
>     Location: 0x0000000000036061, Base: 0x000000000017c940, Semaphore: 0x0000000000000000
>     Arguments: 8@%rdi -4@%esi 8@%rax
>   stapsdt              0x0000003b	NT_STAPSDT (SystemTap probe descriptors)
>     Provider: libc
>     Name: longjmp
>     Location: 0x0000000000036143, Base: 0x000000000017c940, Semaphore: 0x0000000000000000
>     Arguments: 8@%rdi -4@%esi 8@%rdx
>   stapsdt              0x00000042	NT_STAPSDT (SystemTap probe descriptors)
>     Provider: libc
>     Name: longjmp_target
>     Location: 0x000000000003615f, Base: 0x000000000017c940, Semaphore: 0x0000000000000000
>     Arguments: 8@%rdi -4@%eax 8@%rdx
...
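
For readers not used to the "Arguments:" lines above: each token is
size@operand, where the size is in bytes and a negative size marks a
signed value.  Here is a minimal sketch of splitting one up; the helper
name parse_sdt_args is hypothetical and not part of any tool, and it
assumes operands contain no spaces, which holds for the three probes
shown here.

```python
import re

def parse_sdt_args(spec):
    """Split an SDT argument string into (size_bytes, is_signed, operand) tuples."""
    args = []
    for token in spec.split():
        # Each token looks like "8@%rdi" or "-4@%esi".
        m = re.fullmatch(r"(-?\d+)@(\S+)", token)
        if m is None:
            raise ValueError(f"unrecognized SDT argument: {token!r}")
        size = int(m.group(1))
        args.append((abs(size), size < 0, m.group(2)))
    return args

# The longjmp probe from the readelf output above.
longjmp_args = parse_sdt_args("8@%rdi -4@%esi 8@%rdx")
print(longjmp_args)
```

So the longjmp probe exposes an unsigned 8-byte value in %rdi, a signed
4-byte value in %esi, and an unsigned 8-byte value in %rdx.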

>> Instrumenting most of these addresses under dyninst appears to work,
>> except for longjmp and longjmp_target.  I haven't done rigorous testing
>> on the rest, but they do fire.  But neither longjmp* works even in a
>> situation that I know should trigger it, which I can verify using stap's
>> normal kernel+uprobes instrumentation.
> 
> Which is to say that you execute longjmp, but never see the
> instrumentation trigger? 

Right.  I used Wikipedia's setjmp/longjmp example:
https://en.wikipedia.org/wiki/Setjmp#Simple_example

With SystemTap instrumenting through kernel+uprobes, I get
instrumentation running for all three of those SDT addresses: setjmp,
longjmp, and longjmp_target.  With dyninst, I only get it for setjmp.

> Can you send me the address you're trying to instrument, and disassembly
> of the entire function (with objdump or similar)? I may be able to spot
> the problem just by eye. 

Here's the function that contains longjmp and longjmp_target:

> 0000003d0d836110 <__longjmp>:
>   3d0d836110:	4c 8b 47 30          	mov    0x30(%rdi),%r8
>   3d0d836114:	4c 8b 4f 08          	mov    0x8(%rdi),%r9
>   3d0d836118:	48 8b 57 38          	mov    0x38(%rdi),%rdx
>   3d0d83611c:	49 c1 c8 11          	ror    $0x11,%r8
>   3d0d836120:	64 4c 33 04 25 30 00 	xor    %fs:0x30,%r8
>   3d0d836127:	00 00 
>   3d0d836129:	49 c1 c9 11          	ror    $0x11,%r9
>   3d0d83612d:	64 4c 33 0c 25 30 00 	xor    %fs:0x30,%r9
>   3d0d836134:	00 00 
>   3d0d836136:	48 c1 ca 11          	ror    $0x11,%rdx
>   3d0d83613a:	64 48 33 14 25 30 00 	xor    %fs:0x30,%rdx
>   3d0d836141:	00 00 
>   3d0d836143:	90                   	nop
>   3d0d836144:	48 8b 1f             	mov    (%rdi),%rbx
>   3d0d836147:	4c 8b 67 10          	mov    0x10(%rdi),%r12
>   3d0d83614b:	4c 8b 6f 18          	mov    0x18(%rdi),%r13
>   3d0d83614f:	4c 8b 77 20          	mov    0x20(%rdi),%r14
>   3d0d836153:	4c 8b 7f 28          	mov    0x28(%rdi),%r15
>   3d0d836157:	89 f0                	mov    %esi,%eax
>   3d0d836159:	4c 89 c4             	mov    %r8,%rsp
>   3d0d83615c:	4c 89 cd             	mov    %r9,%rbp
>   3d0d83615f:	90                   	nop
>   3d0d836160:	ff e2                	jmpq   *%rdx

Note that addresses here are shown prelinked, vs. the .note.stapsdt
addresses which are not, but I account for that in the runtime
load-address calculations made via BPatch_object.
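
To make the relation between the two listings concrete, here is a small
sketch of the arithmetic only.  It is not the code from my test program,
which gets the load address from BPatch_object at runtime; the names
below are made up for illustration, and the constants are copied from
the readelf and objdump output above.

```python
# Unrelocated probe locations from the .note.stapsdt output above.
NOTE_LOCATIONS = {
    "longjmp": 0x36143,
    "longjmp_target": 0x3615F,
}

# Address of the first NOP in the prelinked objdump listing of __longjmp.
PRELINKED_LONGJMP = 0x3D0D836143

# The bias is the difference between the two views of the same probe.
bias = PRELINKED_LONGJMP - NOTE_LOCATIONS["longjmp"]

def to_prelinked(note_addr, bias):
    """Map an unrelocated note address into the prelinked address space."""
    return note_addr + bias

print(hex(bias))                                                # 0x3d0d800000
print(hex(to_prelinked(NOTE_LOCATIONS["longjmp_target"], bias)))  # 0x3d0d83615f
```

The second result lands on the second NOP in the listing, right before
the jmpq, which is where longjmp_target should be.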

Also note that the two SDT addresses contain NOPs.  That's a feature of
this notation, so instrumenters like gdb and uprobes can put a
breakpoint there and don't have to singlestep anything.  That leads to a
side thought: would it be easier for rewriters like dyninst if that were
a 5-byte NOP?  (Not everything we care about in systemtap will have a
NOP in place, but we do control the SDT implementation.)
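
To illustrate why the NOP size might matter, here is a rough sketch that
assumes a rewriter wants to drop a 5-byte "jmp rel32" at the probe
address.  That is only an assumption for the sake of the example; I
don't know that dyninst patches these sites that way, and it may use
traps or relocate the whole function instead.  The instruction table is
copied from the objdump listing above, and the helper name is
hypothetical.

```python
# (address, size in bytes, text) for the instructions around the longjmp probe.
INSNS = [
    (0x3D0D836143, 1, "nop"),
    (0x3D0D836144, 3, "mov (%rdi),%rbx"),
    (0x3D0D836147, 4, "mov 0x10(%rdi),%r12"),
    (0x3D0D83614B, 4, "mov 0x18(%rdi),%r13"),
]

def clobbered(insns, patch_addr, patch_len=5):
    """Return the instructions that overlap a patch of patch_len bytes."""
    end = patch_addr + patch_len
    return [text for addr, size, text in insns
            if addr < end and addr + size > patch_addr]

# A 5-byte patch at the 1-byte NOP spills into the two following instructions.
print(clobbered(INSNS, 0x3D0D836143))
# A 1-byte patch (e.g. a breakpoint) touches only the NOP itself.
print(clobbered(INSNS, 0x3D0D836143, patch_len=1))
```

With the current 1-byte NOP, a 5-byte patch would overlap the two movs
after it, so they would have to be relocated; a 5-byte NOP would leave
room for the patch by itself.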

Anyway, you can see that __longjmp does not have a normal return path.
The whole setjmp/longjmp implementation in glibc is hand-crafted asm
with jumps like this.  And if you think about the weird behavior that
setjmp/longjmp specify, it's not too surprising, but I can see how this
might be a hard corner case for dyninst.

FWIW, I suspect that this issue has nothing to do with SDT at all, but
rather with how weird these functions are.  Dyninst would probably also
fail to instrument the entry of __longjmp, but I haven't tried that yet.
I'll whip up a test and let you know.

>> The old createInstPointAtAddr on 7.0 would just return NULL on the
>> longjmp addresses, so it was completely broken.
> 
> Interesting. A lot has changed, including the entire underlayer for
> instrumentation points, so I can't say I'm surprised that we can now
> create things. I'm annoyed that it doesn't seem to work, though. 

Given the intertwined nature of these functions' implementation,
including manually saved and restored program state, I think this can
just be chalked up as a known issue.  If you are making any assumptions
about the calling ABI, they're almost certainly violated by
setjmp/longjmp.  I suggest filing a bug (please CC me), and I'll be
happy if it's resolved, but don't let it block other progress.


Josh