Re: [DynInst_API:] Unable to instrument chromium


Date: Fri, 21 Nov 2014 11:47:33 -0600
From: Bill Williams <bill@xxxxxxxxxxx>
Subject: Re: [DynInst_API:] Unable to instrument chromium
On 11/18/2014 11:42 AM, Josh Stone wrote:
On 11/18/2014 08:40 AM, bill wrote:
...well, that's a problem. Really, that's two problems. I'll see what I
can do about these, but it may take me a bit--I'm at SC this week.

We're trying to parse the entire process at creation time in order to
install pre-fork instrumentation. The attentive Dyninst developer will
note two problems with that sentence:

1) "entire process"
2) "pre-fork instrumentation"

We should only be looking for fork in the appropriate places (libc, or
the executable itself if we're in a statically linked environment)
rather than the whole process,

It's possible in theory for a fork to arise elsewhere, but given that
syscall-linux.C is only looking for "__libc_fork", reducing the search
is probably fine.

and we should be able to get pre-fork out
of ptrace callbacks and not need instrumentation for it (I'm sure Matt
and/or Josh will correct me if I'm wrong there).

PTRACE_EVENT_FORK is when fork is about to return, and I believe we do
use it for post-fork callbacks.  You could use PTRACE_SYSCALL to get the
pre-fork, but you'd have to stop all syscalls all the time.
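To make the distinction between the two kinds of stop concrete, here is a minimal sketch of how a tracer might tell a PTRACE_EVENT_FORK stop from a PTRACE_SYSCALL stop when decoding a waitpid() status. It is not Dyninst code: StopKind and classifyStop are hypothetical names, and the sketch assumes a Linux tracer that has set PTRACE_O_TRACEFORK and PTRACE_O_TRACESYSGOOD through PTRACE_SETOPTIONS.

```cpp
#include <csignal>
#include <sys/ptrace.h>
#include <sys/wait.h>

// Hypothetical classification of a tracee stop; not a Dyninst type.
enum class StopKind { NotStopped, ForkEvent, SyscallStop, OtherStop };

// Classify a waitpid() status for a traced process.
// Assumption: the tracer enabled PTRACE_O_TRACEFORK and
// PTRACE_O_TRACESYSGOOD on the tracee.
inline StopKind classifyStop(int status) {
    if (!WIFSTOPPED(status))
        return StopKind::NotStopped;
    // ptrace event stops report SIGTRAP with the event code in the
    // byte above the stop signal.
    if ((status >> 8) == (SIGTRAP | (PTRACE_EVENT_FORK << 8)))
        return StopKind::ForkEvent;
    // With TRACESYSGOOD, syscall-entry and syscall-exit stops report
    // SIGTRAP with bit 7 set, so they cannot be confused with a real SIGTRAP.
    if (WSTOPSIG(status) == (SIGTRAP | 0x80))
        return StopKind::SyscallStop;
    return StopKind::OtherStop;
}
```

A syscall stop by itself does not say which syscall is being entered; a tracer would still have to read the syscall number from the tracee's registers to pick out fork-like calls, which is part of the cost of stopping on every syscall.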


AFAICS, the only use of this instrumentation trickles down to
BPatch::registerForkingProcess() to call the user's preForkCallback.  So
could we be lazy and wait until that callback is actually requested?

It seems to me that we could (and should) both be lazy and use ptrace. My logic:

* Finding __libc_fork by name for pre-fork, but getting ptrace events for post-fork, means that if someone writes a custom fork wrapper (or strips __libc_fork's symbol out of their static binary), we'd hand back post-fork with no corresponding pre-fork. That's counterintuitive behavior.

* Enabling PTRACE_SYSCALL if and only if a preForkCallback is registered confines the overhead to the people who've asked for a difficult thing.

* Both of these move parsing from "necessary part of create" to "as needed by instrumentation", which in particular allows users to potentially avoid parsing libc entirely. This should be, generally speaking, a startup speed win.
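The "lazy" part of the points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Dyninst or ProcControl implementation: ForkCallbackPolicy and its members are hypothetical names, and only the PTRACE_* constants come from the Linux headers. The idea is that post-fork events are always requested, while syscall tracing is switched on only once a pre-fork callback exists.

```cpp
#include <functional>
#include <utility>
#include <sys/ptrace.h>

// Hypothetical helper; not part of Dyninst or ProcControl.
class ForkCallbackPolicy {
public:
    using Callback = std::function<void(int /* pid */)>;

    // Registering a pre-fork callback is what opts a user in to the
    // more expensive syscall tracing.
    void registerPreFork(Callback cb) { preFork_ = std::move(cb); }

    // Request to pass to ptrace() when resuming the tracee:
    // stop at every syscall only if someone asked for pre-fork.
    int resumeRequest() const {
        return preFork_ ? PTRACE_SYSCALL : PTRACE_CONT;
    }

    // Options to pass with PTRACE_SETOPTIONS. Fork events are always
    // requested; TRACESYSGOOD is only useful when syscall stops occur.
    long traceOptions() const {
        long opts = PTRACE_O_TRACEFORK;
        if (preFork_)
            opts |= PTRACE_O_TRACESYSGOOD;
        return opts;
    }

private:
    Callback preFork_;  // empty until a pre-fork callback is registered
};
```

With no callback registered, resumeRequest() yields PTRACE_CONT and the tracee runs at full speed between fork events; after registerPreFork(), the same call yields PTRACE_SYSCALL. Nothing in this scheme requires looking up __libc_fork by name, so libc would not need to be parsed for it.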

Is anyone on the list using pre-fork callbacks? If so, does my logic above make sense, and would it work with your use cases?

--bw

In fact that's already noted in PCProcess::bootstrapProcess():
// TODO
// pre-fork and pre-exit should depend on whether a callback is defined
//
// This will require checking whether BPatch holds a defined callback and also
// adding a way for BPatch enable this instrumentation in all processes when
// a callback is registered

That still leaves a constant breakpoint for pre-exec, whose only job is
to make proc->isExecing() work.  I'm not sure if that one could be
determined some other way.  That seems to exist just to avoid removing
instrumentation from the process that went away, since it's now a "new"
process with different memory.



--
--bw

Bill Williams
Paradyn Project
bill@xxxxxxxxxxx