Re: [DynInst_API:] [Paradyn-staff] Proposal: Add System Call Events to ProcControlAPI


Date: Fri, 24 May 2013 10:56:01 -0700 (PDT)
From: Matthew LeGendre <legendre1@xxxxxxxx>
Subject: Re: [DynInst_API:] [Paradyn-staff] Proposal: Add System Call Events to ProcControlAPI


On Fri, 24 May 2013, Bill Williams wrote:
<snipping>

Yes, if Dyninst were to provide argument access, we would need to
support that level of semantic information; I definitely agree that
this would be a big task. That added complexity is a large part of
the reason argument access is a hypothetical future feature and not
intended to be part of this initial interface and implementation.

A couple of things.  First, this functionality probably belongs in a
"value added" library that sits on top of Dyninst or its toolkits.  It
doesn't need to go inside.

Second, the semantics of the argument types should be defined by the
syscall number.  It's a bit of work, but mostly one-time work to
define a table of argument number and types for each syscall. And
there would be a table version for each platform version of the
library.  An interesting question is whether there is a DWARF-ful
version of libc so that we could build this table automatically? (Or
could we just trigger our own build of libc?)

Or perhaps we can auto-generate it from the kernel source?  That would
allow us (for Linux) to auto-pull kernels from kernel.org and generate
the configuration files.

With respect to argument access, I would prefer, if possible, to find a way to pull information that Symtab can already parse in order to build the table(s) of arguments. Autogenerating from kernel source is a perfectly valid backup plan...though at that point it might be better to build a kernel with DWARF.

My thinking at the implementation level is that *if* we can turn Symtab loose on this problem somehow, then it becomes a (comparatively) tractable problem of tracking the differences between the syscall ABI and the regular callsite ABI and storing the parameter types. That would allow us to treat syscalls as "just like regular calls, but with a possibly different ABI and no ability to instrument inside."
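To make the "possibly different ABI" point concrete, here is a minimal sketch for Linux/x86_64 only. The register lists are the standard System V calling convention and the Linux syscall convention for that platform; the names FUNCTION_ARG_REGS, SYSCALL_ARG_REGS, arg_register, and differing_args are hypothetical and not part of Dyninst or Symtab.

```python
# Sketch only: integer-argument registers on Linux/x86_64.
# All names below are illustrative, not Dyninst API.

# System V AMD64 function-call convention (first six integer args).
FUNCTION_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")

# Linux x86_64 syscall convention: the 4th argument moves to r10,
# because the `syscall` instruction itself clobbers rcx.
SYSCALL_ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")

# The syscall number is passed in rax.
SYSCALL_NUMBER_REG = "rax"


def arg_register(index, is_syscall):
    """Return the register holding integer argument `index` (0-based)."""
    regs = SYSCALL_ARG_REGS if is_syscall else FUNCTION_ARG_REGS
    if not 0 <= index < len(regs):
        # Regular calls spill further arguments to the stack;
        # this sketch does not model that case.
        raise IndexError("argument %d is not passed in a register" % index)
    return regs[index]


def differing_args():
    """Argument positions where the two conventions disagree."""
    pairs = zip(FUNCTION_ARG_REGS, SYSCALL_ARG_REGS)
    return [i for i, (f, s) in enumerate(pairs) if f != s]
```

On this platform the two conventions differ only in the fourth argument, so the per-ABI bookkeeping is small; other platforms would need their own tables.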

I don't think it would be too hard to extract this info from kernel source. There's already a central header file that declares every system call along with its arguments:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/syscalls.h

One could run some text processing to turn the syscalls.h function declarations into table entries, then turn the argument types into a Dyninst-compatible form. How hard that second step is depends on how detailed you want the type info to be. It wouldn't be hard to turn them into ints, longs, pointers and strings. It would be more difficult if you wanted to represent the contents of structs (which might be easier to extract from DWARF, as Bill suggested).
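As a rough illustration of that text processing, here is a minimal sketch. It assumes declarations of the form "asmlinkage long sys_NAME(...);" with named parameters, and it uses a crude heuristic (const char * is a string, any other pointer is a pointer). The names parse_syscalls, parse_param, classify, and LONG_TYPES are hypothetical, and SAMPLE is a hand-written excerpt in the style of the header, not the real file.

```python
import re

# Matches "asmlinkage long sys_NAME(PARAMS);" even when PARAMS spans lines.
# Assumption: no parameter contains ')' (e.g. no function pointers).
DECL_RE = re.compile(r"asmlinkage\s+long\s+sys_(\w+)\s*\(([^)]*)\)\s*;")

# Hypothetical, incomplete list of typedefs treated as long-sized.
LONG_TYPES = {"size_t", "ssize_t", "off_t", "loff_t"}


def classify(ctype):
    """Map a C type string to one of: string, pointer, long, int."""
    t = " ".join(ctype.replace("__user", " ").split())
    if "*" in t:
        # Heuristic: only 'const char *' is assumed to be a C string.
        return "string" if t == "const char *" else "pointer"
    if "long" in t.split() or t in LONG_TYPES:
        return "long"
    return "int"


def parse_param(param):
    """Split 'TYPE NAME' into (NAME, coarse kind); NAME is None if absent."""
    param = " ".join(param.split())
    m = re.match(r"^(.*?)(\w+)$", param)
    if m is None or not m.group(1).strip():
        return (None, classify(param))  # unnamed parameter
    return (m.group(2), classify(m.group(1)))


def parse_syscalls(header_text):
    """Return {syscall name: [(param name, kind), ...]}."""
    table = {}
    for name, params in DECL_RE.findall(header_text):
        params = params.strip()
        if params in ("", "void"):
            table[name] = []
        else:
            table[name] = [parse_param(p) for p in params.split(",")]
    return table


SAMPLE = """
asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
asmlinkage long sys_open(const char __user *filename,
                         int flags, umode_t mode);
asmlinkage long sys_getpid(void);
"""

if __name__ == "__main__":
    for name, args in parse_syscalls(SAMPLE).items():
        print(name, args)
```

Struct contents are deliberately out of scope here: every non-string pointer collapses to "pointer", which is where DWARF would have to take over.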

Another source of this data is the strace tool. I just checked and strace is released under a BSD license, which means Dyninst could incorporate its source. Strace's system call table for Linux/x86_64 can be found here:

http://strace.git.sourceforge.net/git/gitweb.cgi?p=strace/strace;a=blob;f=linux/x86_64/syscallent.h

It looks like this table contains the number of arguments to each syscall, a general classification (network call, I/O call, ...), the system call number, and a string name. Unfortunately, it doesn't include syscall argument types.
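For illustration, a table carrying just those four fields could be modeled as below. This is only a sketch: SyscallEntry, ENTRIES, BY_NUMBER, and describe are hypothetical names, the category strings are made-up labels rather than strace's own flags, and only the syscall numbers and argument counts (Linux/x86_64) are real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyscallEntry:
    number: int    # syscall number on Linux/x86_64
    name: str      # string name
    nargs: int     # number of arguments
    category: str  # coarse classification; labels here are made up


# A few hand-written entries; a real table would be generated.
ENTRIES = [
    SyscallEntry(0, "read", 3, "io"),
    SyscallEntry(1, "write", 3, "io"),
    SyscallEntry(2, "open", 3, "file"),
    SyscallEntry(3, "close", 1, "io"),
]

BY_NUMBER = {entry.number: entry for entry in ENTRIES}


def describe(number):
    """Return a short label for a syscall number, or a placeholder."""
    entry = BY_NUMBER.get(number)
    if entry is None:
        return "syscall_%d(?)" % number
    return "%s/%d [%s]" % (entry.name, entry.nargs, entry.category)
```

A table like this is enough to name a syscall event and know how many argument registers to read, but without argument types the values can only be reported as raw integers.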

-Matt




