Re: [DynInst_API:] RFC: instpoint-level malloc interface (was Re: Adding raw bytes before function)


Date: Wed, 31 Aug 2016 19:00:05 +0200
From: Matthias Fischer <fischmat@xxxxxxxxx>
Subject: Re: [DynInst_API:] RFC: instpoint-level malloc interface (was Re: Adding raw bytes before function)
Hi,

Thank you for your response. I just looked through the code for CFWidget
and RelocBlock, where I found the class PaddingPatch, which would
introduce trap-filled space (with an option for NOPs, which apparently
is not used). I think I could use that to store my information. Do you
see a potential problem with that, or is that what you were hinting at?

If that is possible, I just need to think of some way to get my data
there: probably an extension of CFWidget that derives the information
directly from the function (the function entry RelocBlock should be
able to get the corresponding BPatch_function, right?).
However, that requires a forced relocation to have an effect, so I would
have to add a NOP snippet to each function that needs data added. Or
does a call to markModified() for the instPoint suffice?

Thanks,
Matthias

On 31.08.2016 at 00:22, Bill Williams wrote:
> Responses inline. I do want folks with thoughts on these interfaces to chime in; this is not just a question of how to solve a specific CFI problem, but a general "how can we create instpoint-level locals that are intuitive?" question. We'll be talking about the general issues here at UW this week, but user input is pretty important too.
>
> ________________________________________
> From: Matthias Fischer <fischmat@xxxxxxxxx>
> Sent: Tuesday, August 30, 2016 4:33 PM
> To: Bill Williams; Victor van der Veen
> Cc: dyninst-api@xxxxxxxxxxx
> Subject: Re: [DynInst_API:] Adding raw bytes before function
>
> Hi,
>
> thank you for your answer. I have looked into what you proposed and have
> some questions and comments:
>
> Wouldn't BPatch_storageAddr fit better than BPatch_storageRegOffset,
> since pc-relativeness does not really help much here?
> The only place where I need the BPatch_variableExpr itself is when I
> write the value to it (probably in the initCallback of the mutatee; I do
> not intend to access the variable from within the function it is created
> in). As of now I need a position relative to the function address, so
> that addressing (from the callsite) needs no more than knowledge
> of the function and the uniform offset (or did I miss something here?).
>
> Upon further reflection none of the existing addressing modes are really a good fit; what you want to specify is that this is a variable *moving with code*. The label mechanism in relocation handles this for control flow, and creating a labelled addressing mode that works for data references is probably the right general solution. This needs more thought (though that shouldn't stop you from producing an implementation).
>
> I assume that I have to extend instPoint with something along the lines of
> addSpaceForVariable(Type), which marks space as used for the variable
> (does such functionality already exist, or am I on the wrong track
> here?) and returns an address that can be used for the
> BPatch_variableExpr with BPatch_storageAddr. However, instPoint itself
> does not really handle any addresses, as that is done by the baseTramp
> (so I probably need a future/promise or something like that), which
> would introduce the requirement of a second patch/relocate run (or is
> there another way?).
>
> There's not currently that functionality; you'd be adding it. What it should return is a series of ASTs specifying how to access the variable, similar to how local variables are created, and it would presumably allocate variables in order in the space prior to code.
>
> Furthermore, it seems to me that it would be a bad idea to actually use
> the space within the function boundaries, as the data there should not
> be executed. Or should I expand the space "upwards" to before the
> actual position, which would require notifying the block that is
> address-wise before the current one? Or is that solved in another way?
> My knowledge of Dyninst internals is rather fuzzy.
>
> I'm assuming widgets here, and giving a couple quick examples:
>
> 1) function entry instrumentation adds a relocblock for the instrumentation @ entry, with a CFWidget attached that directs it to the original entry point. After that CFWidget, you'd add a widget that generates N bytes of padding to the instrumentation block. The branch around the padding is created by the CFWidget, then the padding is laid out, then the subsequent blocks are laid out.
> 2) your transformer adds a padding widget at the beginning of function entry blocks. The padding widget generates N bytes of padding (again). You'd need to tweak the relocblock::generate sequence such that the padding widget moves the block's label to the start of actual code and skips the data, but from there the rest of the system should take over and do the right thing.
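
As a rough illustration of the first layout above (branch, then padding, then the following blocks), here is a minimal standalone sketch. It is not Dyninst code: PaddedLayout, layoutWithPadding and branchTarget are hypothetical names, and the only assumptions taken from x86 are that `jmp rel32` is encoded as 0xE9 plus a little-endian displacement and that INT3 is 0xCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical result type; not part of Dyninst.
struct PaddedLayout {
    Bytes code;             // generated bytes for the block
    std::size_t dataStart;  // offset of the first padding byte
    std::size_t resume;     // offset where execution continues after the padding
};

// Emit `jmp rel32` over `n` bytes of padding, then append the bytes of the
// next block. The padding is trap-filled by default (0xCC is x86 INT3).
PaddedLayout layoutWithPadding(std::uint32_t n, const Bytes& next,
                               std::uint8_t fill = 0xCC) {
    assert(n <= 0x7FFFFFFFu && "displacement must fit a signed 32-bit value");
    PaddedLayout out;
    out.code.push_back(0xE9);  // jmp rel32 opcode
    for (int shift = 0; shift < 32; shift += 8)
        out.code.push_back(static_cast<std::uint8_t>((n >> shift) & 0xFF));
    out.dataStart = out.code.size();
    out.code.insert(out.code.end(), static_cast<std::size_t>(n), fill);
    out.resume = out.code.size();
    out.code.insert(out.code.end(), next.begin(), next.end());
    return out;
}

// Decode the branch at offset 0 and return the offset it jumps to.
std::size_t branchTarget(const Bytes& code) {
    assert(code.size() >= 5 && code[0] == 0xE9);
    std::uint32_t rel = 0;
    for (int i = 0; i < 4; ++i)
        rel |= static_cast<std::uint32_t>(code[1 + i]) << (8 * i);
    return 5 + rel;  // rel32 is relative to the end of the 5-byte instruction
}
```

In this sketch the data bytes live at [dataStart, resume) and are never executed, because the branch lands exactly on resume. The second layout would drop the branch and instead point the block's label at resume.
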
>
> I hope I am on the right track here.
>
> It sounds like you are. It may be helpful to run some simple mutators with DYNINST_DEBUG_RELOC=1 in your environment and look through the logs (also use a simple, small mutatee for this); that should give you a better feel for how, exactly, we're doing the control flow magic. (And Drew Bernat's papers on the subject are authoritative--he wrote the current Dyninst relocation system as part of his dissertation work.)
>
> --bw
>
> Thanks,
> Matthias
>
> On 30.08.2016 at 19:32, Bill Williams wrote:
>> Okay, I've been through the code and I see a pretty simple approach that doesn't involve as much digging in the internals as I had thought--but it does involve extending Dyninst. I do think this is a good extension and would be really happy to get patches back that implement this.
>>
>> What we do is add a malloc() method to BPatch_point, taking a name/type, marking the internal instPoint as modified, and returning a BPatch_variableExpr. From there, life becomes moderately simple:
>>
>> * The variableExpr needs to be created with appropriate internals: probably as a BPatch_storageRegOffset that's pc-relative
>> * The instPoint needs to add space for all allocated variables at the beginning of its baseTramp's generated code
>> * We need to ensure that references to the malloced variables get proper Widgets applied such that they're filled in when we actually generate the BaseTramp containing the variables. This may involve creating new Widgets.
>>
>> Hope this helps.
>>
>> --bw
>>
>>
>> ________________________________________
>> From: Matthias Fischer <fischmat@xxxxxxxxx>
>> Sent: Monday, August 29, 2016 7:01 PM
>> To: Victor van der Veen
>> Cc: Bill Williams; dyninst-api@xxxxxxxxxxx
>> Subject: Re: [DynInst_API:] Adding raw bytes before function
>>
>> Hi,
>>
>> no problem, and thank you for the idea of using a NOP instruction to force relocation. I am familiar with your paper; there you use the label REX; INT3; INT3; INT3 to store data in the lower 4 bits of REX (by the way, is it possible to omit two of those INT3 instructions, or to use REX; REX; REX; INT3?). However, I need either a full 5 bytes of space to store the information directly or, if that is not possible, at least 2 full bytes to use as an offset into a data lookup table. So I am still interested in the approach that introduces a variable for each function during relocation.
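
To make the capacity limit concrete, here is a small sketch of the label scheme as described in this message: a 4-bit value packed into the low bits of a REX prefix, followed by three INT3 bytes. The helper names (encodeLabel, isLabel, decodeTag) are hypothetical; the only facts assumed are the x86-64 encodings, where a REX prefix is any byte in 0x40..0x4F and INT3 is 0xCC.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kRexBase = 0x40;  // REX prefix: 0100WRXB
constexpr std::uint8_t kInt3 = 0xCC;     // one-byte breakpoint instruction

using Label = std::array<std::uint8_t, 4>;

// Pack a 4-bit tag into the W/R/X/B bits of the REX prefix.
Label encodeLabel(std::uint8_t tag) {
    assert(tag <= 0x0F && "only four bits are available in a REX prefix");
    return Label{static_cast<std::uint8_t>(kRexBase | tag), kInt3, kInt3, kInt3};
}

// True if the four bytes have the REX; INT3; INT3; INT3 shape.
bool isLabel(const Label& l) {
    return (l[0] & 0xF0) == kRexBase &&
           l[1] == kInt3 && l[2] == kInt3 && l[3] == kInt3;
}

// Recover the 4-bit tag from a well-formed label.
std::uint8_t decodeTag(const Label& l) {
    assert(isLabel(l));
    return static_cast<std::uint8_t>(l[0] & 0x0F);
}
```

A label of this shape carries 16 distinct values in 4 bytes, which is why it does not cover the 2 to 5 bytes of payload asked for here.
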
>>
>> Thanks,
>> Matthias
>>
>>
>> On 30.08.2016 at 01:13, Victor van der Veen wrote:
>>
>> Apologies for jumping in.
>>
>> I have implemented forward-edge CFI invariants similar to what you need very recently with Dyninst for our S&P paper [1]. The trick we use is to add a NOP instruction to every function, which basically moves them to a shadow space. This allows us to overwrite the existing code space with a tag, using one of the lower-level process interfaces. A problem that arises is that not all code is moved: indirect jumps, for example, may jump back to the original code space. This means that the tag may still overwrite part of the function that lies before the function entry that you are tagging. I found that using only 2-byte tags did not break our programs, as such indirect jump targets would usually call or jump back to a function in the shadow space (tested on MySQL and node.js).
>>
>> You may want to look into a similar direction. Our code should be open sourced at some point, but it is uncertain when exactly.
>>
>> Best,
>> Victor
>>
>> [1] http://vvdveen.com/publications/TypeArmor.pdf
>>
>> On Aug 30, 2016 00:41, "Matthias Fischer" <fischmat@xxxxxxxxx> wrote:
>> Hi,
>>
>> For the first variant, I only see some type of associative container as a
>> solution (something along the lines of a switch/case would add too
>> much overhead), and even then it looks quite expensive and non-trivial
>> (hash-based might have the lowest overhead but is not so simple, tree-based would
>> also fall into the non-trivial category, and binary search on sorted pairs
>> would possibly be the easiest of the three but still introduces
>> non-negligible overhead). Even if I use C/C++ code for the mapping function
>> (then all associative containers would be rather trivial), the overhead
>> would probably still be too high.
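
For reference, the "binary search on sorted pairs" option could look roughly like the sketch below. This is plain standalone C++17, not Dyninst code; Address, Tag, TagTable, buildTagTable and lookupTag are hypothetical names, and the sketch says nothing about how the table would be placed in the mutatee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
using Address = std::uint64_t;
using Tag = std::uint16_t;
using TagTable = std::vector<std::pair<Address, Tag>>;

// Mutator side: sort once by function entry address.
TagTable buildTagTable(TagTable entries) {
    std::sort(entries.begin(), entries.end());
    return entries;
}

// Mutatee side: O(log n) lookup for an indirect-call target.
// Returns std::nullopt when the target is not a known function entry.
std::optional<Tag> lookupTag(const TagTable& table, Address target) {
    auto it = std::lower_bound(
        table.begin(), table.end(), target,
        [](const std::pair<Address, Tag>& e, Address a) { return e.first < a; });
    if (it == table.end() || it->first != target) return std::nullopt;
    return it->second;
}
```

Each check costs about log2(n) comparisons plus the call into the lookup, compared with a single load when the tag sits at a fixed offset from the target, which is the overhead concern raised above.
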
>>
>> The second variant looks promising, as long as all functions are
>> relocated, which I assume is the case ("we're redoing the layout of the
>> whole function anyway"), so I would appreciate some hints on where to look
>> to introduce changes.
>>
>> Thanks,
>> Matthias
>>
>> On 30.08.2016 at 00:09, Bill Williams wrote:
>>> Okay, I thought this looked like CFI; glad I'm not completely nuts. There are a couple of ways to do CFI in Dyninst without too much effort, I think. The first way would be a central map of [function entry address->tag] that's constructed in the mutator based on analysis and where you're inserting tags, and checked in the mutatee at any indirect control flow you want to validate. Doesn't need to be near anything, just needs to provide a map from target address->tag value. If you need locality, though, that doesn't work. The second, if you want to ensure that tags are at a fixed location with respect to the function entry point, would be to tweak the relocation classes to automatically add a tag variable to each function during relocation--we're redoing the layout of the whole function anyway. If that sounds more promising to you, I can give you some pointers to the right places to poke.
>>>
>>> --bw
>>>
>>>
>>> ________________________________________
>>> From: Matthias Fischer <fischmat@xxxxxxxxx>
>>> Sent: Monday, August 29, 2016 4:43 PM
>>> To: Bill Williams; dyninst-api@xxxxxxxxxxx
>>> Subject: Re: [DynInst_API:] Adding raw bytes before function
>>>
>>> Hi,
>>>
>>> What I want is not restricted to the call_site, but affects both the
>>> call_target and the call_site, as I cannot access the actual call_target
>>> of the call_site during analysis/patching (indirect call_sites):
>>>
>>> For each call_target (function):
>>>     // the tag needs to be uniformly accessible at mutatee runtime;
>>>     // it cannot be passed during analysis/patching
>>>     BPatch_snippet tag = make_tag_snippet(call_target);
>>>     insert_snippet(call_target, tag);
>>>
>>>
>>> For each call_site (these are indirect):
>>>     BPatch_arithExpr tag_address =
>>>         BPatch_arithExpr(BPatch_plus, BPatch_dynamicTargetExpr(), BPatch_constExpr(offset));
>>>     BPatch_arithExpr tag_value =
>>>         BPatch_arithExpr(BPatch_deref, tag_address);
>>>     BPatch_snippet check = make_check_snippet(tag_value, call_site);
>>>     insert_snippet(call_site, check);
>>>
>>> And the call_site needs to access the tag from the call_target, which I
>>> cannot calculate during my analysis/patching (=> dynamicTargetExpr
>>> during runtime of the mutatee), therefore I cannot simply use the tag
>>> expression from the call_target by passing it during analysis/patching.
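
The runtime behaviour this pseudocode is after can be sketched without Dyninst as follows. Image, writeTag and checkCall are hypothetical names, a flat byte vector stands in for the mutatee's address space, and a 2-byte tag stored immediately before the function entry is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

using Tag = std::uint16_t;

// Hypothetical stand-in for the mutatee's address space.
struct Image {
    std::uint64_t base;
    std::vector<std::uint8_t> bytes;
};

// Uniform offset from a function entry to its tag (negative: tag precedes code).
constexpr std::int64_t kTagOffset = -static_cast<std::int64_t>(sizeof(Tag));

// Mutator side: store the tag immediately before the function entry.
void writeTag(Image& img, std::uint64_t entry, Tag tag) {
    std::uint64_t at = entry + static_cast<std::uint64_t>(kTagOffset);
    assert(at >= img.base && at - img.base + sizeof(Tag) <= img.bytes.size());
    std::memcpy(&img.bytes[at - img.base], &tag, sizeof(Tag));
}

// Call-site side: deref(dynamicTarget + offset) and compare with the
// tag this call site expects. Out-of-range targets fail the check.
bool checkCall(const Image& img, std::uint64_t dynamicTarget, Tag expected) {
    std::uint64_t at = dynamicTarget + static_cast<std::uint64_t>(kTagOffset);
    if (at < img.base || at - img.base + sizeof(Tag) > img.bytes.size())
        return false;
    Tag found;
    std::memcpy(&found, &img.bytes[at - img.base], sizeof(Tag));
    return found == expected;
}
```

The call site needs only the dynamic target and the uniform offset, which is why the tag has to move together with the function entry when the code is relocated.
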
>>>
>>> Thanks,
>>> Matthias
>>>
>>>
>>> On 29.08.2016 at 23:21, Bill Williams wrote:
>>>> If I'm understanding correctly, what you want is something like:
>>>>
>>>> for each call site:
>>>>     BPatch_variableExpr tag = /something/
>>>>     BPatch_snippet do_work = make_snippet(getDynamicTarget(call site), tag, ...)
>>>>     insert_snippet(call site, do_work)
>>>>
>>>> BPatch_malloc will hand back a tag that's guaranteed to be in a safe location, with a known type. It will be in the Dyninst private heap and will not collide with relocated code.
>>>>
>>>> BPatch_createVariable is a placement new, effectively--it will create a variable expression of a given type at the address you hand it. There is no safety checking on that; it's intended to be used either in a region you've created or to point to a heap location. It will not be relocated.
>>>>
>>>> Relocation occurs on a per-function basis and happens when instrumentation is inserted (either at the time of the insert or at the time of insertionset::finalize). Springboards to relocated code in the form of either traps or branches exist for all relocated code.
>>>>
>>>> Is my understanding above correct? If so, can you fill in the details that make BPatch_malloc not the right tool for the job? If not, can you give me pseudocode that explains your use case better?
>>>>
>>>> --bw
>>>>
>>>> ________________________________________
>>>> From: Matthias Fischer <fischmat@xxxxxxxxx>
>>>> Sent: Monday, August 29, 2016 2:44 PM
>>>> To: Bill Williams; dyninst-api@xxxxxxxxxxx
>>>> Subject: Re: [DynInst_API:] Adding raw bytes before function
>>>>
>>>> Hi,
>>>>
>>>> thank you for your answer, but I do not think that BPatch_malloc can
>>>> help me there, because every function in the binary will have a
>>>> (possibly) different tag, based on collected information and
>>>> BPatch_malloc does not allow me to control or predict the address as far
>>>> as I can tell.
>>>> Furthermore, I need each indirect call_site to access this tag in a
>>>> uniform way before control is transferred to the call_target (this I can
>>>> achieve with BPatch_dynamicTargetExpr, when the tag is always at the
>>>> same position relative to the call_target's address).
>>>>
>>>> Now, I am assuming that code relocation will occur when a snippet and
>>>> actual code overlap, which will move the existing code to a new
>>>> location (the previous place is probably filled with trap instructions).
>>>> However, to prevent breaking indirect callsites, I am assuming that at
>>>> the previous function location there still needs to be something to
>>>> redirect control to the actual target (probably a direct jump). Therefore
>>>> I assume that BPatch_createVariable does not allow all addresses, but
>>>> only those that are "safe", i.e. not occupied by the jump instruction that keeps
>>>> indirect calls from breaking. Is this correct so far?
>>>> If the above is correct, is there a way to determine the set of
>>>> "safe" addresses?
>>>>
>>>> Furthermore, will the address I give to BPatch_createVariable always
>>>> contain exactly this variable (short of removing/overwriting it
>>>> manually) or can it be subject to relocation? Are there any other
>>>> pitfalls regarding BPatch_createVariable?
>>>>
>>>> Thanks,
>>>> Matthias
>>>>
>>>> On 29.08.2016 at 18:02, Bill Williams wrote:
>>>>> Matthias--
>>>>>
>>>>> Unless you have very specific location constraints, using BPatch_malloc instead of BPatch_createVariable to create and allocate space for a BPatch_variableExpr should handle all of the bookkeeping for you. If you do have constraints, there are internal mechanisms that will help (we try to allocate relocated code near the original, for instance) but those aren't currently exposed; I'd want to understand your use case better before trying to push a constrained-malloc interface out to the public BPatch classes.
>>>>>
>>>>> Let me know if you've got further questions.
>>>>>
>>>>> --bw
>>>>>
>>>>> ________________________________________
>>>>> From: Dyninst-api <dyninst-api-bounces@xxxxxxxxxxx> on behalf of Matthias Fischer <fischmat@xxxxxxxxx>
>>>>> Sent: Monday, August 29, 2016 9:30 AM
>>>>> To: dyninst-api@xxxxxxxxxxx
>>>>> Subject: [DynInst_API:] Adding raw bytes before function
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'd like to write information to a binary (or process) for each function
>>>>> in a way that allows access from an indirect callsite with minimal overhead.
>>>>>
>>>>> My current idea is to write raw bytes before the actual function code,
>>>>> so the bytes are stored at [callsite_target - offset, callsite_target[.
>>>>> Then I can access the information as a BPatch_variableExpr using
>>>>> BPatch_dynamicTargetExpr to calculate the address. However, I cannot
>>>>> simply write the bytes at this address without possibly overwriting
>>>>> existing instructions - the code before the function would have to be
>>>>> relocated.
>>>>>
>>>>> So far, I have not found a way to write raw bytes at a specific address
>>>>> while relocating any instructions that would be overwritten. Is that even
>>>>> possible with dyninstAPI, or is there another way to achieve my initial goal?
>>>>>
>>>>> Thanks,
>>>>> Matthias
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Dyninst-api mailing list
>>>>> Dyninst-api@xxxxxxxxxxx
>>>>> https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api
>>
>

