Apologies for jumping in.
I recently implemented forward-edge CFI invariants similar to
what you need, using Dyninst, for our S&P paper [1]. The trick
we use is to add a NOP instruction to every function, which
basically causes each function to be moved to a shadow space.
This allows us to overwrite the original code space with a tag,
using one of the lower-level process interfaces. One problem is
that not all code is moved: indirect jumps, for example, may
still jump back to the original code space. The tag can
therefore overwrite code that is still live, namely part of the
function that lies directly before the function entry you are
tagging. I found that using tags of only 2 bytes did not break
our programs, because such indirect jump targets would usually
call or jump back to a function in the shadow space (tested on
MySQL and node.js).
You may want to look in a similar direction. Our code should be
open-sourced at some point, but it is not yet clear when
exactly.
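
To make the tag idea a bit more concrete, here is a toy Python
model. It is only a sketch, not our actual code, which works on
a live process through Dyninst. A bytearray stands in for the
original code space; TAG, place_tag and check_tag are
hypothetical names, the 2-byte tag value is arbitrary, and the
model assumes the tag sits directly before the entry it marks.

```python
# Toy model only: a bytearray stands in for the original code space.
# TAG, place_tag and check_tag are hypothetical names for this sketch;
# they are not Dyninst or TypeArmor APIs.

TAG = b"\xca\xfe"  # arbitrary 2-byte tag value chosen for this example


def place_tag(code: bytearray, entry: int, tag: bytes = TAG) -> bytes:
    """Overwrite the bytes just before `entry` with `tag`.

    Returns the bytes that were clobbered, i.e. the end of whatever
    code lies directly before the tagged entry.
    """
    start = entry - len(tag)
    if start < 0:
        raise ValueError("no room for a tag before this entry")
    clobbered = bytes(code[start:entry])
    code[start:entry] = tag
    return clobbered


def check_tag(code: bytearray, target: int, tag: bytes = TAG) -> bool:
    """Forward-edge check: allow an indirect call only to a tagged target."""
    start = target - len(tag)
    return start >= 0 and bytes(code[start:target]) == tag


# Two adjacent "functions": f occupies offsets [0, 8), g starts at 8.
code = bytearray(b"\x90" * 7 + b"\xc3" + b"\x55" + b"\x90" * 6 + b"\xc3")
g_entry = 8

clobbered = place_tag(code, g_entry)
print(check_tag(code, g_entry))  # True: g's entry is tagged
print(check_tag(code, 3))        # False: offset 3 has no tag before it
print(clobbered.hex())           # 90c3: the last two bytes of f are gone
```

The last line shows the problem described above: the tag eats
the end of the preceding function in the original code space,
so a shorter tag leaves less room for damage if some indirect
jump still lands there.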
Best,
Victor
[1] http://vvdveen.com/publications/TypeArmor.pdf