Hi,
I found a few issues in the way Dyninst is parsing #! lines, and wrote
the attached patch to hopefully make it more robust. Please let me know
if it needs any adjustment.
It's also possible for such scripts to reference yet another #! script
for the interpreter (Linux allows only a few levels of such nesting;
BINPRM_BUF_SIZE=128 limits the length of the #! line, not the depth).
I didn't write that recursion yet, as I doubt it's very common, but that
might be a good followup.
Thanks,
Josh
From e44d4973997c911928b6d5f23fc19fce588884c1 Mon Sep 17 00:00:00 2001
From: Josh Stone <jistone@xxxxxxxxxx>
Date: Wed, 31 Oct 2012 19:22:33 -0700
Subject: [PATCH] Improve script #! parsing
This fixes a few shortcomings in BPatch's buildPath():
- The original path and argv[0] may not necessarily be the same, but the
former should replace the latter in the new argv list.
- The #! line may optionally include a single argument for the
interpreter, often used like "#!/usr/bin/env python".
- The NULL to terminate the new argv was clobbering the last argument
from the original argv.
I modeled the exact #!-parsing details after Linux's fs/binfmt_script.c.
---
dyninstAPI/src/BPatch.C | 59 +++++++++++++++++++++++++++++++++++++++----------
1 file changed, 47 insertions(+), 12 deletions(-)
diff --git a/dyninstAPI/src/BPatch.C b/dyninstAPI/src/BPatch.C
index d6f417d..e3c9bef 100644
--- a/dyninstAPI/src/BPatch.C
+++ b/dyninstAPI/src/BPatch.C
@@ -1070,21 +1070,56 @@ static void buildPath(const char *path, const char **argv,
}
// A shell script, so reinterpret path/argv
- std::string interp = line.substr(2);
- pathToUse = (char *) malloc(interp.length()+1);
- strncpy(pathToUse, interp.c_str(), interp.length()+1);
- // I'd prefer an argc, but hey
- int count = 0;
- while(argv[count] != NULL) {
- count++;
+
+ // Modeled after Linux's fs/binfmt_script.c
+ // #! lines have the interpreter and optionally a single argument,
+ // all separated by spaces and/or tabs.
+
+ size_t pos_start = line.find_first_not_of(" \t", 2);
+ if (pos_start == std::string::npos) {
+ file.close();
+ return;
+ }
+ size_t pos_end = line.find_first_of(" \t", pos_start);
+ std::string interp = line.substr(pos_start, pos_end - pos_start);
+ pathToUse = strdup(interp.c_str());
+
+ std::string interp_arg;
+ pos_start = line.find_first_not_of(" \t", pos_end);
+ if (pos_start != std::string::npos) {
+ // The argument goes all the way to the last non-space/tab,
+ // even if there are spaces/tabs in the middle somewhere.
+ pos_end = line.find_last_not_of(" \t") + 1;
+ interp_arg = line.substr(pos_start, pos_end - pos_start);
+ }
+
+ // Count the old and new argc values
+ int argc = 0;
+ while(argv[argc] != NULL) {
+ argc++;
+ }
+ int argcToUse = argc + 1;
+ if (!interp_arg.empty()) {
+ argcToUse++;
+ }
+ argvToUse = (char **) malloc((argcToUse+1) * sizeof(char *));
+
+ // The interpreter takes the new argv[0]
+ int argi = 0;
+ argvToUse[argi++] = strdup(pathToUse);
+
+ // If there's an interpreter argument, that's the new argv[1]
+ if (!interp_arg.empty()) {
+ argvToUse[argi++] = strdup(interp_arg.c_str());
}
- argvToUse = (char **) malloc((count+1) * sizeof(char *));
- argvToUse[0] = strdup(pathToUse);
- for (int tmp = 0; tmp < count; ++tmp) {
- argvToUse[tmp+1] = strdup(argv[tmp]);
+ // Then comes path, *replacing* the old argv[0],
+ // and the old argv[1..] are filled in for the rest
+ argvToUse[argi++] = strdup(path);
+ for (int tmp = 1; tmp < argc; ++tmp) {
+ argvToUse[argi++] = strdup(argv[tmp]);
}
- argvToUse[count] = NULL;
+ argvToUse[argcToUse] = NULL;
file.close();
}
--
1.7.11.7
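
For anyone who wants to try the parsing rules outside of Dyninst, here is a minimal standalone sketch of the same logic. It is not the code from the patch: the names Shebang, parseShebang and buildArgv are hypothetical, and it uses std::vector<std::string> instead of the malloc'ed char ** that buildPath() works with, purely to keep the example short.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type (not part of Dyninst): the interpreter path and
// the optional single argument taken from a "#!" line.
struct Shebang {
    std::string interp;
    std::string arg;
};

// Parse one line of text. Returns false if the line does not start with
// "#!" or if no interpreter follows the marker.
bool parseShebang(const std::string &line, Shebang &out) {
    const char *ws = " \t";
    if (line.compare(0, 2, "#!") != 0)
        return false;

    // Skip spaces/tabs between "#!" and the interpreter.
    size_t start = line.find_first_not_of(ws, 2);
    if (start == std::string::npos)
        return false;

    // The interpreter ends at the next space/tab, or at end of line.
    size_t end = line.find_first_of(ws, start);
    out.interp = line.substr(start, end - start);
    out.arg.clear();
    if (end == std::string::npos)
        return true;

    // Everything from the next non-blank up to the last non-blank is ONE
    // argument, even if it contains spaces/tabs in the middle.
    start = line.find_first_not_of(ws, end);
    if (start == std::string::npos)
        return true;
    end = line.find_last_not_of(ws) + 1;
    out.arg = line.substr(start, end - start);
    return true;
}

// Build the new argument list: interpreter, optional interpreter argument,
// the script path (replacing the old argv[0]), then the old argv[1..].
std::vector<std::string> buildArgv(const Shebang &sb,
                                   const std::string &path,
                                   const std::vector<std::string> &oldArgv) {
    std::vector<std::string> out;
    out.push_back(sb.interp);
    if (!sb.arg.empty())
        out.push_back(sb.arg);
    out.push_back(path);
    for (size_t i = 1; i < oldArgv.size(); ++i)
        out.push_back(oldArgv[i]);
    return out;
}
```

With a first line of "#!/usr/bin/env python", a path of "/tmp/foo.py" and an old argv of {"foo.py", "-v"}, this should give {"/usr/bin/env", "python", "/tmp/foo.py", "-v"}, which matches what the kernel would hand to the interpreter.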