Re: [Condor-users] GAHP and proxy
- Date: Fri, 16 May 2008 00:52:56 +0200
 
- From: Jan Ploski <Jan.Ploski@xxxxxxxx>
 
- Subject: Re: [Condor-users] GAHP and proxy
 
Barnett P. Chiu wrote:
> Hi, Jan,
>      Thanks for the reply.
>
>      I ran strace -f on the condor_submit command, and condor_gridmanager
> seemed to have opened the right proxy, as shown on the first line below:
I suggest that you strace the condor_gridmanager process (forked by 
condor_schedd, I think) rather than condor_submit, which just 
communicates with condor_schedd. That additional strace may reveal 
failed system calls related to directory/file access. You can attach 
to the condor_schedd process with strace -f -p <pid>; the -f option 
makes strace follow the processes it forks. You can also replace the 
condor_gridmanager executable and/or the java binary (the path of which 
is configurable in Condor) with a little shell script wrapper which 
does something like:
    exec strace -f -o /tmp/strace.log <original binary> "$@"
Writing the trace with -o rather than redirecting output keeps the 
traced program's own stdout and stderr untouched.
I suspect that your problems are related to incorrect access permissions 
on some directory (or the parent directory of some directory). If you 
could see EACCES in the strace output, that would be a big clue. Your 
remarks that it works with one user but not another also support this 
hypothesis.
> Looking at the GridmanagerLog again, I do see a return code 7, which
> indicates a failure to activate a Globus module while searching for the CA?
The bug report I previously mentioned had something about CA search, but 
I don't know precisely what this message means.
> Any way to intercept the proxy in the scratch directory before it disappears?
You could suspend the condor_gridmanager process with "gdb -p <pid>" 
("detach", "quit" to make it continue), but it may be difficult to hit 
the sweet spot or step through if your Condor doesn't have debugging 
symbols.
If you're on Linux, you could also override certain library calls (that 
would require the wrapper script mentioned above) to add an artificial 
delay (then the problem would be not to delay the wrong calls too 
much...). The following example logs close/fclose calls and can 
optionally delay each fclose:
/**
 * Intercepts calls to close/fclose and logs them to /tmp/hijack.log.
 * Compile with:
 *   gcc -fPIC -c -o hijack.o hijack.c
 *   gcc -shared -o hijack.so hijack.o -ldl
 * Usage: env LD_PRELOAD=./hijack.so <some program>
 */
#define _GNU_SOURCE   /* for RTLD_NEXT */
#include <stdio.h>
#include <unistd.h>
#include <dlfcn.h>

/* Pointers to the real libc functions, resolved on first use. */
static int (*real_fclose)(FILE *f);
static int (*real_close)(int fd);
static FILE *fl;      /* log file, opened lazily */

/* Opens the log on first use; returns NULL if it cannot be opened. */
static FILE *hijack_log(void)
{
    if (!fl) { fl = fopen("/tmp/hijack.log", "a"); }
    return fl;
}

int close(int fd)
{
    if (real_close == NULL)
        real_close = (int (*)(int fd)) dlsym(RTLD_NEXT, "close");
    if (hijack_log()) {
        fprintf(fl, "close(%d)\n", fd);
        fflush(fl);
    }
    return real_close(fd);
}

int fclose(FILE *f)
{
    if (real_fclose == NULL)
        real_fclose = (int (*)(FILE *f)) dlsym(RTLD_NEXT, "fclose");
    if (hijack_log()) {
        fprintf(fl, "fclose(%d)\n", fileno(f));
        fflush(fl);
    }
    /* Uncomment to delay 10 seconds here... */
    /* sleep(10); */
    return real_fclose(f);
}
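To avoid delaying the wrong calls, the same trick can be narrowed to file removal. The sketch below pauses only when a path containing a given substring is about to be unlinked, which should leave time to copy the proxy away. It assumes the file is removed through unlink(); if the program uses remove(), unlinkat() or rename() instead, those would need their own wrappers. HIJACK_PATTERN and HIJACK_DELAY are environment variable names invented for this example, and I have not tried it against Condor.

```c
/*
 * delay_unlink.c: pause before unlinking files whose path matches a pattern.
 * Compile with:
 *   gcc -fPIC -shared -o delay_unlink.so delay_unlink.c -ldl
 * Usage: env LD_PRELOAD=./delay_unlink.so HIJACK_PATTERN=<substring> <some program>
 */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for RTLD_NEXT */
#endif
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if path contains the substring pattern, 0 otherwise. */
static int path_matches(const char *path, const char *pattern)
{
    if (path == NULL || pattern == NULL || pattern[0] == '\0')
        return 0;
    return strstr(path, pattern) != NULL;
}

/* Pointer to the real unlink(), resolved on first use. */
static int (*real_unlink)(const char *path);

int unlink(const char *path)
{
    /* Hypothetical variables: substring to match, delay in seconds. */
    const char *pattern = getenv("HIJACK_PATTERN");
    const char *delay = getenv("HIJACK_DELAY");

    if (real_unlink == NULL)
        real_unlink = (int (*)(const char *)) dlsym(RTLD_NEXT, "unlink");

    if (path_matches(path, pattern)) {
        int secs = delay ? atoi(delay) : 10;   /* default: 10 seconds */

        if (secs > 0)
            sleep((unsigned int) secs);        /* window to copy the file */
    }
    return real_unlink(path);
}
```

With no HIJACK_PATTERN set, nothing matches and every unlink passes straight through, so the library is harmless to leave preloaded by the wrapper script.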
Regards,
Jan Ploski