[Condor-users] Standalone checkpoint error ...
- Date: Fri, 3 Feb 2006 15:29:44 +0000 (WET)
- From: Goncalo Borges <goncalo@xxxxxx>
- Subject: [Condor-users] Standalone checkpoint error ...
Hello everybody,
I'm trying to use the standalone checkpoint features provided by Condor on
our cluster. Here are the details of our machines:
[goncalo@lflip02 ~]$ uname -a
Linux lflip02.lip.pt 2.4.21-32.0.1.ELsmp #1 SMP Wed May 25 15:42:26 CDT
2005 i686 i686 i386 GNU/Linux
I have downloaded the condor-6.6.10-linux-x86-glibc23.tar package and
installed it in personal mode just to have access to the compiler:
[goncalo@lflip02 ~]$ ./condor_configure \
    --install=/home/na50/goncalo/condor-6.6.10/release.tar \
    --install-dir=/home/na50/goncalo/local/condor-6.6.10 \
    --make-personal-condor
For testing, I'm using a very simple program (ever.c):
[goncalo@lflip02 ~]$ cat ever.c
#include <stdio.h>

int main(void)
{
    float x;
    long i;

    for (;;)
    {
        for (i = 0; i <= 100000; i++)
            x = 3.1415926*i + i + i*i*2.7182818;
    }
    return 0;
}
I have compiled the ever.c program:
[goncalo@lflip02 ~]$ condor_compile gcc ever.c -o ever
LINKING FOR CONDOR : /usr/bin/ld
-L/home/na50/goncalo/local/condor-6.6.10/lib -Bstatic --eh-frame-hdr -m
elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o ever
/home/na50/goncalo/local/condor-6.6.10/lib/condor_rt0.o
/usr/lib/gcc-lib/i386-redhat-linux/3.2.3/../../../crti.o
/usr/lib/gcc-lib/i386-redhat-linux/3.2.3/crtbeginT.o
-L/home/na50/goncalo/local/condor-6.6.10/lib
-L/usr/lib/gcc-lib/i386-redhat-linux/3.2.3
-L/usr/lib/gcc-lib/i386-redhat-linux/3.2.3/../../.. /tmp/cciyPAqZ.o
/home/na50/goncalo/local/condor-6.6.10/lib/libcondorzsyscall.a
/home/na50/goncalo/local/condor-6.6.10/lib/libz.a
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libstdc++.a
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libgcc.a
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libgcc_eh.a
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libgcc_eh.a -lc
-lnss_files -lnss_dns -lresolv -lc -lnss_files -lnss_dns -lresolv -lc
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libgcc.a
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libgcc_eh.a
/home/na50/goncalo/local/condor-6.6.10/lib/libcomp_libgcc_eh.a
/usr/lib/gcc-lib/i386-redhat-linux/3.2.3/crtend.o
/usr/lib/gcc-lib/i386-redhat-linux/3.2.3/../../../crtn.o
/home/na50/goncalo/local/condor-6.6.10/lib/libcondorzsyscall.a(condor_file_agent.o)(.text+0x250):
In function `CondorFileAgent::open(char const*, int, int)':
/home/condor/execute/dir_16550/userdir/src/condor_ckpt/condor_file_agent.C:99:
the use of `tmpnam' is dangerous, better use `mkstemp'
There is no error message (only the tmpnam warning from the linker), so I
guess this is normal.
When I test the program interactively, it starts running with
the right messages:
[goncalo@lflip02 ~]$ ./ever
Condor: Notice: Will checkpoint to ./ever.ckpt
Condor: Notice: Remote system calls disabled.
Then, after logging in on another console, I run "kill -s USR2 <pid>".
The program stops with a segmentation fault and leaves an
ever.ckpt.tmp file behind.
[goncalo@lflip02 ~]$ ./ever
Condor: Notice: Will checkpoint to ./ever.ckpt
Condor: Notice: Remote system calls disabled.
Segmentation fault (core dumped)
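The signal step above can also be driven from a single shell, which makes the failure easier to reproduce and report. What follows is only a minimal sketch: the function name send_usr2_and_report and the two-second pauses are illustrative choices, not part of Condor, and treating an exit status above 128 as "killed by a signal" is the usual shell convention rather than anything Condor-specific.

```shell
# Sketch only: send_usr2_and_report and the 2-second pauses are
# illustrative choices, not part of Condor.
# Usage: send_usr2_and_report CKPT_FILE COMMAND [ARGS...]
send_usr2_and_report() {
    ckpt=$1
    shift
    "$@" &                        # start the program in the background
    pid=$!
    sleep 2                       # let it get past start-up
    kill -s USR2 "$pid"           # same signal as in the manual test
    sleep 2                       # let it react to the signal
    if kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running after USR2"
        kill "$pid"               # stop it so the function can return
    else
        echo "process $pid exited after USR2"
    fi
    wait "$pid"
    status=$?
    # By shell convention, a status above 128 usually means the process
    # was terminated by signal number (status - 128).
    if [ "$status" -gt 128 ]; then
        echo "terminated by SIG$(kill -l "$status")"
    fi
    # List whichever checkpoint files were left behind, if any.
    ls -l "$ckpt" "$ckpt.tmp" 2>/dev/null
    return 0
}
```

For the case described here it would be called as: send_usr2_and_report ./ever.ckpt ./ever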
Then, I try to restart the program using the ever.ckpt.tmp file but it is
immediately killed.
[goncalo@lflip02 ~]$ ./ever -_condor_restart ever.ckpt.tmp
Condor: Notice: Will restart from ever.ckpt.tmp
Killed
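The .tmp suffix suggests that the image was still being written when the process crashed, so it may simply be incomplete; that is an assumption, nothing in the output above confirms it. Under that assumption, a small guard like the sketch below would refuse to restart from anything but a finished ever.ckpt. The helper name pick_ckpt is hypothetical and not a Condor tool.

```shell
# Sketch only: pick_ckpt is a hypothetical helper, not a Condor tool.
# Prints the finished checkpoint image for BASE if one exists.
# Usage: pick_ckpt BASE      (for example: pick_ckpt ./ever)
pick_ckpt() {
    base=$1
    # A non-empty BASE.ckpt is taken to be a completed image.
    if [ -s "$base.ckpt" ]; then
        echo "$base.ckpt"
        return 0
    fi
    # Otherwise explain on stderr what was found and fail.
    if [ -e "$base.ckpt.tmp" ]; then
        echo "only $base.ckpt.tmp found; it may be a partial image" >&2
    else
        echo "no checkpoint file found for $base" >&2
    fi
    return 1
}

# Restart only when a finished image is available:
#   ckpt=$(pick_ckpt ./ever) && ./ever -_condor_restart "$ckpt"
```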
I guess this is not the expected behaviour. Maybe there is an obvious
reason for this that I'm overlooking.
Thanks in advance.
Goncalo