[Condor-users] Standalone checkpointing problem
- Date: Fri, 03 Jun 2005 11:49:33 +0100
- From: Alan.Carriou@xxxxxx (Alan Carriou)
- Subject: [Condor-users] Standalone checkpointing problem
Hello all,
I am new to Condor and am trying to use the Condor standalone
checkpointing library (to later integrate it into our Grid Engine
cluster). I have run into a problem whose solution I could not find in
the documentation or in the mailing-list archives.
After successfully compiling a simple example program, "ever", I launch
it, send it a USR2 signal, and terminate it, all without problems. But
when I restart it from the checkpoint file, the name shown by "ps" is
" i686 ./ever", which seems weird.
[acarrio@localhost ~] $ ./ever &
[1] 11254
Condor: Notice: Will checkpoint to ./ever.ckpt
Condor: Notice: Remote system calls disabled.
[acarrio@localhost ~] $ killall -s USR2 ever
[acarrio@localhost ~] $ killall ever
[acarrio@localhost ~] $ ./ever -_condor_restart ever.ckpt &
[2] 11257
[1] Terminated ./ever
Condor: Notice: Will restart from ever.ckpt
[acarrio@localhost ~] $ ps u | grep ever | grep -v grep
acarrio 11257 99.9 0.4 2028 1084 pts/3 R 11:30 0:05
i686 ./ever
I am using Fedora Core 3 with the Condor 6.6.9 RPM and gcc 3.4.2 as
provided by FC3. The CPU, in case it is relevant, is an AMD Sempron.
Also, while running other tests, I have seen similarly strange process
names for other programs, and some checkpoint file names contained
strange characters.
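In case it helps with diagnosis, the raw argument vector of the restarted
process can be dumped byte by byte, which should make any non-printing
characters in the name visible. This is only a minimal sketch: it assumes
a Linux /proc filesystem, and show_cmdline is a hypothetical helper name,
not part of Condor.

```shell
# Hypothetical helper (not part of Condor): dump the raw argument vector
# of a running process. Assumes Linux, where /proc/PID/cmdline holds the
# arguments separated by NUL bytes.
show_cmdline() {
    pid="$1"
    # od -c prints NUL separators as \0 and other non-printing bytes
    # as escapes or octal codes, so stray characters become visible.
    od -c "/proc/$pid/cmdline"
}
```

For the restarted process above this would be called as
"show_cmdline 11257"; comparing the result with the output of
"ps -o pid,comm,args -p 11257" shows whether the odd prefix is really
part of the argument vector.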
Is this a configuration issue (I haven't changed the Condor configuration
file), or something else?
Thanks in advance for any ideas or explanations (or a link to a part of
the documentation I have not found),
Alan