[Condor-users] standalone checkpointing segmentation fault
- Date: Wed, 10 Nov 2004 10:36:52 -0800 (PST)
- From: Jason Crane <Jason.Crane@xxxxxxxxxxxxx>
- Subject: [Condor-users] standalone checkpointing segmentation fault
Hi,
I'm trying to use Condor (6.6.7, dynamic) in "standalone"
mode to checkpoint jobs. I have a very simple test program
that I compiled with condor_compile on both Solaris and RH.
During a run I send the job "kill -TSTP pid". On Solaris
this works fine, producing a *.ckpt file that I can use to
restart the job with the "-_condor_restart" flag. On Linux,
however, I get a segmentation fault upon "kill -TSTP pid",
and only a core file and a *ckpt.tmp file are generated.
For what it's worth, if I open the core file in a debugger
it shows:
Program terminated with signal 11, Segmentation fault.
#0 0x0809ac28 in adler32 ()
(gdb) where
#0 0x0809ac28 in adler32 ()
#1 0x08096316 in fill_window ()
#2 0x08096101 in deflate_slow ()
#3 0x08095127 in deflate ()
#4 0x0804f717 in SegMap::Write (this=0x8188464, fd=3, pos=1024) at image.C:1446
#5 0x0804eef8 in Image::Write (this=0x81880a0, fd=3) at image.C:1097
#6 0x0804ebcf in Image::Write (this=0x81880a0, ckpt_file=0x99879c8 "./condor_test.ckpt") at image.C:1003
#7 0x0804ea4e in Image::Write (this=0x81880a0) at image.C:928
#8 0x0804fdef in Checkpoint (sig=20, code=0, scp=0x0) at image.C:1694
#9 <signal handler called>
#10 0x08048220 in main ()
Thanks in advance for any advice,
Jason
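
The trace above shows the checkpoint being written from inside a signal handler (frame #8, sig=20, which is SIGTSTP on x86 Linux) and failing in the compression step (deflate, adler32) before the write completes. As a rough illustration of that general pattern only, here is a minimal Python sketch of a signal-triggered checkpoint that compresses state into a temporary file and renames it once the write succeeds. It is not Condor's code: the names CKPT_PATH, state, write_checkpoint and read_checkpoint are hypothetical, and the write-then-rename step is an assumption that would be consistent with only a *ckpt.tmp file being left after the crash.

```python
import ast
import os
import signal
import tempfile
import zlib

# Hypothetical names for illustration; none of these come from Condor.
CKPT_PATH = os.path.join(tempfile.mkdtemp(), "demo.ckpt")
state = {"counter": 0}


def write_checkpoint(signum, frame):
    """Signal handler: compress the state, write it to *.tmp, then rename."""
    tmp_path = CKPT_PATH + ".tmp"
    payload = zlib.compress(repr(state).encode("utf-8"))
    with open(tmp_path, "wb") as f:
        f.write(payload)
    # A crash before this line would leave only the .tmp file behind.
    os.replace(tmp_path, CKPT_PATH)


def read_checkpoint(path):
    """Restore the state dictionary from a checkpoint file."""
    with open(path, "rb") as f:
        return ast.literal_eval(zlib.decompress(f.read()).decode("utf-8"))


# Install the handler; SIGTSTP is the signal that "kill -TSTP pid" delivers.
signal.signal(signal.SIGTSTP, write_checkpoint)

state["counter"] = 42
os.kill(os.getpid(), signal.SIGTSTP)  # same effect as kill -TSTP from a shell
restored = read_checkpoint(CKPT_PATH)
```

In this sketch a failure anywhere before os.replace leaves the temporary file in place and no final checkpoint, which mirrors the symptom reported on Linux. Unlike the standalone behaviour described in the message, the sketch keeps running after the checkpoint instead of exiting.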