[Condor-users] Condor jobs terminated by SIGINT
- Date: Wed, 12 Jan 2005 18:29:55 -0600
- From: "David A. Kotz" <dkotz@xxxxxxxxxxxxx>
- Subject: [Condor-users] Condor jobs terminated by SIGINT
I've received the following complaint from some of my Condor users:
Over the past couple of weeks, a number of Condor jobs have been
terminated with a SIGINT (the interrupt signal, as sent by a keyboard ^C).
The terminations are logged in the program output as follows:
parser_rids_trips_December_23/PROG_OUT.txt
Simulator interrupted with PC = 0x182480 <put_match_list$2>
swim_rids_trips_December_23/PROG_OUT.txt
Simulator interrupted with PC = 0x283500 <raise$2>
parser_rids_trips_December_29/PROG_OUT.txt
Reading the dictionary files: **Simulator interrupted with PC = 0x4dc900 <rabridged_lookup$4>
gzip_rids_trips_January_03/PROG_OUT.txt
Simulator interrupted with PC = 0x172580 <compress_block$57>
mcf_rids_trips_January_03/PROG_OUT.txt
Simulator interrupted with PC = 0x115380 <memset$6>
parser_rids_trips_January_03/PROG_OUT.txt
Reading the dictionary files: ***Simulator interrupted with PC = 0x571a00 <fgetc_unlocked$3>
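
To see how widespread the interruptions are, the output files can be
scanned for this message. What follows is only a minimal sketch, not part
of the original report: it assumes each run directory holds a
PROG_OUT.txt (as in the listing above), and the function names and the
glob pattern are hypothetical. It joins wrapped lines before matching
because the message is sometimes split across lines.

```python
import glob
import re
from collections import Counter

# Matches e.g. "Simulator interrupted with PC = 0x182480 <put_match_list$2>"
INTERRUPT_RE = re.compile(
    r"Simulator interrupted with PC\s*=\s*(0x[0-9a-fA-F]+)\s*<([^>]+)>"
)


def find_interrupts(text):
    """Return a list of (pc, symbol) pairs for each interrupt message."""
    # The message may be wrapped across lines, so remove newlines first.
    flat = text.replace("\n", "")
    return [(int(pc, 16), sym) for pc, sym in INTERRUPT_RE.findall(flat)]


def summarize(paths):
    """Print every interrupt found and return a per-file count."""
    counts = Counter()
    for path in paths:
        with open(path, errors="replace") as fh:
            for pc, sym in find_interrupts(fh.read()):
                counts[path] += 1
                print(f"{path}: PC=0x{pc:x} in {sym}")
    return counts


if __name__ == "__main__":
    # Assumed layout: <run directory>/PROG_OUT.txt under the current directory.
    summarize(sorted(glob.glob("*/PROG_OUT.txt")))
```

Comparing the modification times of the matching files against the pool's
daemon logs may then help narrow down what sent the signal.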
I can't explain this behavior because:
- The Condor jobs were all compiled with the Condor libraries.
- Most or all of them were terminated while I was out of the office.
- I can't reproduce this behavior by removing jobs with condor_rm, so an
outside agent seems to be sending the signal.
Does anyone have a suggestion as to why these processes would have been
interrupted? They were running on dedicated compute nodes in a cluster,
and the users submitting the jobs have the highest priority available in
my RANKing scheme.
--
David A. Kotz <dkotz@xxxxxxxxxxxxx>