[Condor-users] Startd segment violation
- Date: Wed, 02 Feb 2005 17:14:45 +0000
- From: Mark Calleja <mcal00@xxxxxxxxxxxxx>
- Subject: [Condor-users] Startd segment violation
Hi,
A user's application keeps exiting with the following message in the
SchedLog on the submitting machine:
2/2 16:56:10 Shadow pid 12591 for job 2605.0 exited with status 4
2/2 16:56:10 ERROR: Shadow exited with job exception code!
However, the job then gets immediately resubmitted, leading to a
perpetual cycle. The StarterLog on the execute machine shows nothing
unusual, but the StartLog reports:
2/2 16:56:10 Starter pid 19086 died on signal 11 (signal 11).
That's a segmentation violation (SIGSEGV). My question is: is that Condor's
way of telling me that the user's application is segfaulting, or that the
Start daemon itself is? We see this behaviour on a number of Linux boxes, all
running dynamically linked versions of Condor 6.6.8 (we've seen it with 6.6.7
too), for glibc 2.2 and 2.3.
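
For anyone scanning their own logs for the same symptom, here is a minimal sketch that picks out the pid and signal from a StartLog line like the one quoted above and names the signal. The regular expression is inferred from that single line and is an assumption, not Condor's documented log format. The function name describe_starter_death is hypothetical.

```python
import re
import signal

# Assumed pattern, inferred from the one StartLog line quoted in this post.
DIED_RE = re.compile(r"Starter pid (\d+) died on signal (\d+)")

def describe_starter_death(line):
    """Return (pid, signal_number, signal_name), or None if the line does not match."""
    m = DIED_RE.search(line)
    if m is None:
        return None
    pid, signo = int(m.group(1)), int(m.group(2))
    try:
        # Map the numeric signal to its symbolic name on this platform.
        name = signal.Signals(signo).name
    except ValueError:
        name = "UNKNOWN"
    return pid, signo, name

line = "2/2 16:56:10 Starter pid 19086 died on signal 11 (signal 11)."
print(describe_starter_death(line))
```

This only names the signal and the process that the log reports. It does not by itself say whether the fault originated in the user's application or in Condor.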
Help please, chaps.
Cheers,
Mark