Mailing List Archives
[Condor-users] Why was my job evicted?
- Date: Tue, 20 Apr 2010 15:56:14 -0400
- From: Adam Smola <adam.smola@xxxxxxxxx>
- Subject: [Condor-users] Why was my job evicted?
Hello All,
I was running a job that was evicted. In the past I've always been able
to find out what happened, but this time I can't.
Execute Node:
StartLog:
4/20 15:07:41 Got SIGTERM. Performing graceful shutdown.
4/20 15:07:41 shutdown graceful
4/20 15:07:41 Changing activity: Busy -> Retiring
4/20 15:07:41 State change: claim retirement ended/expired
4/20 15:07:41 Changing state and activity: Claimed/Retiring -> Preempting/Vacating
4/20 15:07:49 Got KILL_FRGN_JOB while in Preempting state, ignoring.
4/20 15:07:49 Got RELEASE_CLAIM while in Preempting state, ignoring.
4/20 15:07:49 Starter pid 1508 exited with status 0
StarterLog:
4/20 15:07:41 Got SIGTERM. Performing graceful shutdown.
4/20 15:07:41 ShutdownGraceful all jobs.
4/20 15:07:41 Process exited, pid=1300, status=-1073741510
4/20 15:07:49 Last process exited, now Starter is exiting
MasterLog:
4/20 15:07:41 Got SIGTERM. Performing graceful shutdown.
4/20 15:07:41 ShutdownGraceful all jobs.
4/20 15:07:41 Process exited, pid=1300, status=-1073741510
4/20 15:07:49 Last process exited, now Starter is exiting
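One detail in the logs above can be decoded without any Condor knowledge: the negative job exit status. A minimal sketch, assuming the execute node runs Windows (the post does not say so; the helper name `to_unsigned32` is made up for illustration). It reinterprets the signed 32-bit value as unsigned so it can be compared against Windows status codes:

```python
def to_unsigned32(status: int) -> int:
    """Reinterpret a signed 32-bit exit status as an unsigned value."""
    return status & 0xFFFFFFFF


# Value copied from the "Process exited, pid=1300" log line above.
status = -1073741510

# Prints 0xc000013a
print(hex(to_unsigned32(status)))
```

On Windows, 0xC000013A is STATUS_CONTROL_C_EXIT, meaning the process ended because of a Ctrl+C-style termination request. If the Windows assumption holds, that would be consistent with the job being stopped by the graceful shutdown rather than crashing on its own.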
So far it seems like the request came from outside. On the
Schedd/Shadow end of things:
ShadowLog:
4/20 15:07:44 (17040.0) (1556): Job 17040.0 is being evicted from ha2003x86Exec
4/20 15:07:44 (17040.0) (1556): Job 17040.0 is being evicted from ha2003x86Exec
4/20 15:07:44 (17040.0) (1556): **** condor_shadow (condor_SHADOW) pid 1556 EXITING WITH STATUS 107
4/20 15:07:57 Initializing a VANILLA shadow for job 17040.0
4/20 15:07:57 (17040.0) (3988): init_user_ids: failed because user switching is disabled
ScheddLog:
4/20 15:07:44 (pid:2572) Shadow pid 1556 for job 17040.0 exited with status 107
4/20 15:07:44 (pid:2572) Match record (ha2003x86Exec <10.127.250.34:1045> for adam_smola@hydra, 17040.0) deleted
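To compare the execute-node and submit-side logs, it can help to merge the entries into a single timeline. A rough sketch using a few lines quoted from the logs above; the function name `parse_entry` is illustrative, and the year 2010 is taken from the date of this post because the log lines themselves carry no year. Clocks on the two machines may not be in sync, so the ordering across hosts is only approximate:

```python
from datetime import datetime


def parse_entry(source: str, line: str):
    """Split 'M/D HH:MM:SS message' into (timestamp, source, message)."""
    date, time, message = line.split(" ", 2)
    # The logs omit the year; 2010 is assumed from the post date.
    stamp = datetime.strptime(f"2010/{date} {time}", "%Y/%m/%d %H:%M:%S")
    return stamp, source, message


# A subset of the log lines quoted in this post, tagged by log file.
entries = [
    ("StartLog", "4/20 15:07:41 Got SIGTERM. Performing graceful shutdown."),
    ("StartLog", "4/20 15:07:49 Starter pid 1508 exited with status 0"),
    ("StarterLog", "4/20 15:07:41 Process exited, pid=1300, status=-1073741510"),
    ("ShadowLog", "4/20 15:07:44 (17040.0) (1556): Job 17040.0 is being evicted from ha2003x86Exec"),
    ("ScheddLog", "4/20 15:07:44 (pid:2572) Shadow pid 1556 for job 17040.0 exited with status 107"),
]

# sorted() is stable, so entries with equal timestamps keep their input order.
timeline = sorted((parse_entry(s, l) for s, l in entries), key=lambda e: e[0])

for stamp, source, message in timeline:
    print(f"{stamp:%H:%M:%S} {source:<10} {message}")
```

Read this way, the SIGTERM on the execute node (15:07:41) comes before the shadow reports the eviction (15:07:44), which matches the impression that the shutdown started on the execute side.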
Any idea as to what happened?
-Adam