Hi all:
I am working with Condor 6.7.19. My problem is that a job is always kicked off
the remote node it is executing on, so it can only run on the node from which
it was submitted. In the StarterLog of the remote node I can see the message
"EXEC of user process failed, probably insufficient swap". Can anybody help
with this problem?
Thank you in advance for your help!
Best wishes!
Here is part of the StarterLog from the remote node:
10/19 12:21:56 ********** STARTER starting up ***********
10/19 12:21:56 ** $CondorVersion: 6.7.19 May 10 2006 $
10/19 12:21:56 ** $CondorPlatform: I386-LINUX_RH9 $
10/19 12:21:56 ******************************************
.................
10/19 16:00:22 Started user job - PID = 21251
10/19 16:00:22 cmd_fp = 0x8382588
10/19 16:00:22 end
10/19 16:00:22 *FSM* Transitioning to state "SUPERVISE"
10/19 16:00:22 *FSM* Executing state func "supervise_all()" [ GET_NEW_PROC SUSPEND VACATE ALARM DIE CHILD_EXIT PERIODIC_CKPT ]
10/19 16:00:22 *FSM* Got asynchronous event "CHILD_EXIT"
10/19 16:00:22 *FSM* Executing transition function "reaper"
10/19 16:00:22 Process 21251 exited with status 110
10/19 16:00:22 EXEC of user process failed, probably insufficient swap
10/19 16:00:22 *FSM* Transitioning to state "PROC_EXIT"
10/19 16:00:22 *FSM* Executing state func "proc_exit()" [ DIE ]
Yufang Zhang
2006-10-19