I am attempting to render a Maya scene file. I have 3 physical computers and 12 virtual machines in my pool. The physical attributes of the machines are identical. I'm using condor_render.exe to generate and submit the jobs to Condor. If I render more than 4 frames, some of the rendered images do not show up. For example, if I render 45 frames, only about 20 images show up. I have narrowed the problem down to the jobs rendered on the slave computers: these jobs return a value of 203, as shown in the log files below, while jobs rendered on the master return a value of 0.
According to the starter log on the slave machine (included below), everything appears the same as on the master starter log until we get down to the line, fourth from the bottom:
Can anyone tell me what return value 203 means? What steps should I take to correct the problem?
Any help is greatly appreciated. Thank you.
...
005 (002.003.000) 05/07 05:27:16 Job terminated.
(1) Normal termination (return value 203)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
33 - Run Bytes Sent By Job
63428 - Run Bytes Received By Job
33 - Total Bytes Sent By Job
63428 - Total Bytes Received By Job
...
005 (002.000.000) 05/07 05:27:17 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:01, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:01, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
20594 - Run Bytes Sent By Job
63428 - Run Bytes Received By Job
20594 - Total Bytes Sent By Job
63428 - Total Bytes Received By Job
...
5/7 05:27:02 DaemonCore: Command received via UDP from host <10.100.4.8:4217>
5/7 05:27:02 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
5/7 05:27:02 Sent ad to central manager for Anim@xxxxxxxxxx
5/7 05:27:02 Called reschedule_negotiator()
5/7 05:27:02 Activity on stashed negotiator socket
5/7 05:27:02 Negotiating for owner: Anim@xxxxxxxxxx
5/7 05:27:02 Checking consistency running and runnable jobs
5/7 05:27:02 Tables are consistent
5/7 05:27:04 Out of jobs - 5 jobs matched, 0 jobs idle, flock level = 0
5/7 05:27:07 Started shadow for job 2.0 on "<10.100.4.6:4472>", (shadow pid = 2628)
5/7 05:27:07 Sent ad to central manager for Anim@xxxxxxxxxx
5/7 05:27:09 Started shadow for job 2.1 on "<10.100.4.6:4472>", (shadow pid = 2208)
5/7 05:27:11 Started shadow for job 2.2 on "<10.100.4.6:4472>", (shadow pid = 2920)
5/7 05:27:13 Started shadow for job 2.3 on "<10.100.4.8:2484>", (shadow pid = 3816)
5/7 05:27:15 Started shadow for job 2.4 on "<10.100.4.6:4472>", (shadow pid = 4076)
5/7 05:27:15 Sent ad to central manager for Anim@xxxxxxxxxx
5/7 05:27:16 DaemonCore: Command received via UDP from host <10.100.4.8:4264>
5/7 05:27:16 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())
5/7 05:27:16 Shadow pid 3816 for job 2.3 exited with status 100
5/7 05:27:16 match (<10.100.4.8:2484>#2627232347) out of jobs (cluster id 2); relinquishing
5/7 05:27:16 Sent RELEASE_CLAIM to startd on <10.100.4.8:2484>
5/7 05:27:16 Match record (<10.100.4.8:2484>, 2, -1) deleted
5/7 05:27:16 DaemonCore: Command received via TCP from host <10.100.4.8:4267>
5/7 05:27:16 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
5/7 05:27:16 Got VACATE_SERVICE from <10.100.4.8:4267>
5/7 05:27:17 DaemonCore: Command received via UDP from host <10.100.4.8:4274>
5/7 05:27:17 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())
5/7 05:27:17 Shadow pid 2628 for job 2.0 exited with status 100
5/7 05:27:17 match (<10.100.4.6:4472>#2785286360) out of jobs (cluster id 2); relinquishing
5/7 05:27:17 Sent RELEASE_CLAIM to startd on <10.100.4.6:4472>
5/7 05:27:17 Match record (<10.100.4.6:4472>, 2, -1) deleted
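
For anyone narrowing this down the same way, here is a minimal sketch in Python that lists which jobs in a user log terminated with a nonzero return value. It assumes the log keeps the layout shown in the excerpts above (a "005 (cluster.proc.subproc) ... Job terminated." header followed by a "Normal termination (return value N)" line). The function names are hypothetical and are not part of Condor, and the sample text is copied from the two events quoted in this post.

```python
import re

# Event header, e.g. "005 (002.003.000) 05/07 05:27:16 Job terminated."
HEADER_RE = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.\d+\)")
# Result line, e.g. "(1) Normal termination (return value 203)"
RETURN_RE = re.compile(r"Normal termination \(return value (\d+)\)")


def return_values(log_text):
    """Map 'cluster.proc' to the return value of each normally terminated job."""
    results = {}
    current = None  # job id of the 005 event being read, if any
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            code, cluster, proc = header.groups()
            # Only event 005 (job terminated) carries a return value.
            current = f"{int(cluster)}.{int(proc)}" if code == "005" else None
            continue
        result = RETURN_RE.search(line)
        if result and current is not None:
            results[current] = int(result.group(1))
            current = None
    return results


def failed_jobs(log_text):
    """Keep only the jobs whose return value is nonzero."""
    return {job: rv for job, rv in return_values(log_text).items() if rv != 0}


# Sample taken from the two termination events quoted in this post.
SAMPLE = """\
005 (002.003.000) 05/07 05:27:16 Job terminated.
(1) Normal termination (return value 203)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
...
005 (002.000.000) 05/07 05:27:17 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:01, Sys 0 00:00:00 - Run Remote Usage
"""

if __name__ == "__main__":
    for job, rv in sorted(failed_jobs(SAMPLE).items()):
        print(f"job {job} returned {rv}")
```

Matching the failing job ids against the "Started shadow for job ..." lines in the schedd log shows which host each one ran on; in the excerpt above, job 2.3 ran on 10.100.4.8 and job 2.0 on 10.100.4.6.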