Dear All,

We’ve just upgraded our pool to Condor-6.8.0 on RHEL 4 and I’ve noticed some strange messages in schedd.log, namely:

9/8 10:09:05 (pid:10265) Attempting to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0', but it doesn't appear to exist.
9/8 10:09:05 (pid:10265) Error: Unable to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0' from 502 to 501.501
9/8 10:09:05 (pid:10265) (3.0) Failed to chown /opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0 from 502 to 501.501. User may run into permissions problems when fetching sandbox.

Does anyone have any thoughts on what these messages
are? I’ve just noticed them in this version; they weren’t
present in 6.6.11.

Thanks

The actual log is below.

[schedd.log on submit node]
9/8 10:07:15 (pid:10205) ******************************************************
9/8 10:07:15 (pid:10205) ** condor_schedd (CONDOR_SCHEDD) STARTING UP
9/8 10:07:15 (pid:10205) ** /opt/condor-6.8.0/sbin/condor_schedd
9/8 10:07:15 (pid:10205) ** $CondorVersion: 6.8.0 Jul 19 2006 $
9/8 10:07:15 (pid:10205) ** $CondorPlatform: I386-LINUX_RHEL3 $
9/8 10:07:15 (pid:10205) ** PID = 10205
9/8 10:07:15 (pid:10205) ** Log last touched 9/8 10:04:27
9/8 10:07:15 (pid:10205) ******************************************************
9/8 10:07:15 (pid:10205) Using config source: /mnt/condor_nfs/dell_optiplex_gx150_config/condor_config
9/8 10:07:15 (pid:10205) Using local config sources:
9/8 10:07:15 (pid:10205) /opt/condor-6.8.0/local.west153/condor_config.local
9/8 10:07:15 (pid:10205) DaemonCore: Command Socket at <xxx.xxx.xxx.xxx:56508>
9/8 10:07:15 (pid:10205) History file rotation is enabled.
9/8 10:07:15 (pid:10205) Maximum history file size is: 20971520 bytes
9/8 10:07:15 (pid:10205) Number of rotated history files is: 2
9/8 10:07:17 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx
9/8 10:07:17 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx
9/8 10:09:05 (pid:10205) DaemonCore: Command received via TCP from host <xxx.xxx.xxx.xxx:54356>
9/8 10:09:05 (pid:10205) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)
9/8 10:09:05 (pid:10265) Attempting to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0', but it doesn't appear to exist.
9/8 10:09:05 (pid:10265) Error: Unable to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0' from 502 to 501.501
9/8 10:09:05 (pid:10265) (3.0) Failed to chown /opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0 from 502 to 501.501. User may run into permissions problems when fetching sandbox.
9/8 10:09:11 (pid:10205) DaemonCore: Command received via UDP from host <xxx.xxx.xxx.xxx:33914>
9/8 10:09:11 (pid:10205) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
9/8 10:09:11 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx
9/8 10:09:11 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx
9/8 10:09:11 (pid:10205) Called reschedule_negotiator()
9/8 10:09:11 (pid:10205) failed to send RESCHEDULE command to negotiator
9/8 10:13:07 (pid:10205) DaemonCore: Command received via TCP from host <xxx.xxx.xxx.xxx:38060>
9/8 10:13:07 (pid:10205) DaemonCore: received command 416 (NEGOTIATE), calling handler (doNegotiate)
9/8 10:13:07 (pid:10205) Negotiating for owner: sjo@xxxxxxxxxx
9/8 10:13:07 (pid:10205) Checking consistency running and runnable jobs
9/8 10:13:07 (pid:10205) Tables are consistent
9/8 10:13:07 (pid:10205) Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0
9/8 10:13:07 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx
9/8 10:13:07 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx
9/8 10:13:09 (pid:10205) Starting add_shadow_birthdate(4.0)
9/8 10:13:09 (pid:10205) Started shadow for job 4.0 on "<xxx.xxx.xxx.xxx:55886>", (shadow pid = 10304)
9/8 10:13:09 (pid:10205) Shadow pid 10304 for job 4.0 exited with status 100
9/8 10:13:10 (pid:10205) match (<xxx.xxx.xxx.xxx:55886>#1157706435#1) out of jobs (cluster id 4); relinquishing
9/8 10:13:10 (pid:10205) Sent RELEASE_CLAIM to startd on <xxx.xxx.xxx.xxx:55886>
9/8 10:13:10 (pid:10205) Match record (<xxx.xxx.xxx.xxx:55886>, 4, -1) deleted
9/8 10:13:10 (pid:10205) DaemonCore: Command received via TCP from host <xxx.xxx.xxx.xxx:38513>
9/8 10:13:10 (pid:10205) DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
9/8 10:13:10 (pid:10205) Got VACATE_SERVICE from <xxx.xxx.xxx.xxx:38513>
9/8 10:13:12 (pid:10205) Sent owner (0 jobs) ad to 1 collectors
9/8 10:15:39 (pid:10205) DaemonCore: Command received via UDP from host <xxx.xxx.xxx.xxx:33932>
9/8 10:15:39 (pid:10205) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
9/8 10:15:39 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx
9/8 10:15:39 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx
9/8 10:15:39 (pid:10205) Called reschedule_negotiator()
9/8 10:15:39 (pid:10205) failed to send RESCHEDULE command to negotiator
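
In case it helps anyone check their own logs for the same thing, here is a rough Python sketch that pulls the chown messages out of a schedd.log and groups them by the pid that wrote them. It assumes each log entry sits on a single line in the "M/D HH:MM:SS (pid:N) message" form seen in this log; the function name `chown_messages_by_pid` and the sample lines are only illustrative and are not part of Condor.

```python
import re
from collections import defaultdict

# Assumed entry layout, taken from the log in this post:
#   "9/8 10:09:05 (pid:10265) <message>"
ENTRY = re.compile(r"^(\d+/\d+ \d+:\d+:\d+) \(pid:(\d+)\) (.*)$")


def chown_messages_by_pid(lines):
    """Return {pid: [(timestamp, message), ...]} for entries mentioning chown."""
    hits = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line.strip())
        if match and "chown" in match.group(3):
            timestamp, pid, message = match.groups()
            hits[int(pid)].append((timestamp, message))
    return dict(hits)


# Sample entries copied from the log in this post.
sample = [
    "9/8 10:09:05 (pid:10205) DaemonCore: received command 478 (ACT_ON_JOBS), "
    "calling handler (actOnJobs)",
    "9/8 10:09:05 (pid:10265) Attempting to chown "
    "'/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0', "
    "but it doesn't appear to exist.",
    "9/8 10:09:05 (pid:10265) Error: Unable to chown "
    "'/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0' "
    "from 502 to 501.501",
]

result = chown_messages_by_pid(sample)
for pid, entries in result.items():
    print(pid, len(entries))
```

On the sample entries, the chown messages all come from pid 10265 rather than the main schedd pid 10205, and they appear right after the ACT_ON_JOBS command is received.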