This is for the RH9 "** $CondorVersion: 6.6.11 Mar 23 2006 $" build running on Fedora Core 4. Interestingly, we weren't seeing this with the RH7 "** $CondorVersion: 6.6.9 Mar 10 2005 $" build running on Fedora Core 4.

From the MasterLog:

11/22 14:38:32 Started DaemonCore process "/usr/local/condor/sbin/condor_collector", pid and pgroup = 4480
11/22 14:38:32 enter Daemons::UpdateCollector
11/22 14:38:32 Attempting to send update via UDP to collector XXX.corefa.com <XXX.XXX.XXX.XXX:9618>
11/22 14:38:32 Can't connect to <XXX.XXX.XXX.XXX:9618>:0, errno = 111
11/22 14:38:32 Will keep trying for 10 seconds...
11/22 14:38:42 Connect failed for 10 seconds; returning FALSE
11/22 14:38:42 ERROR: SECMAN:2003:TCP connection to <XXX.XXX.XXX.XXX:9618> failed
11/22 14:38:42 Can't send UPDATE_MASTER_AD to collector XXX.corefa.com <XXX.XXX.XXX.XXX:9618>: Failed to send UDP update command to collector
11/22 14:38:42 DaemonCore: No more children processes to reap.
11/22 14:38:42 start recover timer (63)
11/22 14:38:42 Started DaemonCore process "/usr/local/condor/sbin/condor_schedd", pid and pgroup = 4485
11/22 14:38:42 enter Daemons::UpdateCollector
11/22 14:38:42 Attempting to send update via UDP to collector XXX.corefa.com <XXX.XXX.XXX.XXX:9618>
11/22 14:38:42 Can't connect to <XXX.XXX.XXX.XXX:9618>:0, errno = 111
11/22 14:38:42 Will keep trying for 10 seconds...
11/22 14:38:52 Connect failed for 10 seconds; returning FALSE
11/22 14:38:52 ERROR: SECMAN:2003:TCP connection to <XXX.XXX.XXX.XXX:9618> failed
11/22 14:38:52 Can't send UPDATE_MASTER_AD to collector XXX.corefa.com <XXX.XXX.XXX.XXX:9618>: Failed to send UDP update command to collector
11/22 14:38:52 The COLLECTOR (pid 4480) exited with status 127
11/22 14:38:52 ProcAPI::buildFamily failed: parent 4480 not found on system.
11/22 14:38:52 restarting /usr/local/condor/sbin/condor_collector in 265 seconds

This looks to be similar to:
https://www-auth.cs.wisc.edu/lists/condor-users/2006-May/msg00289.shtml

Any ideas on how I can diagnose this further?

Some other observations:

* I can launch condor_collector from the command-line, and THEN it writes to the log. But the process doesn't actually do any collecting.
* If I hammer 'ps' to see when Condor is launching condor_collector, the launched process is owned by 'root', not 'condor'.
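
For what it's worth, errno 111 on Linux is ECONNREFUSED, i.e. nothing was accepting connections on the collector port when the master tried to connect, which fits with the collector exiting right after launch. Below is a minimal sketch, assuming Python is available on the machine, for checking the port independently of Condor. The helper name probe_tcp is hypothetical and not part of Condor; the host and port to pass in are whatever the collector is configured to use (9618 in the log above).

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt a plain TCP connect and report the outcome.

    Returns "OK" if something accepted the connection, "TIMEOUT" if
    the attempt timed out, or the symbolic errno name (for example
    "ECONNREFUSED") when the connect call failed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "OK"
    except socket.timeout:
        return "TIMEOUT"
    except OSError as exc:
        # Fall back to the message when there is no known errno name
        # (e.g. name-resolution failures).
        return errno.errorcode.get(exc.errno, str(exc))
```

Calling probe_tcp("127.0.0.1", 9618) on the collector host while condor_collector is supposedly running would show whether anything is listening there; "ECONNREFUSED" would match the errno in the MasterLog.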