Hi all,

I've a 2nd submit machine with

  CONDOR_HOST = 1st-submit-machine
  DAEMON_LIST = MASTER, SCHEDD

Not sure when this started, but running condor_q on it now takes 20+ seconds and returns

  -- Failed to fetch ads from: <my.ip:31286> : 2nd-submit-machine
  SECMAN:2007:Failed to end classad message.

SchedLog is full of

  06/18/13 15:52:59 (pid:XXXXX) Send_Signal: Warning: could not send signal 71005 (UPDATE_JOBAD) to pid YYYYY (still alive)

(the YYYYY pids change); the rest of the logs look OK.

The first submit machine is CentOS 5 w/ the OSG RPM of condor-7.8.8 (it sends jobs to OSG). The 2nd one is CentOS 6 w/ the condor-8 RPM from the condor repo.

Any hints on what the error's about and how to fix it?

TIA
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu