On Tue, Aug 01, 2006 at 04:21:41PM +0100, Dr Ian C. Smith wrote:
Hi,
I've recently been getting error reports from our CondorView
server of the form:
"/opt/condor/sbin/condor_collector" on "ulgp1.liv.ac.uk" died due to
signal 11.
Condor will automatically restart this process in 10 seconds.
*** Last 20 line(s) of file CollectorLog:
8/1 15:58:53 Error while removing ad
8/1 15:58:53 **** Removing stale ad: "< ulgbc1.liv.ac.uk , 38.253.100.129 > "
8/1 15:58:53 Error while removing ad
8/1 15:58:53 **** Removing stale ad: "< ulgbc2.liv.ac.uk , 38.253.100.82 > "
8/1 15:58:53 Error while removing ad
These machines aren't "real" execute hosts but represent clusters which
can be reached using Condor-G. The ClassAds for them are generated by a
cron job from Globus MDS info and are updated every 5 minutes. Things
seemed OK until I added the StartdIpAddr attribute to the ClassAds in
response to another error. Could this have something to do with it?
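For readers unfamiliar with this kind of setup, here is a rough sketch of how a cron-driven script might build a machine ClassAd for a pseudo-host like the ones described above. The poster's actual script and attribute set are not shown in the thread, so everything here is an assumption: the function names (build_cluster_ad, quote) are hypothetical, the attribute list is a guess at a minimal ad, and the port is a placeholder. The one detail taken from the message is the StartdIpAddr attribute, written here in Condor's usual "<ip:port>" address form.

```python
import ipaddress


def quote(value: str) -> str:
    """Render a Python string as a quoted ClassAd string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_cluster_ad(name: str, ip: str, port: int) -> str:
    """Return ClassAd text for a pseudo-machine that stands in for a cluster.

    The attribute set is illustrative only; a real ad would carry whatever
    the site derives from its information system (Globus MDS in the post).
    """
    # Reject malformed addresses early instead of advertising a bad ad.
    ipaddress.ip_address(ip)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")

    attrs = {
        "MyType": quote("Machine"),
        "TargetType": quote("Job"),
        "Name": quote(name),
        "Machine": quote(name),
        # Address string in "<ip:port>" form; the port is a placeholder.
        "StartdIpAddr": quote(f"<{ip}:{port}>"),
    }
    return "\n".join(f"{key} = {val}" for key, val in attrs.items()) + "\n"


if __name__ == "__main__":
    # Host name and IP are taken from the log excerpt above.
    print(build_cluster_ad("ulgbc1.liv.ac.uk", "38.253.100.129", 9618), end="")
```

A file produced this way would typically be handed to the collector with condor_advertise; whether the poster does it that way is not stated in the message.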
Yes. Upgrade to 6.8.0, which fixes some bugs with the StartdIpAddr
handling in the collector.
-Erik
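
To check whether a given collector host already runs a release at or above the 6.8.0 mentioned above, something along these lines could be used. It is only a sketch: it assumes the version banner has the "$CondorVersion: X.Y.Z ... $" shape printed by condor_version, the helper names (parse_condor_version, has_collector_fix) are made up for this example, and the sample banner strings are illustrative rather than copied from a real installation.

```python
import re

# Release named in the reply as containing the StartdIpAddr fixes.
MINIMUM_FIXED = (6, 8, 0)


def parse_condor_version(banner: str) -> tuple:
    """Extract (major, minor, patch) from a '$CondorVersion: ...' banner."""
    match = re.search(r"\$CondorVersion:\s*(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no CondorVersion string found")
    return tuple(int(part) for part in match.groups())


def has_collector_fix(banner: str, minimum: tuple = MINIMUM_FIXED) -> bool:
    """True if the banner reports a version at or above `minimum`."""
    return parse_condor_version(banner) >= minimum


if __name__ == "__main__":
    # Illustrative banners, not taken from the thread.
    for sample in ("$CondorVersion: 6.7.20 Jun  2 2006 $",
                   "$CondorVersion: 6.8.0 Jul 19 2006 $"):
        print(sample, "->", has_collector_fix(sample))
```

Comparing the parsed tuples rather than the raw strings matters here, since a plain string comparison would rank "6.7.20" above "6.7.3".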
_______________________________________________
Condor-users mailing list