Re: [condor-users] 6.6.0 upgrade
- Date: Tue, 13 Jan 2004 11:24:43 -0500
- From: Robert Krzaczek <krz@xxxxxxxxxxx>
- Subject: Re: [condor-users] 6.6.0 upgrade
On Tuesday, Jan 13, 2004, at 11:14 America/New_York, Mike Smorul wrote:
> We've tried installing 6.6.0 on several RedHat 9 clients. Our central
> manager is running 6.4.7 for now. After running for a few hours, all the
> 6.6.0 hosts will disappear on the central manager even though all the
> daemons are running fine.
We see a very similar situation on a mix of SPARC Solaris systems under
Condor 6.6.0. In our case, the machines disappear one hour after we
bring the flock online. The central manager is a Solaris 8 MU6 system;
the rest of the flock is a mix of Solaris 8 and 9.
> The only messages about their disappearance is in the CollectorLog
I found other errors in our log files; for example, the central
manager's negotiator reported:

    ---------- Started Negotiation Cycle ----------
    Phase 1: Obtaining ads from collector ...
    Getting all public ads ...
    Couldn't fetch ads: communication error
    Aborting negotiation cycle
I plan on re-upgrading our flock to 6.6.0 (it's currently downgraded
back to 6.4.7) to check into this further. As I'm reviewing the logs I
captured from our brief 6.6.0 run, I'm seeing other messages that might
be red herrings, or might be further symptoms ("DC_AUTHENTICATE attempt
to open invalid session..."); I want to see if those errors happen
before or at the one hour failure point.
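One way to check the timing is to measure how long after startup each suspect message appears. What follows is only a rough sketch, not a tested tool: it assumes each log line starts with a "M/D HH:MM:SS" timestamp (no year), and the names parse_line and offsets_from_start are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# ASSUMPTION: log lines look like "M/D HH:MM:SS message" with no year.
# Adjust the pattern if the daemon logs use a different prefix.
STAMP = re.compile(r"^(\d{1,2})/(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) (.*)$")


def parse_line(line, year=2004):
    """Return (timestamp, message), or None if the line does not match."""
    m = STAMP.match(line.rstrip("\n"))
    if m is None:
        return None
    month, day, hour, minute, second = (int(g) for g in m.groups()[:5])
    try:
        stamp = datetime(year, month, day, hour, minute, second)
    except ValueError:
        # Digits matched the pattern but do not form a real date/time.
        return None
    return stamp, m.group(6)


def offsets_from_start(lines, start, needles):
    """Return (elapsed, message) for each line containing any needle.

    `start` is the datetime at which the flock was brought online;
    `elapsed` is a timedelta measured from that moment.
    """
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if any(needle in message for needle in needles):
            hits.append((stamp - start, message))
    return hits
```

Feeding it the lines of a captured log, the time the flock came online, and needles such as "Couldn't fetch ads" and "DC_AUTHENTICATE" would show whether those messages cluster before the one hour mark or only at it.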
\bob
--
\def\bob{Bob Krzaczek, RIT Center for Imaging Science, krz@xxxxxxxxxxx}