
Re: [HTCondor-users] Mixed-version condor cluster of LTS 9 / 10 works?



Hi Chun-Yu,

I believe this failure is caused by a change in the default value of TRUST_DOMAIN in HTCondor V10_0_0: the default changed from $(COLLECTOR_HOST) to $(UID_DOMAIN). TRUST_DOMAIN must have the same value on every machine running HTCondor in a pool. If you have already issued IDTokens that you want to keep, set TRUST_DOMAIN to the value of the issuer field in the output of condor_token_list. Otherwise, if you have no issued IDTokens or want to start fresh, you can set TRUST_DOMAIN to whatever you like. If you choose to stick with default values, I recommend the v10 default of TRUST_DOMAIN=$(UID_DOMAIN).
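
As a rough sketch of the two options, the setting could look like the fragment below. The location of the file (for example a local file under your config.d directory) and the placeholder issuer value are assumptions for illustration, not taken from your setup. Use only one of the two lines, and use the same one on every machine in the pool, v9 and v10 alike.

```
# Option A (assumes no IDTokens need to be preserved):
# pin the v10 default explicitly so that v9 and v10 hosts agree.
TRUST_DOMAIN = $(UID_DOMAIN)

# Option B (assumes existing IDTokens should stay valid):
# replace the placeholder with the issuer value reported by condor_token_list.
# TRUST_DOMAIN = <issuer value from condor_token_list>
```

After the daemons have picked up the change (for example via condor_reconfig or a restart), running condor_config_val TRUST_DOMAIN on each host should print the same value everywhere.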

Cheers,
Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Chun-Yu Lin <1203036@xxxxxxxxxxxxxx>
Sent: Thursday, February 23, 2023 3:04 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Mixed-version condor cluster of LTS 9 / 10 works?
 
Dear all,
Are LTS major releases 9 and 10 interchangeable for each element in a Condor cluster?

I had a single, working configuration (with security:recommended_v9_0) for a pure v9 or v10 cluster.
Now, if the collector/negotiator/schedd is 9.0.16 and the startd is 10.0.1, or the other way around, the startd log gives:

02/23/23 16:13:16 Collector update failed; will try to get a token request for trust domain grid.nchc.org.tw, identity (default).
02/23/23 16:13:16 Failed to start non-blocking update to <10.200.6.19:9618>.
02/23/23 16:13:16 TOKEN: No token found.
02/23/23 16:13:16 AUTH_ERROR: Cannot resolve network address for KDC in requested realm
02/23/23 16:13:16 SECMAN: required authentication with collector 10.200.6.19 failed, so aborting command DC_START_TOKEN_REQUEST.
02/23/23 16:13:16 Failed to request a new token: DAEMON:1:failed to start command for token request with remote daemon at '<10.200.6.19:9618>'.|AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using SCITOKENS|AUTHENTICATE:1004:Failed to authenticate using KERBEROS|AUTHENTICATE:1004:Failed to authenticate using IDTOKENS|AUTHENTICATE:1004:Failed to authenticate using FS

Is there any extra config needed for a mixed-version Condor cluster?

Many thanks,
Chun-yu