
[HTCondor-users] Credd discovery in 24.0



Hi,

 

We have been playing with HTCondor 24.0.2 Central Managers a little bit and have found a few issues. I am going to send three separate emails for them.

 

The first issue is that submissions from 23.0 clients do not work with 24.0 CMs for us. The reason is that the client cannot properly discover the credd daemon address, which is necessary in our case to handle the user Kerberos credentials. Clients running v23.10 or v24.0 do work properly, so if this is just to be expected, we will have to plan on updating clients before CMs.

 

By increasing debugging, we have observed that the query for the CredD address is the same in every case, e.g.:

 

12/13/24 09:51:20 (fd:4) (pid:532356) (D_HOSTNAME) Querying collector <137.138.149.92:9618?alias=sleepybird01.cern.ch> (sleepybird01.cern.ch) with classad:

LimitResults = 1

LocationQuery = "babybird01.cern.ch"

MyType = "Query"

Projection = "CondorVersion CondorPlatform MyAddress AddressV1 Name Machine _condor_PrivRemoteAdminCapability"

Requirements = ((Name == "babybird01.cern.ch"))

TargetType = "CredD"

12/13/24 09:51:20 (fd:4) (pid:532356) (D_HOSTNAME)  --- End of Query ClassAd ---

 

However, the clients differ in the address they determine from the response.

 

  • 23.10/24.0 clients:

 

12/13/24 09:51:20 (fd:4) (pid:532356) (D_HOSTNAME) Daemon client (credd) address determined: name: "babybird01.cern.ch", pool: "", alias: "babybird01.cern.ch", addr: "<188.185.87.2:9618?addrs=188.185.87.2-9618+[2001-1458-d00-3f--100-54d]-9618&alias=babybird01.cern.ch&noUDP&sock=credd_2325683_d086>"

 

  • 23.0 clients:

 

12/13/24 09:46:01 (fd:5) (pid:3737234) (D_HOSTNAME) Daemon client (credd) address determined: name: "babybird01.cern.ch", pool: "NULL", alias: "babybird01.cern.ch", addr: "<188.185.87.2:9618?addrs=188.185.87.2-9618+[2001-1458-d00-3f--100-54d]-9618&alias=babybird01.cern.ch&noUDP&sock=schedd_2717527_10f7>"

 

Notice that in the latter case, the 'sock=schedd_2717527_10f7' bit seems wrong (should be 'sock=credd_...'), and this leads to:

 

12/13/24 09:46:01 (fd:5) (pid:3737234) (D_ALWAYS) STORE_CRED: Failed to start STORE_CRED command. Unable to contact credd babybird01.cern.ch
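
To make the difference between the two addresses easier to spot, here is a minimal Python sketch that splits the address strings from the logs above into their parameters and reports the prefix of the 'sock=' value. The helper names (sinful_params, sock_daemon) are made up for illustration and are not part of any HTCondor API. The sketch assumes that the text before the first underscore in the socket name identifies the daemon, and it does not decode any escaped characters.

```python
def sinful_params(sinful):
    """Split an address of the form <host:port?k=v&flag> into a dict of parameters.

    Hypothetical helper for illustration only; values are kept verbatim
    (no percent-decoding), and bare flags such as 'noUDP' map to ''.
    """
    body = sinful.strip().strip("<>")
    _, _, query = body.partition("?")
    params = {}
    for item in query.split("&"):
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key] = value
    return params


def sock_daemon(sinful):
    """Return the text before the first '_' in the 'sock' parameter.

    Assumption: this prefix names the daemon that owns the socket.
    """
    sock = sinful_params(sinful).get("sock", "")
    return sock.split("_", 1)[0]


# Addresses copied from the two log lines above.
ADDR_NEW_CLIENT = ("<188.185.87.2:9618?addrs=188.185.87.2-9618+"
                   "[2001-1458-d00-3f--100-54d]-9618&alias=babybird01.cern.ch"
                   "&noUDP&sock=credd_2325683_d086>")
ADDR_OLD_CLIENT = ("<188.185.87.2:9618?addrs=188.185.87.2-9618+"
                   "[2001-1458-d00-3f--100-54d]-9618&alias=babybird01.cern.ch"
                   "&noUDP&sock=schedd_2717527_10f7>")

for label, addr in (("23.10/24.0", ADDR_NEW_CLIENT), ("23.0", ADDR_OLD_CLIENT)):
    print(label, "client resolved a socket owned by:", sock_daemon(addr))
```

Apart from the 'sock=' value, the two addresses are identical, which suggests that the 23.0 client picks up the schedd ad where the credd ad was expected.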

 

I have also noticed that if I query a CM using condor_status, v23.0 CMs will provide the credd classad when asked with either '-subsystem credd' or '-subsystem CredD', but v24.0 CMs will only provide the classad if '-subsystem CredD' is used. This happens with both 23.0 and 24.0 clients. It may or may not be related to the issue described above.

 

So, is this to be expected, and should we simply plan around it?

 

Thank you.

 

Cheers,

   Antonio