Re: [HTCondor-users] Authentication error after upgrade to 10.0



Adding LEGACY_ALLOW_SEMANTICS = TRUE doesn't solve the problem.
Trying to drain a node using:
cm ~]# condor_drain -graceful tech-wn001
Attempt to send DRAIN_JOBS to startd <192.114.101.1:9618?addrs=192.114.101.1-9618&alias=tech-wn001.hep.technion.ac.il&noUDP&sock=startd_2694_703a> failed
Failed to start DRAIN_JOBS command to slot1@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The worker node does not seem to try PASSWORD at all; its log shows only GSI, KERBEROS and FS being attempted:
2/04/22 10:34:41 DC_AUTHENTICATE: required authentication of CM_IP failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using GSI|GSI:5003:Failed to authenticate. Globus is reporting error (851968:254). There is probably a problem with your credentials. (Did you run grid-proxy-init?)|AUTHENTICATE:1004:Failed to authenticate using KERBEROS|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to lstat(/tmp/FS_XXXZpyt30)
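
To make it easier to see which methods were actually attempted, the error stack in that log line can be split on "|". This is only a rough Python sketch, assuming the message format shown above; the function name attempted_methods is made up for illustration and is not part of HTCondor.

```python
import re

# Error stack copied from the log line above, with the wrapped pieces rejoined.
ERROR_STACK = (
    "AUTHENTICATE:1003:Failed to authenticate with any method|"
    "AUTHENTICATE:1004:Failed to authenticate using GSI|"
    "GSI:5003:Failed to authenticate. Globus is reporting error (851968:254). "
    "There is probably a problem with your credentials. "
    "(Did you run grid-proxy-init?)|"
    "AUTHENTICATE:1004:Failed to authenticate using KERBEROS|"
    "AUTHENTICATE:1004:Failed to authenticate using FS|"
    "FS:1004:Unable to lstat(/tmp/FS_XXXZpyt30)"
)


def attempted_methods(error_stack):
    """Return the methods named in 'Failed to authenticate using X' entries.

    Hypothetical helper: it only assumes that entries are separated by '|'
    and that per-method failures look like the ones in the log line above.
    """
    pattern = re.compile(r"^AUTHENTICATE:\d+:Failed to authenticate using (\w+)$")
    methods = []
    for entry in error_stack.split("|"):
        match = pattern.match(entry.strip())
        if match:
            methods.append(match.group(1))
    return methods


methods = attempted_methods(ERROR_STACK)
print("attempted:", methods)
print("PASSWORD attempted:", "PASSWORD" in methods)
```

For this log line the list contains GSI, KERBEROS and FS, and PASSWORD is absent, even though SEC_DAEMON_AUTHENTICATION_METHODS is set to PASSWORD on both hosts.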

Looking at the DAEMON knobs on both the CM and the startd:
cm ~]# sudo grep -R DAEMON /etc/condor/*
/etc/condor/config.d/50-security:SEC_DAEMON_AUTHENTICATION = REQUIRED
/etc/condor/config.d/50-security:SEC_DAEMON_INTEGRITY = REQUIRED
/etc/condor/config.d/50-security:SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
/etc/condor/config.d/50-security:ALLOW_DAEMON = condor_pool@*/*, condor@*/$(IP_ADDRESS)

cm ~]# condor_config_val -dump | grep DAEMON
ALLOW_DAEMON = condor_pool@*/*, condor@*/$(IP_ADDRESS)
AUTO_INCLUDE_CREDD_IN_DAEMON_LIST = false
AUTO_INCLUDE_SHARED_PORT_IN_DAEMON_LIST = true
DAEMON_LIST = MASTER COLLECTOR NEGOTIATOR
DAEMON_SOCKET_DIR = auto
DC_DAEMON_LIST =
GSI_DAEMON_CERT =
GSI_DAEMON_DIRECTORY =
GSI_DAEMON_KEY =
GSI_DAEMON_NAME =
GSI_DAEMON_PROXY =
GSI_DAEMON_TRUSTED_CA_DIR =
MASTER_DAEMON_AD_FILE =
SCHEDD_DAEMON_AD_FILE = $(SPOOL)/.schedd_classad
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_DAEMON_INTEGRITY = REQUIRED
SHARED_PORT_DAEMON_AD_FILE = $(LOCK)/shared_port_ad
START_DAEMONS =


wn001:~$ sudo grep -R DAEMON /etc/condor/*
/etc/condor/config.d/50-security:SEC_DAEMON_AUTHENTICATION = REQUIRED
/etc/condor/config.d/50-security:SEC_DAEMON_INTEGRITY = REQUIRED
/etc/condor/config.d/50-security:SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
/etc/condor/config.d/50-security:ALLOW_DAEMON = condor_pool@*/*, condor@*/$(IP_ADDRESS)

wn001:~$ condor_config_val -dump | grep DAEMON
ALLOW_DAEMON = condor_pool@*/*, condor@*/$(IP_ADDRESS)
AUTO_INCLUDE_CREDD_IN_DAEMON_LIST = false
AUTO_INCLUDE_SHARED_PORT_IN_DAEMON_LIST = true
DAEMON_LIST = MASTER, STARTD
DAEMON_SOCKET_DIR = auto
DC_DAEMON_LIST =
GSI_DAEMON_CERT =
GSI_DAEMON_DIRECTORY =
GSI_DAEMON_KEY =
GSI_DAEMON_NAME =
GSI_DAEMON_PROXY =
GSI_DAEMON_TRUSTED_CA_DIR =
MASTER_DAEMON_AD_FILE =
SCHEDD_DAEMON_AD_FILE = $(SPOOL)/.schedd_classad
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_DAEMON_INTEGRITY = REQUIRED
SHARED_PORT_DAEMON_AD_FILE = $(LOCK)/shared_port_ad
START_DAEMONS =
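
The two dumps are long enough that a difference is easy to miss by eye. As a sanity check they can be compared mechanically. This is a small Python sketch, assuming the "NAME = value" layout that condor_config_val -dump prints above; parse_dump and diff_settings are hypothetical helper names, and only a few of the knobs are pasted in to keep it short.

```python
def parse_dump(text):
    """Parse 'NAME = value' lines into a dict; lines without '=' are skipped."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        # Split on the first '=' only, so values containing '=' survive.
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for knobs that differ.

    A knob present on only one side shows up with None on the other.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Excerpts of the two dumps shown above (central manager and worker node).
CM_DUMP = """\
ALLOW_DAEMON = condor_pool@*/*, condor@*/$(IP_ADDRESS)
DAEMON_LIST = MASTER COLLECTOR NEGOTIATOR
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_DAEMON_INTEGRITY = REQUIRED
START_DAEMONS =
"""

WN_DUMP = """\
ALLOW_DAEMON = condor_pool@*/*, condor@*/$(IP_ADDRESS)
DAEMON_LIST = MASTER, STARTD
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_DAEMON_INTEGRITY = REQUIRED
START_DAEMONS =
"""

for name, (cm_value, wn_value) in diff_settings(
    parse_dump(CM_DUMP), parse_dump(WN_DUMP)
).items():
    print(f"{name}: cm={cm_value!r} wn={wn_value!r}")
```

On these excerpts the only knob that differs is DAEMON_LIST, so the DAEMON-level security settings look identical on both hosts. Note that the grep only catches knobs with DAEMON in the name; a wider comparison would need the full dumps.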



On Wed, Nov 23, 2022 at 6:47 PM John M Knoeller via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

As part of the 9.0 security changes, some of the authorization levels for commands changed. This is mostly because several commands moved from ALLOW_WRITE to ALLOW_DAEMON, and some of the ALLOW_* settings that used to inherit from ALLOW_WRITE no longer do.

You can put back the old ALLOW_* inheritance rules by adding

LEGACY_ALLOW_SEMANTICS = TRUE

to your configuration.

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of David Cohen
Sent: Wednesday, November 23, 2022 7:44 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Authentication error after upgrade to 10.0

Hi,
Our cluster was configured to use PASSWORD authentication when it was originally installed with condor 8.8.
After the upgrade to 10.0.16 I commented out "use security:recommended_v9_0" and most of the functionality was restored. But reconfiguring the services without restarting fails both on the CE and the worker nodes. Sending a drain command from the central manager also fails.

condor-ce]# condor_ce_reconfig
ERROR
SECMAN:2010:Received "DENIED" from server for user condor@xxxxxxxxxxxxxxxxxx using method FS.
Can't send Reconfig command to local master

wn]# condor_reconfig
ERROR SECMAN:2010:Received "DENIED" from server for user unauthenticated@unmapped using no authentication method, which may imply host-based security. Our address was 'IPADDR', and server's address was 'IPADDR'. Check your ALLOW settings and IP protocols.
Can't send Reconfig command to local master

cm ~]# condor_drain -graceful tau-wn01.hep.tau.ac.il
Attempt to send DRAIN_JOBS to startd <IPADDR:9618?addrs=IPADDR-9618&alias=wn01.domain&noUDP&sock=startd_5918_d590> failed
Failed to start DRAIN_JOBS command to slot1_29@xxxxxxxxxxx

How can I restore that functionality and can I do it without dropping all running jobs on the nodes?

Thanks,
David

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/