Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] 6.7.18 problem: Kerberos authentication issues post-upgrade

Date: Wed, 29 Mar 2006 17:04:23 +0100
From: David McBride <dwm@xxxxxxxxxxxx>
Subject: [Condor-users] 6.7.18 problem: Kerberos authentication issues post-upgrade

Hi,

I have just upgraded my local Condor pool to 6.7.18 (from 6.7.16) and I'm running into what look like some Kerberos authentication issues.


Scenario:
========
Every machine uses the same global configuration file:
http://www.doc.ic.ac.uk/condor/doc-config/condor_config.global
(Locally retrieved from an NFS volume.)

Note the strong-authentication section at the tail of the file; all Condor daemons are required to authenticate using the local host keytab stored in /etc/krb5.keytab, and all WRITE operations must be authenticated with Kerberos credentials.


Two machines of note:
skimmer.doc.ic.ac.uk acts as Condor master.
lightyear.doc.ic.ac.uk acts as a submit-only node.

Both machines are running a distribution derived from Mandrake 10.2 on a locally-built 2.6.13 kernel; the local Kerberos packages are derived from MIT Kerberos 1.4.2:


# rpm -qa|grep krb
libkrb53-devel-1.4.2-0.1.102mdk
libkrbafs0-1.2.2-4mdk
libkrb53-1.4.2-0.1.102mdk
krb5-workstation-1.4.2-0.1.102mdk
libkrbafs0-devel-1.2.2-4mdk
ftp-client-krb5-1.4.2-0.1.102mdk
pam_krb5-2.1.8-1doc
telnet-client-krb5-1.4.2-0.1.102mdk

Failure case:
=============

User 'mwj' tries to submit a set of Condor jobs to the local schedd on lightyear. This is successful, as they have a local Kerberos TGT.

The jobs, however, never start. Indeed, when running `condor_q -global` they do not appear at all, whereas they _are_ listed when queried using `condor_q` on lightyear itself. This suggests a communications issue of some kind.


Reviewing the MasterLog on Lightyear, the following errors were displayed:

==> MasterLog <==

3/29 12:57:19 AUTHENTICATE: no available authentication methods succeeded, failing!
3/29 12:57:19 DC_AUTHENTICATE: authenticate failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using KERBEROS

3/29 12:57:23 AUTH_ERROR: Internal credentials cache error

3/29 12:57:23 AUTHENTICATE: no available authentication methods succeeded, failing!
3/29 12:57:23 ERROR: SECMAN:2004:Failed to start a session with TCP|AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using KERBEROS

3/29 12:58:23 getpeername failed so connect must have failed
3/29 12:58:43 Connect failed for 20 seconds; returning FALSE

3/29 12:58:43 ERROR: SECMAN:2003:TCP connection to <146.169.1.113:9618> failed


3/29 12:59:43 getpeername failed so connect must have failed
3/29 13:00:03 Connect failed for 20 seconds; returning FALSE

3/29 13:00:03 ERROR: SECMAN:2003:TCP connection to <146.169.1.113:9618> failed

The "Internal credentials cache error" appears to be the significant issue here; it looks like the Master daemon on Lightyear is unable to mutually authenticate with the daemons on Skimmer as a result of this cache problem, resulting in the observed communications breakdown.

Reconfiguring the logging to add D_SECURITY, the following fuller output appears on Lightyear:


==> MasterLog <==

3/29 16:45:40 STARTCOMMAND: starting 2 to <146.169.1.113:9618> on UDP port 47686.

3/29 16:45:40 SECMAN: command 2 to <146.169.1.113:9618> on UDP port 47686.

3/29 16:45:40 SECMAN: command 60010 to <146.169.1.113:9618> on TCP port 43363.

3/29 16:45:40 SECMAN: new session, doing initial authentication.
3/29 16:45:40 SECMAN: Auth methods: KERBEROS
3/29 16:45:40 HANDSHAKE: in handshake(my_methods = 'KERBEROS')
3/29 16:45:40 HANDSHAKE: handshake() - i am the client
3/29 16:45:40 HANDSHAKE: sending (methods == 64) to server
3/29 16:45:40 HANDSHAKE: server replied (method = 64)

3/29 16:45:40 KERBEROS: krb5_unparse_name: host/skimmer.doc.ic.ac.uk@xxxxxxxxxxxx

3/29 16:45:40 KERBEROS: no user yet determined, will grab up to slash
3/29 16:45:40 KERBEROS: picked user: host
3/29 16:45:40 KERBEROS: remapping 'host' to 'condor'
3/29 16:45:40 unable to open map file (null), errno 14
3/29 16:45:40 Client is condor@(null)

3/29 16:45:40 KERBEROS: Server principal is host/skimmer.doc.ic.ac.uk@xxxxxxxxxxxx
3/29 16:45:40 init_daemon: client principal is 'host/lightyear.doc.ic.ac.uk@xxxxxxxxxxxx'

3/29 16:45:40 init_daemon: Using default keytab FILE:/etc/krb5.keytab
3/29 16:45:40 AUTH_ERROR: Internal credentials cache error
3/29 16:45:40 AUTHENTICATE: method 64 (KERBEROS) failed.
3/29 16:45:40 HANDSHAKE: in handshake(my_methods = '')
3/29 16:45:40 HANDSHAKE: handshake() - i am the client
3/29 16:45:40 HANDSHAKE: sending (methods == 0) to server
3/29 16:45:40 HANDSHAKE: server replied (method = 0)

3/29 16:45:40 AUTHENTICATE: no available authentication methods succeeded, failing!

3/29 16:45:40 SECMAN: unable to start session via TCP, failing.

3/29 16:45:40 ERROR: SECMAN:2004:Failed to start a session with TCP|AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using KERBEROS

It looks like it either cannot determine its local identity properly (note the "Client is condor@(null)" entry) or it is unable to process the local /etc/krb5.keytab file properly -- perhaps it is attempting to do so as the local 'condor' user, and not as root?
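
One way to check the second hypothesis is to test whether the account the daemon actually runs as can read the keytab. The following is only a minimal sketch: `check_keytab` is a hypothetical helper name, not part of Condor or Kerberos, and it tests file readability only, not whether the keys inside the file are valid.

```shell
# check_keytab PATH
# Hypothetical helper: reports whether the *current* user can read PATH.
# Returns 0 if readable, 1 if present but unreadable, 2 if missing.
# Note: root can read any file, so run this as the unprivileged account
# under test (assumed here to be 'condor') for the result to be meaningful.
check_keytab() {
    keytab="$1"
    if [ ! -e "$keytab" ]; then
        echo "missing: $keytab"
        return 2
    elif [ ! -r "$keytab" ]; then
        echo "unreadable by $(id -un): $keytab"
        return 1
    fi
    echo "readable by $(id -un): $keytab"
    return 0
}

# Example invocation (commented out so that sourcing this file has no side effects):
#   check_keytab /etc/krb5.keytab
```

If the keytab turns out to be unreadable for the 'condor' user but readable for root, that would be consistent with the daemon having dropped privileges before opening the file. `klist -k /etc/krb5.keytab` (from the MIT Kerberos tools) can then be used to list which principals the keytab holds.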


Any assistance with this issue would be greatly appreciated.

Cheers,
David
--
David McBride <dwm@xxxxxxxxxxxx>
Department of Computing, Imperial College, London

Follow-Ups:
- Re: [Condor-users] 6.7.18 problem: Kerberos authentication issues post-upgrade
  - From: David McBride
