Re: [HTCondor-users] HTCondor high availability



Do I need to configure any other authentication methods in addition to all servers using LDAP via PAM ?
	Yes, of course.  Security between different nodes has nothing to 
do with how users log in.
I tried to set the variable as you suggested, to no avail. Master2 now says
it can't connect to master1 ("Failed to fetch ads")
	From your description, master1 is the original "master" node.  I 
don't know if HAD will work for machines that are both submit nodes and 
central managers, but for now let's assume that it will.  Note that the 
HA instructions do NOT address security at all; that's deliberate, because 
security is complicated and nothing in HA changes anything about how your
security should work, except the addition of another server.  It's a bit 
more of a surprise to you, perhaps, because you didn't separate your central 
manager from your submit server (and thus FS worked for all your 
client-to-daemon connections).
	From your serverfault question, it looks like you basically don't 
have any security at all -- your ALLOW lists include *, so the problem 
must be in authentication, not authorization.
	Note that condor_q, by default in recent HTCondor versions, 
requires authentication so that it only returns the jobs of the user who 
ran the command.  Try running 'condor_q -all-users'; I think that will use 
a different command that doesn't require authentication.
	For this purpose, given that you know that the two masters share a 
filesystem and user IDs, FS_REMOTE is not a bad choice.  You'll need to 
set SEC_DEFAULT_AUTHENTICATION_METHODS on master1 and master2 to include 
FS and FS_REMOTE; I would remove KERBEROS (since you're not using it). 
Both master1 and master2 need to set FS_REMOTE_DIR to the same value.  Be 
sure to restart HTCondor on both machines after you've done that (I can't 
keep straight which configuration changes only require a reconfig).  Try 
running condor_q again; it should work.  If it doesn't, try running
_CONDOR_TOOL_DEBUG=D_FULLDEBUG condor_q -debug

and we'll see what we can see.
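
	For reference, here is a minimal sketch of the configuration 
described above, to go on both master1 and master2.  The HTCondor manual 
spells the remote filesystem method FS_REMOTE (matching the FS_REMOTE_DIR 
knob).  The directory path is a placeholder, not something taken from your 
setup; substitute a directory on the shared filesystem that users on both 
machines can write to.

```
# Same settings on master1 and master2.
# KERBEROS is left out because it is not in use.
SEC_DEFAULT_AUTHENTICATION_METHODS = FS, FS_REMOTE

# Placeholder path (an assumption): must name the same shared
# directory on both machines.
FS_REMOTE_DIR = /shared/condor/fs_remote
```

	After the restart, running 
'condor_config_val SEC_DEFAULT_AUTHENTICATION_METHODS' on each machine is 
a quick way to confirm that the new value was picked up.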

- ToddM