
Re: [HTCondor-users] Help with setting up a new cluster



The installation went ok and all the services are running and healthy, but
the two machines cannot communicate with each other.
	What did you do to test this?  By default, condor_reconfig and 
condor_restart only communicate with the local daemons.
I also suspect something is very wrong since neither sudo condor_reconfig nor sudo condor_restart works on either machine.
	This indicates a local authentication or authorization problem, 
but both of those are tricky. :)
There are no firewalls enabled on either machine. (I can ssh for example between the machines)
	For your reference, by default, HTCondor only needs port 9618 to 
be open inbound.
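
A quick way to confirm that port 9618 is reachable from the other machine is a plain TCP connect. The following is only a minimal sketch, not an HTCondor tool: the function name port_is_open and the hostname in the usage note are made up for illustration, and a successful connect shows only that something is listening on the port, not that HTCondor will authorize you.

```python
import socket

# 9618 is the default HTCondor port mentioned above.
HTCONDOR_PORT = 9618


def port_is_open(host, port=HTCONDOR_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; it tests reachability,
    not HTCondor authentication or authorization.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run it on an EP against the CM, for example port_is_open("cm.example.org") with your own central manager's hostname in place of that placeholder. If it returns False while ssh works, something between the machines is still blocking 9618.
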
volcano@volcano:~$ sudo condor_restart
ERROR
SECMAN:2010:Received "DENIED" from server for user condor_pool@ using
method IDTOKENS.
Can't send Restart command to local master
	Looking at your condor_config.central, and what the error is 
saying, it looks like you took the advice "## To expand your condor pool 
beyond a single host, set ALLOW_WRITE to match all of the hosts" without
following the instructions in the preceding (admittedly very long) 
comment.  If you're very confident in the security of your network, your
current config will probably work if you enable host-based 
security; the comment has specific instructions.
	It is more secure to configure HTCondor with user-based security, 
which is why it is the default.  If you look at the error message above,
it specifies the user you authenticated as (condor_pool@).  I'm pretty
sure condor_restart requires ADMINISTRATOR access, so you should set

ALLOW_ADMINISTRATOR = $(ALLOW_ADMINISTRATOR) condor_pool@

In fact, you should probably unset all of the other ALLOW_* values
you set; they should all already be set correctly as a result of
running get_htcondor.

kenway@haleakala:/etc/condor$ sudo condor_restart
[sudo] password for kenway:
ERROR
SECMAN:2010:Received "DENIED" from server for user condor_pool@ using
method IDTOKENS.
Can't send Restart command to local master
	It looks like you were consistent (yay!) between the different 
nodes' security configurations, so the same advice applies here.  That
should get the administrative commands working.

The next step is probably checking condor_status, to see if the EPs can report to the CM.
- ToddM