Re: [HTCondor-users] condor_off -peaceful -daemon master permissions check fail (BUG?)

Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

On Jun 13, 2013, at 6:25 AM, "Joan J. Piles" <jpiles@xxxxxxxxx> wrote:

Hi all,

I don't know if this is a bug (I think it is), but there is a problem when you run condor_off -peaceful -daemon master against a node from a central management machine.

When the condor_master receives the peaceful shutdown command, it comes from a machine that is authorized at the ADMINISTRATOR level. However, when the master propagates this command to its child daemons, it does so as the local machine, which is not in the HOSTALLOW_ADMINISTRATOR list. We can see this in the logs (172.16.4.103 is our management node, and 172.16.6.2 is our test node):

MasterLog (trimmed, only relevant lines):

06/13/13 13:14:08 Received TCP command 60015 (DC_OFF_PEACEFUL) from unauthenticated@unmapped <172.16.4.103:46020>, access level ADMINISTRATOR
06/13/13 13:14:08 Calling HandleReq <handle_off_peaceful()> (0) for command 60015 (DC_OFF_PEACEFUL) from unauthenticated@unmapped <172.16.4.103:46020>
06/13/13 13:14:08 Got SIGTERM. Performing graceful shutdown.
06/13/13 13:14:08 Completed DC_SET_PEACEFUL_SHUTDOWN to local startd
06/13/13 13:14:14 Sent SIGTERM to STARTD (pid 31817)
06/13/13 13:14:14 The STARTD (pid 31817) exited with status 0
06/13/13 13:14:15 All daemons are gone. Exiting.

Here we see that the request comes from an authorized source. However, what the startd sees is subtly different: the command appears to come from the local machine, which is not authorized:

StartLog:

06/13/13 13:14:08 Calling Handler <DaemonCommandProtocol::WaitForSocketData> (2)
06/13/13 13:14:08 PERMISSION DENIED to unauthenticated@unmapped from host 172.16.6.2 for command 60016 (DC_SET_PEACEFUL_SHUTDOWN), access level ADMINISTRATOR: reason: ADMINISTRATOR authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 172.16.6.2,her06-02.hermes.cps.unizar.es,her06-02, hostname size = 2, original ip address = 172.16.6.2

A few seconds later, it receives the SIGTERM:

06/13/13 13:14:14 Got SIGTERM. Performing graceful shutdown.
06/13/13 13:14:14 shutdown graceful
06/13/13 13:14:14 All resources are free, exiting.

The end result is that we get a graceful shutdown instead of the peaceful one we asked for.
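
To check whether other nodes are affected in the same way, one could scan each StartLog for the denial shown above. The following is only a minimal sketch in Python: the function name find_denied is hypothetical, and the regular expression is derived solely from the single log line quoted in this message, so it may need adjusting for other HTCondor versions or log settings.

```python
import re

# Pattern modelled on the one "PERMISSION DENIED" line quoted above.
# Assumption: other denial lines follow the same layout.
DENIED = re.compile(
    r"PERMISSION DENIED to (?P<user>\S+) from host (?P<host>\S+) "
    r"for command (?P<num>\d+) \((?P<name>\w+)\), access level (?P<level>\w+)"
)


def find_denied(lines, command="DC_SET_PEACEFUL_SHUTDOWN"):
    """Return a (host, access level) pair for each denied request for `command`."""
    hits = []
    for line in lines:
        match = DENIED.search(line)
        if match and match.group("name") == command:
            hits.append((match.group("host"), match.group("level")))
    return hits
```

Passing the open StartLog file object (or any iterable of lines) to find_denied returns one pair per rejected request. A non-empty result for DC_SET_PEACEFUL_SHUTDOWN suggests the startd on that node never received the peaceful flag and fell back to a graceful shutdown.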

An obvious workaround is to change:

HOSTALLOW_ADMINISTRATOR = $(CONDOR_HOST)

to:

HOSTALLOW_ADMINISTRATOR = $(CONDOR_HOST), $(FULL_HOSTNAME)

But since this is not the default policy, nor is there a clear reason why it should be required, I think this is a bug. condor_master should either authenticate to its children as DAEMON or pass the original credentials on to the startd.

When we do a condor_off -peaceful -daemon startd, however, everything works as expected, since the shutdown command comes directly from the management machine.

Regards,

Joan
-- 
--------------------------------------------------------------------------
Joan Josep Piles Contreras -  Analista de sistemas
I3A - Instituto de Investigación en Ingeniería de Aragón
Tel: 876 55 51 47 (ext. 845147)
http://i3a.unizar.es -- jpiles@xxxxxxxxx
--------------------------------------------------------------------------
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
