
Re: [HTCondor-users] HTCondor-C SetEffectiveOwner - Permission denied [SEC=UNCLASSIFIED]



Hi Troy,

Try this:

QUEUE_SUPER_USER_MAY_IMPERSONATE = .*

on the submitter.
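
For context, the SchedLog line quoted below ("Queue super user not allowed to set owner to troy@domain, because this instance of the schedd has never seen that user submit any jobs") appears to match the default behaviour when this knob is unset: the schedd then only lets the queue super user impersonate owners it has already seen submit jobs. A minimal sketch of how the setting might look in the submit machine's condor_config.local follows. The narrower pattern in the comment is an assumption about how owner names are matched and has not been tested on Windows.

```
## Allow the queue super user (SYSTEM, according to the SchedLog) to act
## on behalf of any job owner.  The value is a regular expression.
QUEUE_SUPER_USER_MAY_IMPERSONATE = .*

## Hypothetical narrower alternative (assumption: the pattern is matched
## against the bare owner name, here "troy"):
#QUEUE_SUPER_USER_MAY_IMPERSONATE = troy
```

A condor_reconfig, or a restart of the schedd, should be enough for the change to be picked up.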

Brian

On Feb 16, 2014, at 6:53 PM, Troy Robertson <Troy.Robertson@xxxxxxxxxx> wrote:

> Still following up on this problem,
> 
> Below are a few lines from the submit machine's SchedLog, which I hope might be enlightening to someone, because I have no idea.
> With FULLDEBUG the line about "Queue super user" now stands out.
> I have uninstalled and reinstalled, created new config files, and removed the directory, all to no avail.
> Job submission works fine as a Personal Pool (Windows 7), and as a Vanilla submit in a wider pool (all Linux).
> It is only under Condor-C, with the submitter configured as a Personal Pool and performing a grid job submission, that this authentication problem crops up.
> 
> Why?
> 
> Troy
> 
> 
> SCHEDLOG
> 02/17/14 10:58:16 (pid:5504) Received TCP command 1111 (QMGMT_READ_CMD) from unauthenticated@unmapped <147.66.85.78:34702>, access level READ
> 02/17/14 10:58:16 (pid:5504) Number of Active Workers 0
> 02/17/14 10:58:16 (pid:5504) QMGR Connection closed
> 02/17/14 10:58:17 (pid:5504) Received TCP command 1112 (QMGMT_WRITE_CMD) from SYSTEM <147.66.85.78:34708>, access level WRITE
> 02/17/14 10:58:17 (pid:5504) Queue super user not allowed to set owner to troy@domain, because this instance of the schedd has never seen that user submit any jobs.
> 02/17/14 10:58:17 (pid:5504) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"
> 02/17/14 10:58:17 (pid:5504) condor_read(): Socket closed when trying to read 5 bytes from <147.66.85.78:34708>
> 02/17/14 10:58:17 (pid:5504) IO: EOF reading packet header
> 02/17/14 10:58:17 (pid:5504) QMGR Connection closed
> 02/17/14 10:58:17 (pid:5504) Received TCP command 1111 (QMGMT_READ_CMD) from unauthenticated@unmapped <147.66.85.78:34715>, access level READ
> 02/17/14 10:58:17 (pid:5504) Number of Active Workers 0
> 02/17/14 10:58:17 (pid:5504) QMGR Connection closed
> 02/17/14 10:58:19 (pid:5504) Received TCP command 1111 (QMGMT_READ_CMD) from unauthenticated@unmapped <147.66.85.78:34732>, access level READ
> 
> 
> ---------------------------------------------------------------------------------------------------
>> On Tue, 11 Feb 2014 14:40:15 Troy Robertson wrote:
>> 
>> Hi,
>> 
>> I'm struggling with HTCondor-C.
>> 
>> This was originally working on our system but during the 2 years I have
>> been away something failed and users reverted to using it as a single
>> pool.
>> We are still running 7.8.8 across a number of dedicated Linux processors with
>> Windows user submit machines.  I don't want to upgrade until I find the
>> answer to this issue.
>> 
>> If a grid job is submitted it sits locally Idle with: "Request has not
>> been considered by the Matchmaker".  The Gridmanager process keeps
>> starting up, repeatedly failing to set permissions for something, and
>> then exiting.  SchedLog shows something similar.
>> 
>> I have Googled my heart out to no avail.  I have reinstalled on the Windows
>> submit machine.  What is it about the UIDs/permissions?
>> As I said, jobs submitted as Vanilla rather than Grid to the same
>> remote central manager run as per normal.  The Gahp_worker never fires,
>> so I think it is a problem locally.
>> 
>> Can anyone please be of assistance?
>> 
>> Troy
>> 
>> 
>> GridmanagerLog:
>> ...
>> 02/11/14 14:23:40 [7608] TokenCache contents:
>> troy@domain
>> 02/11/14 14:23:40 [7608] DaemonCore: in SendAliveToParent()
>> 02/11/14 14:23:40 [7608] DaemonCore::IsPidAlive(): OpenProcess failed
>> 02/11/14 14:23:40 [7608] DaemonCore: in SendAliveToParent() - ppid
>> 4740l disappeared!
>> 02/11/14 14:23:40 [7608] Checking proxies
>> 02/11/14 14:23:43 [7608] Initialized the following authorization table:
>> 02/11/14 14:23:43 [7608] Authorizations yet to be resolved:
>> 02/11/14 14:23:43 [7608] allow READ:  */* */*
>> 02/11/14 14:23:43 [7608] allow WRITE:  */* */local@xxxxx */147.66.85.62
>> */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow NEGOTIATOR:  */ local@xxxxx
>> */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow ADMINISTRATOR:  */ local@xxxxx
>> */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow OWNER:  */ local@xxxxx */NEW-
>> 50985.aad.gov.au */147.66.85.62 */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow DAEMON:  */* */ local@xxxxx
>> */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow ADVERTISE_STARTD:  */* */ local@xxxxx
>> */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow ADVERTISE_SCHEDD:  */* */ local@xxxxx
>> */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] allow ADVERTISE_MASTER:  */* */ local@xxxxx
>> */147.66.85.62 */147.66.85.62
>> 02/11/14 14:23:43 [7608] Received ADD_JOBS signal
>> 02/11/14 14:23:43 [7608] in doContactSchedd()
>> 02/11/14 14:23:43 [7608] TokenCache contents:
>> troy@domain
>> 02/11/14 14:23:43 [7608] SetEffectiveOwner(troy@domain) failed with
>> errno=13: Permission denied.
>> 02/11/14 14:23:43 [7608] Failed to connect to schedd! Will retry
>> 02/11/14 14:23:45 [7608] Evaluating staleness of remote job statuses.
>> 02/11/14 14:23:48 [7608] in doContactSchedd()
>> 02/11/14 14:23:48 [7608] TokenCache contents:
>> troy@domain
>> ...[SNIP]...
>> 02/11/14 14:24:23 [7608] SetEffectiveOwner(troy@domain) failed with
>> errno=13: Permission denied.
>> 02/11/14 14:24:23 [7608] Failed to connect to schedd! Will retry
>> 02/11/14 14:24:28 [7608] in doContactSchedd()
>> 02/11/14 14:24:28 [7608] TokenCache contents:
>> troy@domain
>> 02/11/14 14:24:28 [7608] SetEffectiveOwner(troy@domain) failed with
>> errno=13: Permission denied.
>> 02/11/14 14:24:28 [7608] Failed to connect to schedd!
>> 02/11/14 14:24:28 [7608] ERROR "Too many failures connecting to
>> schedd!" at line 1246 in file
>> c:\condor\execute\dir_11160\userdir\src\condor_gridmanager\gridmanager.
>> cpp
>> 02/11/14 14:28:40 init_user_ids: want user 'troy@domain', current is
>> '(null)@(null)'
>> 02/11/14 14:28:40 Found credential for user troy@domain'
>> 02/11/14 14:28:40 LogonUser completed.
>> 
>> SchedLog:
>> ...
>> 02/11/14 13:36:11 (pid:4740) SetEffectiveOwner security violation:
>> setting owner to troy@domain when active owner is "SYSTEM"
>> 02/11/14 13:36:12 (pid:4740) Number of Active Workers 0
>> 02/11/14 13:36:14 (pid:4740) Number of Active Workers 0
>> 02/11/14 13:36:15 (pid:4740) Number of Active Workers 0
>> 02/11/14 13:36:16 (pid:4740) SetEffectiveOwner security violation:
>> setting owner to troy@domain when active owner is "SYSTEM"
>> 02/11/14 13:36:17 (pid:4740) Number of Active Workers 0
>> 02/11/14 13:36:18 (pid:4740) Number of Active Workers 0
>> 02/11/14 13:36:20 (pid:4740) Number of Active Workers 0
>> 02/11/14 13:36:21 (pid:4740) SetEffectiveOwner security violation:
>> setting owner to troy@domain when active owner is "SYSTEM"
>> 02/11/14 13:36:21 (pid:4740) condor_gridmanager (PID 7652, owner troy)
>> exited with return code 4.
>> 02/11/14 13:36:21 (pid:4740) Number of Active Workers 0
>> 
>> 
>> Condor_config.local:
>> ...
>> UID_DOMAIN = $(FULL_HOSTNAME)
>> 
>> #TRUST_UID_DOMAIN = TRUE
>> 
>> HOSTALLOW_READ = *
>> HOSTALLOW_WRITE = *
>> 
>> ##  Daemons
>> DAEMON_LIST=MASTER SCHEDD COLLECTOR NEGOTIATOR
>> 
>> ## GRID PARAMS
>> GRIDMANAGER_LOG = $(LOG)/GridLogs/GridmanagerLog.$(USERNAME)
>> C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME)
>> C_GAHP_WORKER_THREAD_LOG = $(LOG)/GridLogs/CGAHPWorkerLog.$(USERNAME)
>> 
>> ## DEBUGGING
>> GRIDMANAGER_DEBUG          = D_FULLDEBUG
>> C_GAHP_DEBUG = D_FULLDEBUG
>> C_GAHP_WORKER_THREAD_DEBUG = D_FULLDEBUG
>> 
>> ## Security
>> SEC_DEFAULT_NEGOTIATION = OPTIONAL
>> SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
>> 
>> 
>> Submit file:
>> Universe = grid
>> Executable = /usr/bin/R
>> transfer_executable = False
>> Arguments = --version
>> Error = Error_$(Cluster).$(Process).txt
>> Output = Output_$(Cluster).$(Process).txt
>> Log = Condor_log.txt
>> should_transfer_files = ALWAYS 
>> when_to_transfer_output = ON_EXIT
>> grid_resource = condor server1.a.b.c server1.a.b.c
>> +remote_requirements = Arch == "X86_64" && OpSys == "LINUX"
>> +remote_universe = vanilla
>> +remote_shouldtransferfiles = "YES"
>> +remote_whentotransferoutput = "ON_EXIT"
>> Queue
>> 
> 