Hey All,
I've got two separate Active Directory domains, each with Condor servers running on it.
When I submit from domain A, the jobs do not run on domain B. I've already made sure the UID_DOMAIN is different, so jobs arriving in domain B try to launch as
the condor slot user ('condor-reuse-slot1').
The log below (from StarterLog.slot1) shows the errors. I think these are the relevant ones:
10/29/11 09:51:43 LogonUser(condor-reuse-slot1, ... ) failed with status 1385
10/29/11 09:51:43 ERROR "Failed to create a user nobody" at line 482 in file c:\condor\execute\dir_4228\userdir\src\condor_utils\uids.cpp
I've found that if I add condor-reuse-slot1 to the Domain Admins group, my jobs run, so I'm pretty sure I'm one AD permissions issue away from success! Does anyone know the specific
permissions I need to give the condor users so they can run jobs without staying in the Domain Admins group? And how do I add them (in case it's not obvious!)? I've seen a few posts asking similar things, but haven't found a specific answer,
only a few hacks that worked in some situations and nothing that felt 'right'.
Any ideas appreciated
Many thanks!
Rob
10/29/11 09:51:43 ******************************************************
10/29/11 09:51:43 ** condor_starter (CONDOR_STARTER) STARTING UP
10/29/11 09:51:43 ** C:\Condor\bin\condor_starter.exe
10/29/11 09:51:43 ** SubsystemInfo: name=STARTER type=STARTER(8) class=DAEMON(1)
10/29/11 09:51:43 ** Configuration: subsystem:STARTER local:<NONE> class:DAEMON
10/29/11 09:51:43 ** $CondorVersion: 7.6.3 Aug 17 2011 BuildID: 361356 $
10/29/11 09:51:43 ** $CondorPlatform: x86_winnt_5.1 $
10/29/11 09:51:43 ** PID = 4724
10/29/11 09:51:43 ** Log last touched 10/29 08:51:42
10/29/11 09:51:43 ******************************************************
10/29/11 09:51:43 Using config source: C:\Condor\condor_config
10/29/11 09:51:43 Using local config sources:
10/29/11 09:51:43 C:\Condor/condor_config.local
10/29/11 09:51:43 DaemonCore: command socket at <192.168.206.9:2965>
10/29/11 09:51:43 DaemonCore: private command socket at <192.168.206.9:2965>
10/29/11 09:51:43 Setting maximum accepts per cycle 4.
10/29/11 09:51:43 GLEXEC_JOB not supported on this platform; ignoring
10/29/11 09:51:43 Setting resource limits not implemented!
10/29/11 09:51:43 Communicating with shadow <192.9.201.133:4216>
10/29/11 09:51:43 Submitting machine is "pebble.hrw-uk.local"
10/29/11 09:51:43 setting the orig job name in starter
10/29/11 09:51:43 setting the orig job iwd in starter
10/29/11 09:51:43 LogonUser(condor-reuse-slot1, ... ) failed with status 1385
10/29/11 09:51:43 ERROR "Failed to create a user nobody" at line 482 in file c:\condor\execute\dir_4228\userdir\src\condor_utils\uids.cpp
10/29/11 09:51:43 ShutdownFast all jobs.
10/29/11 09:51:43 condor_read() failed: recv() returned -1, errno = 10054 , reading 5 bytes from <192.9.201.133:4222>.
10/29/11 09:51:43 IO: Failed to read packet header
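In case it helps anyone reading the log: the status in the LogonUser line is a plain Win32 system error code, and 1385 is ERROR_LOGON_TYPE_NOT_GRANTED ("the user has not been granted the requested logon type at this computer"). That points at a logon right on the execute machine rather than a wrong password, though the log does not say which logon type the starter asked for. Below is a rough sketch for pulling these failures out of a StarterLog and naming the code. The function name and the small lookup table are my own for illustration, not part of Condor, and the table only covers a few codes I'm sure of.

```python
import re

# A few Win32 system error codes that can accompany a failed LogonUser call.
# This table is illustrative and deliberately incomplete.
LOGON_ERRORS = {
    1314: ("ERROR_PRIVILEGE_NOT_HELD",
           "A required privilege is not held by the client."),
    1326: ("ERROR_LOGON_FAILURE",
           "The user name or password is incorrect."),
    1385: ("ERROR_LOGON_TYPE_NOT_GRANTED",
           "The user has not been granted the requested logon type "
           "at this computer."),
}

# Matches lines such as:
#   LogonUser(condor-reuse-slot1, ... ) failed with status 1385
LOGON_FAILURE = re.compile(
    r"LogonUser\((?P<account>[^,]+),.*?\) failed with status (?P<status>\d+)"
)


def explain_logon_failures(log_text):
    """Return (account, status, name, meaning) for each LogonUser failure."""
    found = []
    for match in LOGON_FAILURE.finditer(log_text):
        status = int(match.group("status"))
        name, meaning = LOGON_ERRORS.get(
            status, ("UNKNOWN", "Code not in this small table.")
        )
        found.append((match.group("account").strip(), status, name, meaning))
    return found


if __name__ == "__main__":
    sample = ("10/29/11 09:51:43 LogonUser(condor-reuse-slot1, ... ) "
              "failed with status 1385")
    for account, status, name, meaning in explain_logon_failures(sample):
        print(f"{account}: {status} {name}: {meaning}")
```

To run it against a real log, read the file (for example StarterLog.slot1) into a string and pass that to explain_logon_failures in place of the sample line.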