I tried job submission again after manually adding user hit023 to the local admin group on the execute node. Still get the same error.

03/04/22 13:18:51 setting the orig job name in starter
03/04/22 13:18:51 setting the orig job iwd in starter
03/04/22 13:18:51 STORE_CRED: In mode 'add'
03/04/22 13:18:51 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_25952" to user hit023
03/04/22 13:18:51 Chirp config summary: IO false, Updates false, Delayed updates true.
03/04/22 13:18:51 IOProxy: couldn't write to C:\PROGRA~1\condor\execute\dir_25952\.chirp.config: Permission denied
03/04/22 13:18:51 Couldn't initialize IO Proxy.

I’m now wondering if condor is looking at “user” execute_machine\hit023 and not ourdomain\hit023?

Cheers

Greg

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx>
On Behalf Of Hitchen, Greg (IM&T, Kensington WA)

Hi All

We have recently gone from testing into production using a pool_password for authentication, having a credential server credd running, and getting users to run_as_owner. This all works fine (after a couple of unexpected gotchas).

I have now gone back to looking again at encrypt_execute_directory (on the submit side). We had previously tested this before enabling run_as_owner and things worked fine so long as you allow for non-windows fileservers that store input and output data, and use the cipher command before uploading output.

This testing was done before the run_as_owner option was in production. It was tested though, but using test VM execute nodes that I created that had me in the admin group.

Without run_as_owner the StarterLog.slot1 log file has entries like:

03/03/22 09:59:50 setting the orig job name in starter
03/03/22 09:59:50 setting the orig job iwd in starter
03/03/22 09:59:50 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_13488" to user condor-slot1
03/03/22 09:59:50 Loaded Registry hives for condor-slot1
03/03/22 09:59:50 Chirp config summary: IO false, Updates false, Delayed updates true.
03/03/22 09:59:50 Initialized IO Proxy.
03/03/22 09:59:50 Setting resource limits not implemented!
03/03/22 09:59:50 File transfer completed successfully.
03/03/22 09:59:51 Job 251.7 set to execute immediately
03/03/22 09:59:51 Starting a VANILLA universe job with ID: 251.7
03/03/22 09:59:51 Tracking process family by login "condor-slot1"

With run_as_owner the entries show:

03/03/22 19:02:49 setting the orig job name in starter
03/03/22 19:02:49 setting the orig job iwd in starter
03/03/22 19:02:49 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_12664" to user hit023
03/03/22 19:02:49 Chirp config summary: IO false, Updates false, Delayed updates true.
03/03/22 19:02:49 IOProxy: couldn't write to C:\PROGRA~1\condor\execute\dir_12664\.chirp.config: Permission denied
03/03/22 19:02:49 Couldn't initialize IO Proxy.
03/03/22 19:02:49 Setting resource limits not implemented!
03/03/22 19:02:49 get_file(): Failed to open file C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe, errno = 13: Permission denied.
03/03/22 19:02:49 get_file(): consumed 1803 bytes of file transmission
03/03/22 19:02:49 DoDownload: consuming rest of transfer and failing after encountering the following error: STARTER at 152.83.115.17 failed to write to file C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe: (errno 13) Permission denied
03/03/22 19:02:49 Failed to set execute bit on C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe, errno=2 (No such file or directory)
03/03/22 19:02:49 File transfer failed (status=0).
03/03/22 19:02:49 ERROR "Failed to transfer files" at line 2468 in file D:\execute\dir_10492\sources\src\condor_starter.V6.1\jic_shadow.cpp
03/03/22 19:02:49 ShutdownFast all jobs.
03/03/22 19:02:49 Failed to open '.update.ad' to read update ad: No such file or directory (2).
03/03/22 19:02:49 condor_read(): Socket closed abnormally when trying to read 21 bytes from <152.83.3.21:61271>, errno=10054

In production though the execute nodes (all in a single AD domain) have in their “users” group “ourdomain\Domain Users” which includes all our HTCondor users.

The allow permissions on the condor “execute” folder on the execute nodes are:

    Read & execute
    List folder contents
    Read

There is no allow for:

    Full control
    Modify
    Write

For testing I manually added Full control, Modify, and Write permissions on a single execute node but the errors are the same. SYSTEM on the execute node has full control of the execute folder as well.

Thanks for any info/insights/suggestions/comments.

Cheers

Greg
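For anyone wanting to compare StarterLog excerpts like the ones in this thread across several execute nodes, here is a rough sketch in Python. It assumes the log line layout shown in this message (a date and time followed by the message text). The function name summarize_starter_log and the regular expression are illustrative only; they are not part of HTCondor. It reports which user the execute directory was encrypted to and collects any "Permission denied" lines.

```python
import re

# Assumption: the message format matches the excerpts quoted in this thread, e.g.
#   Encrypting execute directory "C:\...\dir_25952" to user hit023
ENCRYPT_RE = re.compile(r'Encrypting execute directory "([^"]+)" to user (\S+)')


def summarize_starter_log(lines):
    """Return (execute_dir, user, denied_lines) for one block of starter log lines.

    Hypothetical helper for illustration; not an HTCondor API.
    """
    execute_dir = None
    user = None
    denied = []
    for line in lines:
        match = ENCRYPT_RE.search(line)
        if match:
            # Remember the directory and the account it was encrypted to.
            execute_dir, user = match.group(1), match.group(2)
        elif "Permission denied" in line:
            # Keep the full line so the failing path stays visible.
            denied.append(line.strip())
    return execute_dir, user, denied


# Sample lines copied from the log excerpt in this message.
sample = [
    r"03/04/22 13:18:51 STORE_CRED: In mode 'add'",
    r'03/04/22 13:18:51 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_25952" to user hit023',
    r"03/04/22 13:18:51 Chirp config summary: IO false, Updates false, Delayed updates true.",
    r"03/04/22 13:18:51 IOProxy: couldn't write to C:\PROGRA~1\condor\execute\dir_25952\.chirp.config: Permission denied",
    r"03/04/22 13:18:51 Couldn't initialize IO Proxy.",
]

execute_dir, user, denied = summarize_starter_log(sample)
print(user, execute_dir, len(denied))
```

Running it over the StarterLog of a working run (encrypted to condor-slot1) and a failing run (encrypted to hit023) makes it easy to see that the failures start right after the directory is encrypted to the job owner. It does not say whether that account is the local or the domain one.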