Re: [HTCondor-users] Starter does not recognize job script as executable when ACL is used to set access rights.
- Date: Mon, 25 Nov 2019 09:55:11 +0300 (MSK)
- From: "Sergey A. Komissarov" <sergey.komissarov@xxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Starter does not recognize job script as executable when ACL is used to set access rights.
Hello Zach,
user20000 belongs to the single group 'users' with group id 100. Group with id 1001 does not exist on the execute machine.
user20000@483d941bd1ee:/$ groups
users
user20000@483d941bd1ee:/$ cat /etc/passwd | grep user20000
user20000:x:20000:100::/home/user20000:/bin/false
user20000@483d941bd1ee:/$ cat /etc/group | grep 10001
user20000@483d941bd1ee:/$
----------
Sergey Komissarov
Senior Software Developer
DATADVANCE
This message may contain confidential information
constituting a trade secret of DATADVANCE. Any distribution,
use or copying of the information contained in this
message is ineligible except under the internal
regulations of DATADVANCE and may entail liability in
accordance with the current legislation of the Russian
Federation. If you have received this message by mistake
please immediately inform me of it. Thank you!
----- Original Message -----
From: "Zach Miller" <zmiller@xxxxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Cc: "Sergey A. Komissarov" <sergey.komissarov@xxxxxxxxxxxxxx>
Sent: Friday, November 22, 2019 9:54:30 PM
Subject: Re: [HTCondor-users] Starter does not recognize job script as executable when ACL is used to set access rights.
Hi Sergey,
When you log in to the execute machine as user20000 and run "groups" on the command line, what do you see?
I think what is happening is that HTCondor is switching the user ID but is not switching to group ID 1001 as you are expecting. My guess is that user20000 belongs to multiple groups... let me know what the above command returns.
Cheers,
-zach
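One way to see which IDs a job really runs with is to have the job itself report them. The following is a minimal sketch, not anything taken from HTCondor: the function name describe_identity is hypothetical, and it uses only the Python standard library on a Unix host. It prints the real and effective user and group IDs plus the supplementary groups of the current process.

```python
import grp
import os
import pwd


def describe_identity():
    """Return the user and group IDs the current process is running with."""

    def user_name(uid):
        # A uid may have no passwd entry on this machine.
        try:
            return pwd.getpwuid(uid).pw_name
        except KeyError:
            return None

    def group_name(gid):
        # A gid may have no group entry on this machine.
        try:
            return grp.getgrgid(gid).gr_name
        except KeyError:
            return None

    return {
        "real_uid": (os.getuid(), user_name(os.getuid())),
        "effective_uid": (os.geteuid(), user_name(os.geteuid())),
        "real_gid": (os.getgid(), group_name(os.getgid())),
        "effective_gid": (os.getegid(), group_name(os.getegid())),
        "supplementary_groups": [
            (gid, group_name(gid)) for gid in sorted(os.getgroups())
        ],
    }


if __name__ == "__main__":
    for key, value in describe_identity().items():
        print(f"{key}: {value}")
```

Running this as the job executable and comparing its output with an interactive login as the same user would show whether the two environments differ in group membership.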
On 11/22/19, 11:36 AM, "HTCondor-users on behalf of Sergey A. Komissarov via HTCondor-users" <htcondor-users-bounces@xxxxxxxxxxx on behalf of htcondor-users@xxxxxxxxxxx> wrote:
Hello,
We are using a shared filesystem to prepare condor jobs and ACLs to control user access rights.
The problem is that the workstation where the job is prepared does not know anything about the users on the condor machines.
The job script is created under some user and group, with the executable flag set for user and group.
The job script is owned by uid 10131 and group 1001, and is submitted to condor with the +Owner=user20000 option.
The Startd log is as follows:
11/22/19 13:17:09 (fd:19) (pid:56) (D_ALWAYS) Running job as user user20000
11/22/19 13:17:09 (fd:19) (pid:56) (D_ALWAYS) About to exec /shared/job-dir/start.sh
11/22/19 13:17:09 (fd:19) (pid:56) (D_PRIV) PRIV_USER --> PRIV_CONDOR at /slots/02/dir_19946/userdir/.tmpWrq8Vb/condor-8.9.2/src/condor_starter.V6.1/os_proc.cpp:568
11/22/19 13:17:09 (fd:19) (pid:56) (D_DAEMONCORE) In DaemonCore::Create_Process(/shared/job-dir/start.sh,...)
11/22/19 13:17:09 (fd:21) (pid:56) (D_PRIV) PRIV_CONDOR --> PRIV_USER at /slots/02/dir_19946/userdir/.tmpWrq8Vb/condor-8.9.2/src/condor_daemon_core.V6/daemon_core.cpp:7654
11/22/19 13:17:09 (fd:21) (pid:56) (D_ALWAYS) Create_Process: Cannot access specified executable "/shared/job-dir/start.sh": errno = 13 (Permission denied)
11/22/19 13:17:09 (fd:21) (pid:56) (D_PRIV) PRIV_USER --> PRIV_CONDOR at /slots/02/dir_19946/userdir/.tmpWrq8Vb/condor-8.9.2/src/condor_daemon_core.V6/daemon_core.cpp:7669
This is how the job directory looks from the condor execute host after the job is submitted and fails to start:
root@execute# ls -la /shared/job-dir/
total 12
drwxrwx---+ 2 10131 1001 4096 Nov 22 14:39 .
drwxr-xr-x 3 10131 1001 4096 Nov 22 14:49 ..
-rw-rw----+ 1 user20000 users 0 Nov 22 14:39 stdout
-rwxrwx---+ 1 10131 1001 1009 Nov 22 14:39 start.sh
-rw-rw----+ 1 user20000 users 0 Nov 22 14:39 stderr
root@execute# getfacl /shared/job-dir/start.sh
# file: shared/job-dir/start.sh
# owner: 10131
# group: 1001
user::rwx
user:user20000:rwx
group::---
mask::rwx
other::---
If I run 'chmod o+x' on the job script, everything works. But this seems like a bug, because when I log in
to the execute host as user20000 I can start the job script without the executable flag for others.
We have HTCondor 8.9.2 running inside a docker cluster; the host and the docker containers use Ubuntu 16.04.1.
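For reference, the access decision that the getfacl output above implies can be sketched in a few lines. This is a simplified model of the POSIX ACL check order described in acl(5) (owner entry, then named user entries, then group entries, then other), not HTCondor code. The function names and the dictionary layout are hypothetical, and uid 20000 and gid 100 are taken from the /etc/passwd line quoted in this thread.

```python
def perms(text):
    """Convert a permission string such as 'r-x' into a set of letters."""
    return {c for c in text if c in "rwx"}


def acl_allows(requested, uid, gids, owner_uid, owner_gid, acl):
    """Return True if the ACL grants all requested permissions.

    uid and gids describe the process; acl is a dict with the keys
    'user_obj', 'users', 'group_obj', 'groups', 'mask' and 'other'.
    """
    want = perms(requested)
    # Without a mask entry, nothing is masked out.
    mask = perms(acl["mask"]) if acl.get("mask") is not None else perms("rwx")

    # 1. The file owner is checked against the owner entry only.
    if uid == owner_uid:
        return want <= perms(acl["user_obj"])

    # 2. A named user entry applies next, limited by the mask.
    if uid in acl.get("users", {}):
        return want <= (perms(acl["users"][uid]) & mask)

    # 3. Any matching group entry may grant access, limited by the mask.
    matching = []
    if owner_gid in gids:
        matching.append(perms(acl["group_obj"]))
    for gid, text in acl.get("groups", {}).items():
        if gid in gids:
            matching.append(perms(text))
    if matching:
        return any(want <= (entry & mask) for entry in matching)

    # 4. Otherwise the 'other' entry decides.
    return want <= perms(acl["other"])


# ACL of /shared/job-dir/start.sh as shown by getfacl above.
start_sh_acl = {
    "user_obj": "rwx",
    "users": {20000: "rwx"},  # user:user20000:rwx
    "group_obj": "---",
    "groups": {},
    "mask": "rwx",
    "other": "---",
}

if __name__ == "__main__":
    # user20000 (uid 20000, gid 100) asking for execute permission.
    print(acl_allows("x", 20000, {100}, 10131, 1001, start_sh_acl))
```

Under this model the named user entry is matched before any group entry is consulted, so the group membership of user20000 does not affect the result for this particular ACL.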
----------
Sergey Komissarov
Senior Software Developer
DATADVANCE