Hello, Maarten.
The core of the problem seems to be that FS authentication is not working properly and the user is authenticated as “condor@xxxxxxx”.
Could you please check the condor_ping information as user alicesgm?
----
condor_ping -debug -verbose -type schedd WRITE
....
Authenticated using: FS
All authentication methods: TOKEN,FS
....
-------
First, check the mount options and permissions of the /tmp directory; it may be that the alicesgm account is unable to write to /tmp, or there may be an SELinux issue.
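As far as I understand it, FS authentication has the client create a file or directory under /tmp whose ownership the daemon then verifies, so a quick probe run as alicesgm can rule out the simple cases. The following is only a minimal sketch: the helper name check_fs_auth_dir is hypothetical, it does not call into HTCondor at all, and it cannot detect an SELinux denial that affects only the schedd.

```python
import os
import stat
import tempfile


def check_fs_auth_dir(base="/tmp"):
    """Report whether the current user can create an owned directory under base.

    Hypothetical helper: it only mimics the kind of filesystem operation
    FS authentication is assumed to rely on, nothing HTCondor-specific.
    """
    st = os.stat(base)
    report = {
        "base": base,
        "mode": stat.filemode(st.st_mode),          # e.g. drwxrwxrwt for a normal /tmp
        "sticky": bool(st.st_mode & stat.S_ISVTX),  # sticky bit is expected on /tmp
        "writable": os.access(base, os.W_OK | os.X_OK),
        "created": False,
        "owner_matches": False,
    }
    try:
        probe = tempfile.mkdtemp(prefix="fs_probe_", dir=base)
    except OSError as exc:
        # Creation failed: permissions, read-only mount, full disk, ...
        report["error"] = str(exc)
        return report
    try:
        report["created"] = True
        # The new directory should be owned by the effective user.
        report["owner_matches"] = os.stat(probe).st_uid == os.geteuid()
    finally:
        os.rmdir(probe)
    return report
```

Calling check_fs_auth_dir() as alicesgm and printing the returned dictionary shows at a glance whether /tmp is writable, has the sticky bit, and yields directories owned by the calling user.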
If you suspect SELinux, check the information below to see if you missed anything.
[root@ui20 tmp]# semanage permissive -l
Builtin Permissive Types
condor_negotiator_t
condor_master_t
condor_collector_t
condor_procd_t
condor_startd_t
condor_schedd_t
As far as I know, if condor_schedd_t is missing from this list, SELinux can block actions of the schedd, because types that are not registered as permissive are enforced.
Also, could you check whether the account has an IDTOKEN issued as "condor@xxxxxxx"?
You can check this by running condor_token_list as the alicesgm account.
Regards,
-- Geonmo
From: Maarten Litmaath via HTCondor-users <htcondor-users@xxxxxxxxxxx>
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Maarten Litmaath <Maarten.Litmaath@xxxxxxx>
Received: 2025-02-11 (Tue) 09:13:56
Subject: Re: [HTCondor-users] v24.0.4 condor_submit only works sometimes
Hi Cole & Geonmo,

here are my answers:
SCHEDD_HOST is neither defined on the v24.0.4 Submit Node (which fails),
nor on the v24.4.0 Submit Node (which works). Again, the configurations
come straight out of the Admin Quick Start Guide, nothing more.
The owner of a successfully submitted job is the local account under which
the job was submitted ("alicesgm").
With "-debug D_SECURITY" there does not seem to be much of a clue:

[alicesgm@htc24s-ce ~]$ condor_submit -debug D_SECURITY my-test.jdl
Submitting job(s)02/11/25 00:41:50 Can't open directory "/etc/condor/passwords.d" as PRIV_ROOT, errno: 13 (Permission denied)
02/11/25 00:41:50 Can't open directory "/etc/condor/passwords.d" as PRIV_ROOT, errno: 13 (Permission denied)
.
ERROR: Failed to commit job submission into the queue.
ERROR: Failed to create new User record for condor@xxxxxxxx
[alicesgm@htc24s-ce ~]$
The various parameters:

[alicesgm@htc24s-ce ~]$ condor_config_val -verbose ALLOW_WRITE SCHEDD_NAME \
QUEUE_SUPER_USERS SEC_DEFAULT_AUTHENTICATION_METHODS \
SEC_DAEMON_AUTHENTICATION_METHODS SEC_CLIENT_AUTHENTICATION_METHODS \
ALLOW_DAEMON TRUST_DOMAIN
ALLOW_WRITE = *
 # at: /etc/condor/config.d/01-submit.config, line 3, use SECURITY:recommended_v24_0+12
 # raw: ALLOW_WRITE = *
Not defined: SCHEDD_NAME
 # at: <Default>
 # raw: SCHEDD_NAME =
QUEUE_SUPER_USERS = root, condor
 # at: <Default>
 # raw: QUEUE_SUPER_USERS = root, condor
SEC_DEFAULT_AUTHENTICATION_METHODS = FS,IDTOKENS,KERBEROS,SCITOKENS,SSL
 # at: <Default>
 # raw: SEC_DEFAULT_AUTHENTICATION_METHODS = FS,IDTOKENS,KERBEROS,SCITOKENS,SSL
Not defined: SEC_DAEMON_AUTHENTICATION_METHODS
SEC_CLIENT_AUTHENTICATION_METHODS = FS,IDTOKENS,KERBEROS,SCITOKENS,SSL,ANONYMOUS
 # at: /etc/condor/config.d/01-submit.config, line 3, use SECURITY:get_htcondor_idtokens+9
 # raw: SEC_CLIENT_AUTHENTICATION_METHODS = $(SEC_DEFAULT_AUTHENTICATION_METHODS),ANONYMOUS
ALLOW_DAEMON = condor@* condor@password
 # at: /etc/condor/config.d/01-submit.config, line 3, use SECURITY:recommended_v24_0+10
 # raw: ALLOW_DAEMON = condor@* condor@password
TRUST_DOMAIN = htc24s-cm.cern.ch
 # at: /etc/condor/config.d/01-submit.config, line 3, use SECURITY:get_htcondor_idtokens+20
 # raw: TRUST_DOMAIN = $(CONDOR_HOST)
[alicesgm@htc24s-ce ~]$
And the tokens:

[root@htc24s-ce ~]# condor_token_list && condor_token_list -dir /etc/condor-ce/tokens.d
Header: {"alg":"HS256","kid":"POOL"} Payload: {"iat":1739038309,"iss":"htc24s-cm.cern.ch","jti":"b5175124e4a8e4c41d4141e25e0b0633","sub":"condor@xxxxxxxxxxxxxxxxx"} File: /etc/condor/tokens.d/condor@xxxxxxxxxxxxxxxxx
Header: {"alg":"HS256","kid":"POOL"} Payload: {"iat":1739038309,"iss":"htc24s-cm.cern.ch","jti":"b5175124e4a8e4c41d4141e25e0b0633","sub":"condor@xxxxxxxxxxxxxxxxx"} File: /etc/condor-ce/tokens.d/condor@xxxxxxxxxxxxxxxxx
[root@htc24s-ce ~]#
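To compare the identities in several token directories without reading the JSON by eye, the listing can be reduced to its issuer and subject fields. This is a rough sketch only: it assumes the one-line "Header: {...} Payload: {...} File: ..." layout seen in the output above, the function name parse_token_list is hypothetical, and the sample token uses a placeholder domain rather than real data.

```python
import json


def parse_token_list(text):
    """Extract kid, iss, sub and file path from condor_token_list-style lines.

    Assumes each token is printed on one line as
    'Header: {json} Payload: {json} File: path' (layout taken from this thread).
    """
    decoder = json.JSONDecoder()
    tokens = []
    for line in text.splitlines():
        if not line.startswith("Header:"):
            continue  # skip shell prompts and anything else
        # raw_decode parses one JSON object and returns where it ended.
        header, end = decoder.raw_decode(line, line.index("{"))
        payload, end = decoder.raw_decode(line, line.index("{", end))
        _, _, path = line[end:].partition("File:")
        tokens.append({
            "kid": header.get("kid"),
            "iss": payload.get("iss"),
            "sub": payload.get("sub"),
            "file": path.strip(),
        })
    return tokens


# Placeholder sample, not a real token.
sample = (
    'Header: {"alg":"HS256","kid":"POOL"} '
    'Payload: {"iat":1739038309,"iss":"cm.example.org","sub":"condor@example.org"} '
    'File: /etc/condor/tokens.d/condor@example.org'
)
for token in parse_token_list(sample):
    print(token["sub"], token["iss"], token["file"])
```

Feeding it the output captured for root and for alicesgm makes it easy to see which "sub" identity each account would present.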
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Cole Bollig via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Monday, February 10, 2025 4:40 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Cole Bollig <cabollig@xxxxxxxx>
Subject: Re: [HTCondor-users] v24.0.4 condor_submit only works sometimes

Hi Maarten,

In addition to the information Geonmo mentioned to check, is the configuration value SCHEDD_HOST defined in the configuration (condor_config_val -v SCHEDD_HOST), and when the job submission succeeds, who is the owner in the job(s) ClassAd?

Another thing that might be helpful/interesting is comparing the output of a successful and a failed job submission when doing condor_submit -debug D_SECURITY <submit file>.

-Cole Bollig
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Geonmo <geonmo@xxxxxxxxxxx>
Sent: Sunday, February 9, 2025 8:00 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] v24.0.4 condor_submit only works sometimes

Hello, Maarten.
Could you share some variables using condor_config_val?
- ALLOW_WRITE
- SCHEDD_NAME
- QUEUE_SUPER_USERS
- SEC_DEFAULT_AUTHENTICATION_METHODS (+ SEC_DAEMON_AUTHENTICATION_METHODS, if it exists)
- SEC_CLIENT_AUTHENTICATION_METHODS
- ALLOW_DAEMON
- TRUST_DOMAIN
In addition, could you share the IDTOKEN information?
[In a root shell: condor_token_list && condor_token_list -dir /etc/condor-ce/tokens.d]
The error we experienced is a little different from the message you showed, but the user information of the IDTOKENS used by the HTCondor-CE Daemon was not in the ALLOW_WRITE list of HTCondor, so it was rejected.
I solved it by simply overwriting the IDTOKENS in /etc/condor/tokens.d/ with /etc/condor-ce/tokens.d/, but I don't know if it's the right solution.
However, that seems to be an issue with submitting jobs to HTCondor via HTCondor-CE; it does not explain why submission to HTCondor itself fails.
Regards,
-- Geonmo
------ Original Message ------
From: Maarten Litmaath via HTCondor-users <htcondor-users@xxxxxxxxxxx>
To: "htcondor-users@xxxxxxxxxxx" <htcondor-users@xxxxxxxxxxx>
Cc: Maarten Litmaath <Maarten.Litmaath@xxxxxxx>
Received: 2025-02-09 (Sun) 22:19:58
Subject: Re: [HTCondor-users] v24.0.4 condor_submit only works sometimes
Hi again,

with an HTCondor CE installed in addition on the Submit Node, jobs are accepted by the CE, but refused by the latter's Schedd:

02/09/25 14:07:15 (pid:10651) SetEffectiveOwner security violation: attempting to set owner to dis-allowed value alicesgm@xxxxxxxxxxxxxxxxx

Further advice would be appreciated, thanks!
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Maarten Litmaath via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Sunday, February 9, 2025 1:35 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: Maarten Litmaath <Maarten.Litmaath@xxxxxxx>
Subject: Re: [HTCondor-users] v24.0.4 condor_submit only works sometimes

Hi again,

using the current version in the Feature Channel, v24.4.0, all works fine, while the LTS Channel has the problem described below.

We do not want to advise our sites to switch to the Feature Channel, because we usually prefer the stability of the LTS Channel...

The two Channels appear to have some unwanted difference, for which I did not yet find a clue in the Feature Channel release notes...
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Maarten Litmaath via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Sunday, February 9, 2025 12:30 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: Maarten Litmaath <Maarten.Litmaath@xxxxxxx>
Subject: [HTCondor-users] v24.0.4 condor_submit only works sometimes

Dear HTCondor experts,

I have set up a v24.0.4 mini cluster on Alma 9 using the Admin Quick Start Guide:

As an unprivileged user on the Submit Node, condor_submit fails as shown:

======================================================================
[alicesgm@htc24s-ce ~]$ cat my-test.jdl
cmd = my-test.sh
output = my-test.out.$(ClusterId)
error = my-test.err.$(ClusterId)
log = my-test.log.$(ClusterId)
+MaxMemory = 50
queue 1
[alicesgm@htc24s-ce ~]$ condor_submit my-test.jdl
Submitting job(s).
ERROR: Failed to commit job submission into the queue.
ERROR: Failed to create new User record for condor@xxxxxxxx
[alicesgm@htc24s-ce ~]$
======================================================================

If I keep trying, though, eventually it works:

======================================================================
[alicesgm@htc24s-ce ~]$ for i in `seq 30`; do condor_submit my-test.jdl &&
break; sleep 61; done &>> log-$$.txt < /dev/null &
[1] 33484
[alicesgm@htc24s-ce ~]$ tail -f log-$$.txt
Submitting job(s).
ERROR: Failed to commit job submission into the queue.
ERROR: Failed to create new User record for condor@xxxxxxxx
Submitting job(s).
ERROR: Failed to commit job submission into the queue.
ERROR: Failed to create new User record for condor@xxxxxxxx
Submitting job(s).
ERROR: Failed to commit job submission into the queue.
ERROR: Failed to create new User record for condor@xxxxxxxx
Submitting job(s).
ERROR: Failed to commit job submission into the queue.
ERROR: Failed to create new User record for condor@xxxxxxxx
Submitting job(s).
1 job(s) submitted to cluster 19.
======================================================================

That job then runs fine, while the next job submission will fail again, etc.

There appear to be two problems here:

1) The Admin Quick Start Guide gives me a cluster that does not work.
2) Due to some bug, job submissions sometimes get through nonetheless.

Advice would be appreciated, thanks!