Re: [HTCondor-users] X509 error: "unsupported version" when submitting jobs with a token and a proxy: v9 to v24



HTCondor still supports the use of X.509 proxies, both for use by the job and for authentication to the HTCondor daemons (using the SSL method with appropriate configuration). This error looks like a problem we very recently fixed where OpenSSL 3.4.0 doesn't like the Certificate Signing Request HTCondor generates as part of delegating the proxy. (Credit to John Thiltges and the ARC CE team for the fix.)

Details are here:
https://opensciencegrid.atlassian.net/browse/HTCONDOR-2904

The bug fix will be included in the next regular set of HTCondor releases. In the meantime, you can configure HTCondor to copy X.509 proxy files over the network instead of delegating them, by setting this now-misleadingly-named configuration parameter:

DELEGATE_JOB_GSI_CREDENTIALS = False
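
As a rough illustration only (not part of the fix itself): HTCondor lets a configuration parameter be overridden with a `_CONDOR_`-prefixed environment variable, the same mechanism used for `_CONDOR_SCITOKENS_FILE` in the original message below. Assuming the setting takes effect on the submitting side, which this thread does not confirm, a minimal Python sketch of applying the workaround per submission could look like this. The helper name `env_without_delegation` is hypothetical.

```python
import os


def env_without_delegation(base_env=None):
    """Return a copy of the environment with proxy delegation disabled.

    Assumption: the condor_submit client honours the override, so the
    proxy file is copied over the network instead of being delegated.
    """
    env = dict(os.environ if base_env is None else base_env)
    # _CONDOR_<PARAM> overrides the configuration parameter <PARAM>.
    env["_CONDOR_DELEGATE_JOB_GSI_CREDENTIALS"] = "False"
    return env


if __name__ == "__main__":
    # Example values taken from the original message in this thread.
    env = env_without_delegation({"X509_USER_PROXY": "/tmp/tmpcir64bnk"})
    for name in sorted(env):
        print(f"{name}={env[name]}")
```

The returned mapping could be passed as `env=` to `subprocess.run` when invoking `condor_submit`. Setting the parameter in the HTCondor configuration files, as shown above, applies it to every submission instead.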

 - Jaime

On Mar 7, 2025, at 10:10 AM, Maarten Litmaath via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

Hi Alexandre,
perhaps the issue arises due to this line:

universe = vanilla

ALICE jobs have:

universe = grid

They are submitted to a local HTCondor cluster, which then takes
care of submitting the job to the CE that was indicated in the JDL:

grid_resource = condor ce.some.domain ce.some.domain:9619



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Alexandre Boyer via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Friday, March 7, 2025 11:45 AM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: Alexandre Franck Boyer <alexandre.franck.boyer@xxxxxxx>
Subject: [HTCondor-users] X509 error: "unsupported version" when submitting jobs with a token and a proxy: v9 to v24
 
Dear HTCondor experts,

I hope you are doing well!

Context:
========

I have the following submission script:

```
# Environment
# -----------
universe = vanilla

# Inputs/Outputs
# --------------
# Inputs: executable to submit
executable = /tmp/script.sh

# Directory that will contain the outputs
initialdir = /tmp/initial_dir

# Outputs: stdout, stderr, log
output = $(Cluster).$(Process).out
error = $(Cluster).$(Process).err
log = $(Cluster).$(Process).log

# No other files are to be transferred
transfer_output_files = ""

# Transfer outputs, even if the job is failed
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT

# Environment variables to pass to the job
environment = "PILOT_STAMP=$(stamp) HTCONDOR_JOBID=$(Cluster).$(Process)"

# Credentials
# -----------
use_x509userproxy = true
use_scitokens = true
scitokens_file = /tmp/token.token

# Requirements
# ------------
request_cpus = 1

# Exit options
# ------------
# Specify the signal sent to the job when HTCondor needs to vacate the worker node
kill_sig=SIGTERM
# By default, HTCondor marks jobs as completed regardless of their exit status
# This option marks jobs as Held if they don't finish successfully
=!= 0
# A subcode of our choice to identify who put the job on hold
# Jobs are then deleted from the system after N days if they are not idle or running
periodic_remove = (JobStatus != 1) && (JobStatus != 2) && ((time() - EnteredCurrentStatus) > (1 * 24 * 3600))

Queue stamp in d962af4e2da4895439e94e5c01a1a305
```

I am submitting this JDL to various HTCondor instances with the
following environment variables:

```
X509_USER_PROXY=/tmp/tmpcir64bnk
_CONDOR_SEC_CLIENT_AUTHENTICATION_METHODS="SCITOKENS"
_CONDOR_SCITOKENS_FILE=/tmp/token.token
...
```

The authentication is done through SCITOKENS but I still need to include
a proxy.

Problem:
========

I have been using HTCondor v9.0 for years and everything has been fine.
I recently upgraded to HTCondor v24 and started getting the
following error when submitting jobs:

```
$ condor_submit -terse -pool <htcondor instance>:9619 -remote <htcondor instance> -debug
03/05/25 16:30:54 Delegation error: 4068D7FDB37F0000:error:05800091:x509 certificate routines:X509_REQ_verify_ex:unsupported version:crypto/x509/x_all.c:47:

03/05/25 16:30:54 Delegation error:
03/05/25 16:30:54 ReliSock::put_x509_delegation(): delegation failed: X509Credential::Delegate() failed
03/05/25 16:30:54 Transfer exit info: Success = False | Error[13.115] = '|Error: sending file /tmp/tmpcir64bnk' | Ack = DOWNLOAD | Line = 5482 | Files = 0 | Retry = True
03/05/25 16:30:54 DoUpload: SUBMIT at 188.185.73.26 failed to send file(s) to <htcondor_instance:9619>: |Error: sending file /tmp/tmpcir64bnk; SCHEDD at <htcondor_instance> - |Error: receiving file /var/lib/condor-ce/spool/4429/0/cluster2214429.proc0.subproc0.tmp/tmpcir64bnk

DCSchedd::spoolJobFiles:7002:File transfer failed for target job 2214429.0: SUBMIT at <address> failed to send file(s) to <htcondorinstance:9619>: |Error: sending file /tmp/tmpcir64bnk; SCHEDD at 131.169.223.136 - |Error: receiving file /var/lib/condor-ce/spool/4429/0/cluster2214429.proc0.subproc0.tmp/tmpcir64bnk

ERROR: Failed to spool job files.
```

Have you ever seen that?
In the changelog (https://htcondor.org/htcondor/release-highlights/), I
see an entry that seems related in the 9.2.0 release:

```
Fix problem where proxy delegation to older HTCondor versions failed
```

The error is triggered by the "use_x509userproxy" option.

Attempts to solve the issue:
============================

- Removing "use_x509userproxy" option from the JDL:
   - "fixes" the issue, but sites still need to get the token along with
a proxy so I can't just drop it.
   - or at least they need to get VOMS attributes from it, so maybe
adding the VOMS attributes to the JDL is a possibility, but this needs to
be discussed.

- Replacing "use_x509userproxy" option with
`x509userproxy = "/tmp/tmpcir64bnk"`
   - leads to the same issue

- Checking the validity of the used proxy:
   - `voms-proxy-info -all -file <proxy>` gives me the details of the
proxy correctly
   - The version of the proxy seems fine `Version: 3 (0x2)`
   - However, the error seems to refer to the version of a CSR
rather than that of the proxy:
https://github.com/openssl/openssl/blob/master/crypto/x509/x_all.c#L43C1-L49C6
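
One way to tell this failure apart from a genuinely invalid proxy is to look for the OpenSSL routine name in the `condor_submit -debug` output. A minimal Python sketch, assuming the message text stays as quoted in the log above; the function name `is_csr_version_error` is hypothetical:

```python
import re


def is_csr_version_error(debug_output: str) -> bool:
    """Return True if the output contains the CSR 'unsupported version' error.

    Whitespace is collapsed first because mail clients and terminals may
    wrap the OpenSSL error string across several lines.
    """
    text = re.sub(r"\s+", " ", debug_output)
    return "X509_REQ_verify_ex:unsupported version" in text


if __name__ == "__main__":
    sample = (
        "03/05/25 16:30:54 Delegation error: 4068D7FDB37F0000:error:05800091:x509\n"
        "certificate routines:X509_REQ_verify_ex:unsupported\n"
        "version:crypto/x509/x_all.c:47:\n"
    )
    print(is_csr_version_error(sample))
```

A match points at the request generated during delegation (`X509_REQ` is OpenSSL's certificate request type), not at the proxy certificate itself.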


Thanks a lot for your support!
Should you need any further details, please let me know.

Best regards,
Alexandre Boyer

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

Join us in June at Throughput Computing 25: https://osg-htc.org/htc25

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/