
Re: [HTCondor-users] X509 error: "unsupported version" when submitting jobs with a token and a proxy: v9 to v24



Hi Thomas,


Thanks for your response!

> with the transition to tokens, the support for Globus has been deprecated. I guess that the authz flow behind x509userproxy has been dropped for good with v24.

Alright, good to know. I don't see anything related to that in the changelog.

The options related to proxies are still part of the documentation: 

- https://htcondor.readthedocs.io/en/24.0/man-pages/condor_submit.html#use_x509userproxy

- https://htcondor.readthedocs.io/en/24.0/man-pages/condor_submit.html#x509userproxy

> If you need a proxy in the job, I would stage it like any other file and set the environment variable accordingly.

As a user I don't need to have a proxy along with the job, but WLCG site administrators need to extract VOMS attributes from it. I think they rely on the following mechanism: "x509userproxy is relevant when the universe is vanilla, or when the universe is grid and the type of grid system is one of condor, or arc. Defining a value causes the proxy to be delegated to the execute machine. Further, VOMS attributes defined in the proxy will appear in the job ClassAd." - https://htcondor.readthedocs.io/en/24.0/man-pages/condor_submit.html#x509userproxy
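For illustration, here is a minimal Python sketch of how those attributes could be picked out of a job ad printed in long form (for example by `condor_q -long`). The helper name `x509_attributes` and the sample ad are hypothetical, and the attribute names are the ones I understand HTCondor to derive from a VOMS proxy, so please treat them as an assumption to check against the job ClassAd documentation.

```python
def x509_attributes(long_ad: str) -> dict:
    """Return the x509* attributes found in a ClassAd printed in long form."""
    found = {}
    for line in long_ad.splitlines():
        name, sep, value = line.partition(" = ")
        # ClassAd attribute names are case-insensitive, so compare lowercased.
        if sep and name.strip().lower().startswith("x509"):
            found[name.strip()] = value.strip().strip('"')
    return found


# Made-up sample ad (assumption); real values would come from the delegated proxy.
SAMPLE_AD = '''\
ClusterId = 1
Owner = "pilot"
x509userproxysubject = "/DC=org/DC=example/CN=Some User"
x509UserProxyVOName = "examplevo"
x509UserProxyFirstFQAN = "/examplevo/Role=pilot/Capability=NULL"
'''

for name, value in x509_attributes(SAMPLE_AD).items():
    print(f"{name}: {value}")
```

If the proxy is no longer delegated, none of these attributes would be present in the ad, which is what the sites would notice first.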


Thanks,

Alexandre Boyer





On 3/7/25 13:22, Thomas Hartmann wrote:
Hi Alexandre,

with the transition to tokens, the support for Globus has been deprecated. I guess that the authz flow behind x509userproxy has been dropped for good with v24.

If you need a proxy in the job, I would stage it like any other file and set the environment variable accordingly. We suggest that our local users follow along the lines of

  transfer_input_files  = YOURPROXYFILE.PEM
  environment = "X509_USER_PROXY=${HOME}/YOURPROXYFILE.PEM"

when they need a proxy within their jobs' contexts.

Cheers,
  Thomas

On 07/03/2025 11.45, Alexandre Boyer via HTCondor-users wrote:
Dear HTCondor experts,

I hope you are doing well!

Context:
========

I have the following submission script:

```
# Environment
# -----------
universe = vanilla

# Inputs/Outputs
# --------------
# Inputs: executable to submit
executable = /tmp/script.sh

# Directory that will contain the outputs
initialdir = /tmp/initial_dir

# Outputs: stdout, stderr, log
output = $(Cluster).$(Process).out
error = $(Cluster).$(Process).err
log = $(Cluster).$(Process).log

# No other files are to be transferred
transfer_output_files = ""

# Transfer outputs, even if the job is failed
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT

# Environment variables to pass to the job
environment = "PILOT_STAMP=$(stamp) HTCONDOR_JOBID=$(Cluster).$(Process)"

# Credentials
# -----------
use_x509userproxy = true
use_scitokens = true
scitokens_file = /tmp/token.token

# Requirements
# ------------
request_cpus = 1

# Exit options
# ------------
# Specify the signal sent to the job when HTCondor needs to vacate the worker node
kill_sig=SIGTERM
# By default, HTCondor marks jobs as completed regardless of their exit status
# This option puts jobs on hold if they don't finish successfully
=!= 0
# A subcode of our choice to identify who put the job on hold

# Jobs are then deleted from the system after N days if they are not idle or running
periodic_remove = (JobStatus != 1) && (JobStatus != 2) && ((time() - EnteredCurrentStatus) > (1 * 24 * 3600))

Queue stamp in d962af4e2da4895439e94e5c01a1a305
```

I am submitting this JDL to various HTCondor instances with the following environment variables:

```
X509_USER_PROXY=/tmp/tmpcir64bnk
_CONDOR_SEC_CLIENT_AUTHENTICATION_METHODS="SCITOKENS"
_CONDOR_SCITOKENS_FILE=/tmp/token.token
...
```

The authentication is done through SCITOKENS but I still need to include a proxy.

Problem:
========

I have been using HTCondor v9.0 for years and everything has been fine.
I recently decided to upgrade to condor v24 but started to get the following error when submitting jobs:

```
$ condor_submit -terse -pool <htcondor instance>:9619 -remote <htcondor instance> -debug
03/05/25 16:30:54 Delegation error: 4068D7FDB37F0000:error:05800091:x509 certificate routines:X509_REQ_verify_ex:unsupported version:crypto/x509/x_all.c:47:

03/05/25 16:30:54 Delegation error:
03/05/25 16:30:54 ReliSock::put_x509_delegation(): delegation failed: X509Credential::Delegate() failed
03/05/25 16:30:54 Transfer exit info: Success = False | Error[13.115] = '|Error: sending file /tmp/tmpcir64bnk' | Ack = DOWNLOAD | Line = 5482 | Files = 0 | Retry = True
03/05/25 16:30:54 DoUpload: SUBMIT at 188.185.73.26 failed to send file(s) to <htcondor_instance:9619>: |Error: sending file /tmp/tmpcir64bnk; SCHEDD at <htcondor_instance> - |Error: receiving file /var/lib/condor-ce/spool/4429/0/cluster2214429.proc0.subproc0.tmp/tmpcir64bnk

DCSchedd::spoolJobFiles:7002:File transfer failed for target job 2214429.0: SUBMIT at <address> failed to send file(s) to <htcondorinstance:9619>: |Error: sending file /tmp/tmpcir64bnk; SCHEDD at 131.169.223.136 - |Error: receiving file /var/lib/condor-ce/spool/4429/0/cluster2214429.proc0.subproc0.tmp/tmpcir64bnk
ERROR: Failed to spool job files.
```

Have you ever seen that?
In the changelog (https://htcondor.org/htcondor/release-highlights/), I see an entry that seems related in the 9.2.0 release:

```
Fix problem where proxy delegation to older HTCondor versions failed
```

The error is triggered by the "use_x509userproxy" option.

Attempts to solve the issue:
============================

- Removing "use_x509userproxy" option from the JDL:
   - "fixes" the issue, but sites still need to get the token along with a proxy so I can't just drop it.
   - or at least they need to get VOMS attributes from it, so maybe adding the VOMS attributes to the JDL is a possibility, but this needs to be discussed.

- Replacing the "use_x509userproxy" option with "x509userproxy = /tmp/tmpcir64bnk"
   - leads to the same issue

- Checking the validity of the used proxy:
   - `voms-proxy-info -all -file <proxy>` gives me the details of the proxy correctly
   - The version of the proxy seems fine `Version: 3 (0x2)`
   - But the error does not seem to reference the version of the proxy but the one of a CSR: https://github.com/openssl/openssl/blob/master/crypto/x509/x_all.c#L43C1-L49C6
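
To make the distinction between the two version fields concrete: the `Version: 3 (0x2)` above is the X.509 certificate version, whereas a PKCS#10 request (CSR) carries its own version field, for which RFC 2986 defines only the value 0. The sketch below reads that field from DER bytes with a tiny hand-written parser. The helper names `read_tlv` and `csr_version` are hypothetical, and the two byte strings are hand-built stubs, not real requests. Whether the CSR exchanged during delegation really carries a non-zero version is only my assumption from the error text; I have not captured one.

```python
def read_tlv(data: bytes, offset: int):
    """Read one DER tag-length-value; return (tag, value_start, value_end)."""
    tag = data[offset]
    length = data[offset + 1]
    pos = offset + 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    return tag, pos, pos + length


def csr_version(der: bytes) -> int:
    """Return the raw version INTEGER of a DER-encoded PKCS#10 request."""
    tag, start, _ = read_tlv(der, 0)        # CertificationRequest SEQUENCE
    if tag != 0x30:
        raise ValueError("outer element is not a SEQUENCE")
    tag, start, _ = read_tlv(der, start)    # CertificationRequestInfo SEQUENCE
    if tag != 0x30:
        raise ValueError("request info is not a SEQUENCE")
    tag, start, end = read_tlv(der, start)  # version INTEGER
    if tag != 0x02:
        raise ValueError("first field is not an INTEGER")
    return int.from_bytes(der[start:end], "big", signed=True)


# Hand-built stubs (assumption): only the nesting and the version field.
STUB_VERSION_0 = bytes.fromhex("30053003020100")
STUB_VERSION_2 = bytes.fromhex("30053003020102")

print(csr_version(STUB_VERSION_0))
print(csr_version(STUB_VERSION_2))
```

A real request in PEM form would first need converting to DER, for example with `openssl req -in <request> -outform DER`, before being passed to `csr_version`.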


Thanks a lot for your support!
Should you need any further details, please let me know.

Best regards,
Alexandre Boyer
