Re: [HTCondor-users] Problem with scitokens-cpp v1.3.0 and HTCondor-CEs



Hi Jaime,

 

The cache dir was the default for us, yes. As I said, I don't know how it ended up owned by root, especially because it happened on only a small subset of our CEs. One possible explanation would be that the daemon was run as root at some point, I guess, but I can't see how we could have done that.

 

Even weirder is that the same thing happened to the CMS APs, although in their case I think the cache dir was not the default one, from what Stefano told me.

 

Cheers,

    Antonio

 

 

From: Jaime Frey <jfrey@xxxxxxxxxxx>
Sent: Wednesday, February 18, 2026 8:55 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Antonio Delgado Peris <Antonio.Delgado.Peris@xxxxxxx>
Subject: Re: [HTCondor-users] Problem with scitokens-cpp v1.3.0 and HTCondor-CEs

 

What was the value of SEC_SCITOKENS_CACHE on the hosts where it was owned by root? If it's the default of /var/run/condor-ce/cache, I'm left wondering how the directory got created owned by root.

 

 - Jaime



On Feb 18, 2026, at 8:50 AM, Antonio Delgado Peris via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

 

Hi again, 

 

We found the problem. The hosts that were failing had the $SEC_SCITOKENS_CACHE directory owned by root, and empty. The hosts that were working had it owned by condor and containing a `scitokens` directory. Apparently, previous versions either didn't use the cache or ignored it if it was not there. The newest version seems to hang waiting for the cache until the timeout is reached.

 

Changing ownership to condor solves the problem. Stefano also confirmed it worked for him.
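
For anyone wanting to check other hosts for the same condition, here is a minimal Python sketch of the check we effectively did by hand. It is not part of HTCondor or scitokens-cpp: the function names `owner_name` and `cache_dir_problems` are made up for illustration, and the expected owner `condor` is simply what we saw on the working hosts.

```python
import os
import pwd


def owner_name(uid):
    """Map a numeric uid to a user name, falling back to the number itself."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def cache_dir_problems(path, expected_owner="condor"):
    """Return a list of problems found with the cache directory (empty if none)."""
    if not os.path.isdir(path):
        return ["%s does not exist or is not a directory" % path]

    problems = []
    # The failing hosts had the directory owned by root instead of condor.
    owner = owner_name(os.stat(path).st_uid)
    if owner != expected_owner:
        problems.append("%s is owned by %s, expected %s" % (path, owner, expected_owner))

    # The failing hosts also had an empty directory (no `scitokens` subdirectory).
    try:
        if not os.listdir(path):
            problems.append("%s is empty" % path)
    except PermissionError:
        problems.append("%s cannot be listed by the current user" % path)
    return problems
```

Calling `cache_dir_problems(path)` with the value of SEC_SCITOKENS_CACHE on a host should return an empty list where the directory looks like it did on our working CEs, and one or two messages where it looks like it did on the failing ones.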

 

Now, why was that dir owned by root? I don't know. If I delete the dir, condor creates it on the fly with condor ownership. It must have been there already before the package update, but I don't know how it happened.

 

Probably the package could do a better job bypassing the cache if it is not usable, or at least reporting the problem more clearly.

 

Cheers,

    Antonio

 

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Antonio Delgado Peris via HTCondor-users
Sent: Wednesday, February 18, 2026 10:56 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Antonio Delgado Peris <Antonio.Delgado.Peris@xxxxxxx>
Subject: [HTCondor-users] Problem with scitokens-cpp v1.3.0 and HTCondor-CEs

 

Hello!

 

At CERN, we have observed that some of our CEs were failing all scitoken validations after the package scitokens-cpp was updated to v1.3.0 (and, apparently, after a restart of the service). The symptom is the following entry in the CE's SchedLog:

 

02/17/26 16:22:09 DC_AUTHENTICATE: reason for authentication failure: AUTHENTICATE:1006:exceeded 1771341689 deadline during authentication|SCITOKENS:2:Failed to verify token and generate ACLs: Timeout when loading the OIDC metadata.|AUTHENTICATE:1004:Failed to authenticate using IDTOKENS|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to lstat(/tmp/FS_XXXT6KLMa)
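
The failure reason above reads as a chain of entries separated by `|`, and the SCITOKENS entry is the one that matters here. In case it helps when scanning logs on other CEs, here is a small Python sketch for picking it out of a log line. The function names `split_auth_failure` and `is_oidc_timeout` are hypothetical, and the `|`-separated layout is an assumption based only on this one line.

```python
MARKER = "reason for authentication failure: "


def split_auth_failure(line):
    """Split the failure reason of a DC_AUTHENTICATE log line into its entries."""
    _, found, reason = line.partition(MARKER)
    if not found:
        return []
    return [entry.strip() for entry in reason.split("|") if entry.strip()]


def is_oidc_timeout(line):
    """True if the line carries the SCITOKENS OIDC metadata timeout seen above."""
    return any(
        entry.startswith("SCITOKENS:")
        and "Timeout when loading the OIDC metadata" in entry
        for entry in split_auth_failure(line)
    )
```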

 

We see that the timeout error message comes from scitokens-cpp/src/scitokens_internal.h, line 838 in bd686d1:

"Timeout when loading the OIDC metadata.");

 

It hasn't affected all CEs, only a few of them. We don't know what other condition triggers the problem. However, downgrading the package to scitokens-cpp-1.1.3 has solved the problem in all cases.

 

Has somebody else seen something similar? Any idea about what we could do to further debug this issue?

 

PS: I have also reported this at https://github.com/scitokens/scitokens-cpp/issues/202

 

Cheers,
Antonio

 

 

 

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to
htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

The archives can be found at:
https://www-auth.cs.wisc.edu/lists/htcondor-users/