
Re: [HTCondor-users] Env vars to make htcondor.Schedd() Just Work?



Hi Todd,

I made a minimal docker compose setup so you can see exactly what I'm doing and play around with it yourself if you want:
https://github.com/MrCreosote/htcondor-mini-test-docker-compose/tree/main

The python bindings test works the same way as above - argumentless Schedd() and Collector() calls result in a failure but explicitly providing the collector host allows jobs to run with the IDTOKEN.
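For reference, the workaround described above can be sketched as a small helper. This is a minimal sketch, not a confirmed recommendation: `resolve_collector_host` and `remote_schedd` are hypothetical names, and the htcondor2 calls are the same ones used in the transcript further down this thread (`Collector(host)`, `locate(DaemonTypes.Schedd)`, `Schedd(ad)`). It assumes `_CONDOR_COLLECTOR_HOST` is set and an IDTOKEN is available to the bindings.

```python
import os


def resolve_collector_host(environ=os.environ):
    """Return the collector host from _CONDOR_COLLECTOR_HOST, or fail loudly."""
    host = environ.get("_CONDOR_COLLECTOR_HOST")
    if not host:
        raise RuntimeError("_CONDOR_COLLECTOR_HOST is not set")
    return host


def remote_schedd(environ=os.environ):
    """Locate the schedd through an explicitly addressed collector.

    htcondor2 is imported lazily so this module loads without the bindings.
    """
    import htcondor2

    # Pass the host explicitly; an argumentless Collector() looks for a
    # local daemon and fails in this setup.
    collector = htcondor2.Collector(resolve_collector_host(environ))
    schedd_ad = collector.locate(htcondor2.DaemonTypes.Schedd)
    return htcondor2.Schedd(schedd_ad)
```

Calling `remote_schedd().submit(sub)` with an `htcondor2.Submit` object then follows the same path as the successful submission in the transcript.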

Frustratingly, I've been unable to reproduce the CLI behavior from earlier in a clean image. The schedd address from condor_status still looks good, but:

* With just the collector host env var specified, it tries to find the local schedd. Previously it was able to connect to the remote schedd.
* With the schedd host also specified, it gets a connection closed error.

So it seems for some reason the commands aren't getting the schedd address from the collector?

Thanks again for your advice so far,

-g

On Fri, Sep 19, 2025 at 5:16 PM Gavin Price <gaprice@xxxxxxx> wrote:
Hi Todd,

Thanks very much for the reply. It seems part of the problem (maybe) is that I didn't have the main htcondor binaries installed, just the python bindings. I assumed the python bindings installed all the necessary dependencies since

* The docs don't mention you need to install the main binaries (https://htcondor.readthedocs.io/en/24.x/apis/python-bindings/install.html)
* I'm able to submit and successfully complete jobs with just the bindings as long as I explicitly supply the host to the Collector() constructor.
  * Setting _CONDOR_COLLECTOR_HOST doesn't seem to affect the need for this - it has to be explicitly set.

When I install the main htcondor binaries using https://htcondor.readthedocs.io/en/latest/getting-htcondor/install-linux-as-root.html, I can get what looks like the correct schedd address:

root@720cc58faf3f:/cts# _CONDOR_COLLECTOR_HOST=htcondor-mini:9618 condor_status -schedd -af Name MyAddress
36956b0aed50 <172.18.0.5:9618?addrs=172.18.0.5-9618&alias=36956b0aed50&noUDP&sock=schedd_21_fb94>
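The MyAddress value above is what HTCondor calls a "sinful string". To check what it actually advertises (host, port, shared-port socket name), it can be split apart with a few lines of Python. This is a rough sketch: `parse_sinful` is a hypothetical helper, not part of the bindings, and it assumes the `<host:port?key=value&flag>` shape seen in this output, with no percent-decoding.

```python
def parse_sinful(address):
    """Split '<host:port?k=v&flag>' into (host, port, params)."""
    if not (address.startswith("<") and address.endswith(">")):
        raise ValueError(f"not a sinful string: {address!r}")
    endpoint, _, query = address[1:-1].partition("?")
    # rpartition keeps a bracketed IPv6 host intact and splits on the last ':'
    host, _, port = endpoint.rpartition(":")
    params = {}
    for item in query.split("&") if query else []:
        key, _, value = item.partition("=")
        params[key] = value  # bare flags such as noUDP map to ''
    return host, int(port), params


host, port, params = parse_sinful(
    "<172.18.0.5:9618?addrs=172.18.0.5-9618&alias=36956b0aed50&noUDP&sock=schedd_21_fb94>"
)
print(host, port, params["sock"])
```

Here the schedd sits behind the shared port 9618 at 172.18.0.5, addressed by the `sock` name.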

However, even though condor sees my token, it seemingly tries to run the job as a different user, even if I try to force token auth:

root@720cc58faf3f:/cts# condor_token_list
Header: {"alg":"HS256","kid":"POOL"} Payload: {"iat":1758317879,"iss":"36956b0aed50","jti":"4b3f60d62f94f85596ebb4dc8a5451e7","sub":"submituser@36956b0aed50"} File: /etc/condor/tokens.d/mini.token
root@720cc58faf3f:/cts# _CONDOR_COLLECTOR_HOST=htcondor-mini:9618 condor_submit ./sleep.submit
Submitting job(s)
ERROR: Failed to create new User record for condor@720cc58faf3f.
The given user is not allowed to own jobs
root@720cc58faf3f:/cts# _CONDOR_COLLECTOR_HOST=htcondor-mini:9618 _CONDOR_SEC_CLIENT_AUTHENTICATION_METHODS=IDTOKENS condor_submit ./sleep.submit
Submitting job(s)
ERROR: Failed to create new User record for condor@password.
The given user is not allowed to own jobs

... which is really odd, since clearly the token *does* work when just the python bindings are installed.
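To double-check which identity the token itself carries, independent of what condor_submit reports, the claims can be decoded directly, since an IDTOKEN is a JWT. A minimal sketch, assuming the token is the usual three dot-separated base64url segments; `decode_idtoken_claims` is a hypothetical helper and it does not verify the signature.

```python
import base64
import json


def decode_idtoken_claims(token):
    """Return (header, payload) dicts from a JWT without verifying it."""
    header_b64, payload_b64, _signature = token.strip().split(".")

    def b64url_json(segment):
        # JWT segments drop base64 padding; restore it before decoding.
        padded = segment + "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return b64url_json(header_b64), b64url_json(payload_b64)
```

Reading the contents of /etc/condor/tokens.d/mini.token and passing them to this function should show the same `sub` of submituser@36956b0aed50 that condor_token_list prints, which is not the condor@720cc58faf3f user named in the error.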

Do I have to have the main condor binaries installed to use the python libs? I'd prefer to avoid that if possible.

I tested the code you provided *without* the main htcondor binaries installed to see if it changed behavior from what I've observed but there's no change for me (see below)

Thanks very much for the help so far,

-g

```
In [1]: import os

In [2]: os.environ["_CONDOR_COLLECTOR_HOST"] = "htcondor-mini:9618"

In [3]: import htcondor2
/usr/local/lib/python3.12/site-packages/htcondor2/__init__.py:27: UserWarning: The environment variable CONDOR_CONFIG is unset and none of the default locations contain a condor_config file. Using /dev/null, instead.
 _warnings.warn(message)

In [4]: schedd = htcondor2.Schedd()
---------------------------------------------------------------------------
HTCondorException             Traceback (most recent call last)
Cell In[4], line 1
----> 1 schedd = htcondor2.Schedd()

File /usr/local/lib/python3.12/site-packages/htcondor2/_schedd.py:114, in Schedd.__init__(self, location)
  112 if location is None:
  113   c = Collector()
--> 114   location = c.locate(DaemonType.Schedd)
  116 if not isinstance(location, classad.ClassAd):
  117   raise TypeError("location must be a ClassAd")

File /usr/local/lib/python3.12/site-packages/htcondor2/_collector.py:179, in Collector.locate(self, daemon_type, name)
  176     return None
  177   return list[0]
--> 179 return _collector_locate_local(self, self._handle, int(daemon_type))

HTCondorException: Unable to locate local daemon.

In [5]: collector = htcondor2.Collector()

In [7]: schedd_ad = collector.locate(htcondor2.DaemonTypes.Schedd)
---------------------------------------------------------------------------
HTCondorException             Traceback (most recent call last)
Cell In[7], line 1
----> 1 schedd_ad = collector.locate(htcondor2.DaemonTypes.Schedd)

File /usr/local/lib/python3.12/site-packages/htcondor2/_collector.py:179, in Collector.locate(self, daemon_type, name)
  176     return None
  177   return list[0]
--> 179 return _collector_locate_local(self, self._handle, int(daemon_type))

HTCondorException: Unable to locate local daemon.

In [8]: collector = htcondor2.Collector(os.environ["_CONDOR_COLLECTOR_HOST"])

In [9]: schedd_ad = collector.locate(htcondor2.DaemonTypes.Schedd)

In [10]: schedd_ad["MyAddress"]
Out[10]: '<172.18.0.5:9618?addrs=172.18.0.5-9618&alias=36956b0aed50&noUDP&sock=schedd_21_fb94>'

In [12]: sub = htcondor2.Submit({
  ...:   "executable": "/bin/sleep",
  ...:   "initialdir": "/tmp",
  ...:   "arguments": "30",
  ...:   "output": "/tmp/sleep.out",
  ...:   "error": "/tmp/sleep.err",
  ...:   "log": "/tmp/sleep.log",
  ...: })

In [14]: schedd = htcondor2.Schedd(schedd_ad)

In [15]: clid = schedd.submit(sub)

In [16]: clid.cluster()
Out[16]: 3
```

On Fri, Sep 19, 2025 at 3:28 PM Todd L Miller via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
    I realized I should send along the script I actually tested. It
didn't succeed, because I haven't gotten around to fixing the
authentication issue, but commenting out the SCHEDD_HOST line changes the
error message from one about SSL to one about not being able to locate the
"local" schedd, which ought to be sufficient to prove the point.

    Note that in these examples I'm actually using an FQDN for
SCHEDD_HOST, which sometimes matters (even though it shouldn't). You can
try setting SCHEDD_NAME in the container to a string containing an '@'
sign to bypass any weirdness there. (Restart HTCondor after doing so.)

-- ToddM

#!/usr/bin/env python3

import os

# Set the overrides before importing htcondor2, as in the session above.
os.environ['_CONDOR_COLLECTOR_HOST'] = "minicondor:9618"
os.environ['_CONDOR_SCHEDD_HOST'] = "minicondor"

import htcondor2

s = htcondor2.Schedd()

sub = htcondor2.Submit({
    "executable":   "/bin/sleep",
    "initialdir":   "/tmp",
    "arguments":    "30",
    "output":       "/tmp/sleep.out",
    "error":        "/tmp/sleep.err",
    "log":          "/tmp/sleep.log",
})

s.submit(sub)