
Re: [HTCondor-users] Env vars to make htcondor.Schedd() Just Work?



I should probably mention that if I don't include the SCHEDD_HOST env var and config file items, then things fail earlier - it seems that SCHEDD_HOST makes the collector just use that value rather than looking it up.

I still don't understand how to get Collector() to use the CONDOR_HOST or COLLECTOR_HOST env var / config file items, though. From the docs it seems like those should work.

-g

On 9/18/25 16:52, Gavin Price wrote:
Hi all,

I'm still trying to figure this out but thought I'd post progress so far so I don't waste time if someone looks at my old message. I'm getting some behavior changes by setting more configs in env vars and a config file, but I still haven't found a way to make htcondor.Schedd() alone connect to the remote schedd instance correctly. It seems my config is allowing the collector to return an address for the schedd, but it's missing important info in the query parameters, I guess?

Anyway, here's where I'm at. If anyone has any advice, that'd be really helpful.

```
In [1]: import os

In [2]: os.environ["_CONDOR_CONDOR_HOST"]
Out[2]: 'htcondor-mini:9618'

In [3]: os.environ["CONDOR_HOST"]
Out[3]: 'htcondor-mini:9618'

In [4]: os.environ["_CONDOR_COLLECTOR_HOST"]
Out[4]: 'htcondor-mini:9618'

In [5]: os.environ["COLLECTOR_HOST"]
Out[5]: 'htcondor-mini:9618'

In [6]: os.environ["_CONDOR_SCHEDD_HOST"]
Out[6]: 'htcondor-mini:9618'

In [7]: os.environ["SCHEDD_HOST"]
Out[7]: 'htcondor-mini:9618'

In [8]: !cat $CONDOR_CONFIG
COLLECTOR_HOST=htcondor-mini:9618
CONDOR_HOST=htcondor-mini:9618
SCHEDD_HOST=htcondor-mini:9618

In [9]: import htcondor

In [10]: schedd = htcondor.Schedd()

In [11]: sub = htcondor.Submit({
  ...:   "executable": "/bin/sleep",
  ...:   "initialdir": "/tmp",
  ...:   "arguments": "30",
  ...:   "output": "/tmp/sleep.out",
  ...:   "error": "/tmp/sleep.err",
  ...:   "log": "/tmp/sleep.log",
  ...: })

In [12]: clid = schedd.submit(sub)
---------------------------------------------------------------------------
HTCondorIOError              Traceback (most recent call last)
Cell In[12], line 1
----> 1 clid = schedd.submit(sub)

File /usr/local/lib/python3.12/site-packages/htcondor/_lock.py:70, in add_lock.<locals>.wrapper(*args, **kwargs)
     67 try:
     68     acquired = LOCK.acquire()
---> 70     rv = func(*args, **kwargs)
     72     # if the function returned a context manager,
     73     # create a LockedContext to manage the lock
     74     is_cm = is_context_manager(rv)

HTCondorIOError: Failed to connect to schedd.

In [13]: collector = htcondor.Collector()

In [14]: schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)

In [15]: schedd_ad["MyAddress"]
Out[15]: '<172.18.0.5:9618?alias=htcondor-mini>'

In [16]: schedd = htcondor.Schedd(schedd_ad)

In [17]: clid = schedd.submit(sub)
---------------------------------------------------------------------------
HTCondorIOError              Traceback (most recent call last)
Cell In[17], line 1
----> 1 clid = schedd.submit(sub)

File /usr/local/lib/python3.12/site-packages/htcondor/_lock.py:70, in add_lock.<locals>.wrapper(*args, **kwargs)
     67 try:
     68     acquired = LOCK.acquire()
---> 70     rv = func(*args, **kwargs)
     72     # if the function returned a context manager,
     73     # create a LockedContext to manage the lock
     74     is_cm = is_context_manager(rv)

HTCondorIOError: Failed to connect to schedd.

In [18]: collector = htcondor.Collector(os.environ["COLLECTOR_HOST"])

In [19]: schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)

In [20]: schedd_ad["MyAddress"]
Out[20]: '<172.18.0.5:9618?addrs=172.18.0.5-9618&alias=71ad2f98e72a&noUDP&sock=schedd_21_fb94>'

In [21]: schedd = htcondor.Schedd(schedd_ad)

In [22]: clid = schedd.submit(sub)

In [23]: clid.cluster()
Out[23]: 6
```
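To make the difference between the two `MyAddress` values easier to see, here is a small stdlib-only sketch that splits each address into its query parameters. `sinful_params` is a hypothetical helper, not part of the htcondor bindings, and the parsing is a simplification that assumes the `<host:port?key=value&flag>` shape seen in the session above.

```python
def sinful_params(sinful):
    """Split an address like '<host:port?k=v&flag>' into (host_port, params).

    Hypothetical helper for illustration; not an htcondor API.
    """
    body = sinful.strip().lstrip("<").rstrip(">")
    host_port, _, query = body.partition("?")
    params = {}
    for item in query.split("&"):
        if item:
            key, _, value = item.partition("=")
            params[key] = value  # bare flags such as noUDP map to ""
    return host_port, params


# The two MyAddress values copied from the session above.
failing = "<172.18.0.5:9618?alias=htcondor-mini>"
working = (
    "<172.18.0.5:9618?addrs=172.18.0.5-9618"
    "&alias=71ad2f98e72a&noUDP&sock=schedd_21_fb94>"
)

_, failing_params = sinful_params(failing)
_, working_params = sinful_params(working)

# Parameters present only in the address that lets submit() succeed.
missing = sorted(set(working_params) - set(failing_params))
print(missing)  # prints ['addrs', 'noUDP', 'sock']
```

The address returned by the bare `Collector()` lacks `addrs`, `noUDP`, and `sock`, and its `alias` differs. A plausible reading, not confirmed in this thread, is that the missing `sock` is the one that matters, since it appears to name the schedd's socket behind the shared port.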

Thanks in advance for any ideas,

-g

On Tue, Sep 16, 2025 at 7:40 PM Gavin Price <gaprice@xxxxxxx> wrote:
Hi all,

I'm setting up the htcondor Python client in a Docker container with the intent of submitting jobs to a remote cluster. I'm trying to figure out which environment variables I need to set so that I don't need to specify a schedd_ad for Schedd() (or a host for Collector(), which ideally wouldn't be needed either). I've set the CONDOR_HOST env var (see below), which is what I would expect to work, but in my hands it doesn't.

Below is an example of what I want to do vs what actually works:

* htcondor.Schedd() - doesn't work, this is what I want to work
* htcondor.Collector() -> schedd_ad - doesn't work
* htcondor.Collector(<host>) -> schedd_ad -> Schedd(schedd_ad) - works

What env vars do I need to set so the first line works?

Here's what I'm doing:

```
In [1]: import os

In [2]: os.environ["_CONDOR_CONDOR_HOST"]
Out[2]: 'htcondor-mini:9618'

In [3]: os.environ["CONDOR_HOST"]
Out[3]: 'htcondor-mini:9618'

In [4]: import htcondor
/usr/local/lib/python3.12/site-packages/htcondor/__init__.py:49: UserWarning: Neither the environment variable CONDOR_CONFIG, /etc/condor/, /usr/local/etc/, nor ~condor/ contain a condor_config source. Therefore, we are using a null condor_config.
 _warnings.warn(message)

### Schedd() only

In [5]: schedd = htcondor.Schedd()
---------------------------------------------------------------------------
HTCondorLocateError            Traceback (most recent call last)
Cell In[5], line 1
----> 1 schedd = htcondor.Schedd()

File /usr/local/lib/python3.12/site-packages/htcondor/_lock.py:70, in add_lock.<locals>.wrapper(*args, **kwargs)
     67 try:
     68     acquired = LOCK.acquire()
---> 70     rv = func(*args, **kwargs)
     72     # if the function returned a context manager,
     73     # create a LockedContext to manage the lock
     74     is_cm = is_context_manager(rv)

HTCondorLocateError: Unable to locate local daemon

### schedd_ad from Collector()

In [7]: collector = htcondor.Collector()

In [8]: schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)
---------------------------------------------------------------------------
HTCondorLocateError            Traceback (most recent call last)
Cell In[8], line 1
----> 1 schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)

File /usr/local/lib/python3.12/site-packages/htcondor/_lock.py:70, in add_lock.<locals>.wrapper(*args, **kwargs)
     67 try:
     68     acquired = LOCK.acquire()
---> 70     rv = func(*args, **kwargs)
     72     # if the function returned a context manager,
     73     # create a LockedContext to manage the lock
     74     is_cm = is_context_manager(rv)

HTCondorLocateError: Unable to locate local daemon

### Full setup

In [9]: collector = htcondor.Collector("htcondor-mini:9618")

In [10]: schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)

In [11]: schedd = htcondor.Schedd(schedd_ad)

In [12]: sub = htcondor.Submit({
  ...:   "executable": "/bin/sleep",
  ...:   "initialdir": "/tmp",
  ...:   "arguments": "30",
  ...:   "output": "/tmp/sleep.out",
  ...:   "error": "/tmp/sleep.err",
  ...:   "log": "/tmp/sleep.log",
  ...: })

In [13]: cluster_id = schedd.submit(sub)

In [15]: cluster_id.cluster()
Out[15]: 2
```
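For reference, the one path that works in the session above (an explicit collector host, then `locate`, then `Schedd(schedd_ad)`) can be wrapped in a small helper. This is only a sketch of the workaround, not the env-var-only setup I'm after: `schedd_via_collector` is a name made up here, and the only htcondor calls it uses are the ones already shown in the session.

```python
def schedd_via_collector(collector_host):
    """Return a Schedd handle located through an explicitly named collector.

    Hypothetical helper; it mirrors the "Full setup" steps from the session:
    Collector(<host>) -> locate(Schedd) -> Schedd(schedd_ad).
    """
    # Imported lazily so the helper can be defined on machines where the
    # htcondor bindings are not installed.
    import htcondor

    collector = htcondor.Collector(collector_host)
    schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)
    return htcondor.Schedd(schedd_ad)
```

With that in place, `schedd = schedd_via_collector("htcondor-mini:9618")` followed by `schedd.submit(sub)` should behave like the working case above, assuming the collector is reachable under that name from inside the container.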

Thanks in advance for any advice,

-g