Hi Gavin,
Thanks for answering my questions. I had a chat with some of the other developers and tried to recreate your environment. I was able to achieve your desired setup with the default mini-condor configuration by setting up a token for remote submission. I did this via the following steps (note that I had two terminal windows open to do this):
Host-------: $ docker run -it --rm -p 9618:9618 htcondor/mini:latest bash
Container-: # condor_master
Container-: # useradd cabollig
Container-: # su cabollig
Container-: $ condor_token_fetch
random-literal-string.blah.blah.blah
Host--------: $ mkdir -p ~/.condor/tokens.d/
Host--------: $ echo "random-literal-string.blah.blah.blah" > ~/.condor/tokens.d/docker
Host--------: $ chmod 600 ~/.condor/tokens.d/docker
One catch is to make sure you don't add a mount to /tmp in the container. If you do so, condor_token_fetch will fail FS authentication, and you won't have a token for the user to use for remote communication.
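If it is easier to do the host-side steps from Python (for example from the job running service itself), here is a minimal sketch of the mkdir/echo/chmod steps above using only the standard library. The function name `install_token` and its parameters are made up for illustration and are not part of HTCondor; the sketch assumes the token string has already been obtained from condor_token_fetch in the container.

```python
import os
from pathlib import Path
from typing import Optional


def install_token(token: str, name: str = "docker",
                  tokens_dir: Optional[Path] = None) -> Path:
    """Write a token string to a private file and return the file's path.

    Hypothetical helper mirroring the host-side shell steps:
    mkdir -p ~/.condor/tokens.d/, echo TOKEN > FILE, chmod 600 FILE.
    """
    if tokens_dir is None:
        # Same directory as used in the shell steps.
        tokens_dir = Path.home() / ".condor" / "tokens.d"
    tokens_dir.mkdir(parents=True, exist_ok=True)

    token_path = tokens_dir / name
    # Create the file with mode 0600 so the token is not readable by
    # other users at any point.
    fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token.strip() + "\n")
    # The mode passed to os.open only applies to newly created files,
    # so tighten permissions explicitly in case the file already existed.
    os.chmod(token_path, 0o600)
    return token_path
```

Calling `install_token("random-literal-string.blah.blah.blah")` should leave the same file in place as the three Host commands above.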
Cheers,
Cole Bollig
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Gavin Price <gaprice@xxxxxxx>
Sent: Friday, September 12, 2025 4:49 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Submit job from host to htcondor/mini container using python bindings

Hi Todd and Cole,
Thanks for the responses. I'm going to consolidate here as there's some overlap.
> Just so I understand correctly, you are running a mini-condor in a container and trying to place jobs to the system from the host machine outside the container?
Right.
> Is there a reason that you need to place jobs from outside the container?
The eventual goal here is to add a feature for submitting to HTC to an existing Python-based job running service. As such, that service will need to be able to remotely submit to HTC, and this is the very first stumbling, baby-horse-like step towards that goal.
> Where does the 99-NO_AUTH.config file come from? I.e. Did you find it online or did someone share it with you?
A combination of myself reading docs and AI trying to figure out why I was getting what appeared to be authentication failures when trying to run jobs. I'm not surprised if it's whackadoo. I was just trying to shut off as much authentication as possible to get to a point where I could at least submit a job.
> Here is what I did to run your submit a sleep job test
The difference here is that I was running the python code on the host, outside the container, so a no-argument Schedd() won't find the service.
> I suggest you try again just without customizing the configuration, the defaults should be fine.
Here's what happens without the 99-NO_AUTH.config file either in the container or in `htcondor.param`:
```
$ docker run -d -p 9618:9618 htcondor/mini:24.11.2-el9
ce4829266267392b96c7d5bdf6069701ab09301bb81836902787b04033e6ef34

Then in ipython on the host:

In [1]: import htcondor
/home/crushingismybusiness/github/kbase/cdm-task-service/.venv/lib/python3.12/site-packages/htcondor/__init__.py:49: UserWarning: Neither the environment variable CONDOR_CONFIG, /etc/condor/, /usr/local/etc/, nor ~condor/ contain a condor_config source. Therefore, we are using a null condor_config.
  _warnings.warn(message)

In [2]: collector = htcondor.Collector("172.17.0.2:9618")

In [3]: schedd_ad = collector.locate(htcondor.DaemonTypes.Schedd)

In [4]: schedd_ad["MyAddress"]
Out[4]: '<172.17.0.2:9618?addrs=172.17.0.2-9618&alias=ce4829266267&noUDP&sock=schedd_18_eccb>'

In [5]: schedd = htcondor.Schedd(schedd_ad)

In [6]: sub = htcondor.Submit({
   ...:     "executable": "/bin/sleep",
   ...:     "arguments": "30",
   ...:     "output": "/tmp/sleep.out",
   ...:     "error": "/tmp/sleep.err",
   ...:     "log": "/tmp/sleep.log",
   ...: })

In [7]: cluster_id = schedd.submit(sub)
The remote host ce4829266267 presented an untrusted CA certificate with the following fingerprint:
SHA-256: e8:24:0b:bd:b1:4e:9b:9f:5d:f8:04:3f:47:19:61:1b:a9:15:0c:a6:ad:16:b9:71:25:63:82:f9:1a:a2:c7:29
Subject: /O=condor/CN=ce4829266267
Would you like to trust this server for current and future communications?
Please type 'yes' or 'no': yes
---------------------------------------------------------------------------
HTCondorIOError                           Traceback (most recent call last)
Cell In[7], line 1
----> 1 cluster_id = schedd.submit(sub)

File ~/github/kbase/cdm-task-service/.venv/lib/python3.12/site-packages/htcondor/_lock.py:70, in add_lock.<locals>.wrapper(*args, **kwargs)
     67 try:
     68     acquired = LOCK.acquire()
---> 70     rv = func(*args, **kwargs)
     72     # if the function returned a context manager,
     73     # create a LockedContext to manage the lock
     74     is_cm = is_context_manager(rv)

HTCondorIOError: Failed to connect to schedd.

In SchedLog I see:

09/12/25 21:15:15 (pid:38) TransferQueueManager stats: active up=0/100 down=0/100; waiting up=0 down=0; wait time up=0s down=0s
09/12/25 21:15:15 (pid:38) TransferQueueManager upload 1m I/O load: 0 bytes/s 0.000 disk load 0.000 net load
09/12/25 21:15:15 (pid:38) TransferQueueManager download 1m I/O load: 0 bytes/s 0.000 disk load 0.000 net load
09/12/25 21:19:42 (pid:38) DC_AUTHENTICATE: authentication of <172.17.0.1:43497> did not result in a valid mapped user name, which is required for this command (1112 QMGMT_WRITE_CMD), so aborting.
```
That's what I tried at first, which led to the 99-NO_AUTH file (and a bunch of other stuff I tried) to attempt to fix both the cert check and the DC_AUTHENTICATE issue.
I *think* that covers all of your questions, but please let me know if I missed anything or if what I'm trying to do is unclear. Or if the approach I'm taking is completely off base, for that matter.
Thanks very much for the help so far and your time, much appreciated.
Gavin
On Fri, Sep 12, 2025 at 12:34 PM Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote: