[HTCondor-users] condor_drain error



I have a test condor pool with two nodes (installed using `get_htcondor`):
* Node A (hostname `condor-master`; IP 192.168.9.7) has `use role:get_htcondor_central_manager` and `use role:get_htcondor_submit` roles
* Node B (hostname `condor-execute`; IP 192.168.9.161) has `use role:get_htcondor_execute`

I ran `systemctl status condor` on both nodes to confirm that the master has condor_collector, condor_negotiator, and condor_schedd running and that the execute node has condor_startd running.

Both nodes have `use security:recommended` in `/etc/condor/config.d/00-security`.

I have run `condor_status` and confirmed that the execute machine has joined the pool, and I have run `condor_submit` with a test script and confirmed that the job ran.

I have copied the `/etc/condor/passwords.d/POOL` file from the master node to the execute node and confirmed 0600 permissions for that file on both nodes.

I have run `condor_token_request` on the master node and approved the request on the execute node using `condor_token_request_approve` and stored the returned key in `/etc/condor/tokens.d/admin@condor` on the master node.

I then ran `condor_drain -debug 192.168.9.161` but got this error: "Can't find address for startd 192.168.9.161". I thought this could be related to using the IP address instead of a DNS name, so I added an entry to my `/etc/hosts` file and ran `condor_drain -debug condor-execute`, but I got this error: "ERROR: unknown host condor-execute".

Conceptually, I feel like I understand how IDTOKEN auth is supposed to work (and to some extent, I think it is working since the execute machine was able to join the pool), but I can't figure out why `condor_drain` won't work.

Thanks,

Curtis