I see that our documentation about flocking is confusing and the configuration details are out of date. I will need to work on improving those. In the meantime, I will give a better explanation here.
Flocking is a way for an Access Point (i.e. a condor_schedd) to find machines to run all of its jobs in HTCondor pools beyond its local one. It's configured by the administrator; the users don't have to do anything special. Most of your post describes
how a user can directly submit individual jobs to an Access Point in a remote pool, which is a different (and usually inferior) process.
For each access point that should flock to another pool, you need to do two things:
1) Tell the schedd where it should flock
2) Give the schedd permission to join the remote pool
In the following example, let's say you want the schedd at machine submit.org1.edu to flock to the pool whose Central Manager is cm.org2.edu.
For step 1, you set FLOCK_TO in the schedd's configuration to name the collector of the remote pool. For example:
FLOCK_TO = cm.org2.edu
For step 2, the easiest thing to do is create an IDToken at cm.org2.edu, give it to the flocking schedd, and add the IDToken's identity to the ADVERTISE_SCHEDD authorization list.
To create the IDToken, run this command:
condor_token_create -identity condor@xxxxxxxxxxxxxxx
Then, write the output of the command to a file in /etc/condor/tokens.d/ on the Access Point. This is a secret, so it should not be publicly readable (file should be owned by root with no group or world access permissions).
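As a rough sketch of that step, assuming a POSIX shell on the Access Point, something like the following would store the token so that only the file's owner can read it. The helper name install_token and the file name flock_org2 are made up for illustration; HTCondor does not require any particular file name here.

```shell
# Hypothetical helper (not part of HTCondor): read a token from stdin and
# store it in a file that only the owner can read or write.
# Usage: install_token <tokens_dir> <file_name> < token_text
install_token() {
    dir="$1"
    name="$2"
    (
        # Restrictive umask so the file is never group- or world-readable,
        # not even briefly while it is being written.
        umask 077
        mkdir -p "$dir" && cat > "$dir/$name"
    ) || return 1
    # Make the permissions explicit in case the file already existed.
    chmod 600 "$dir/$name"
}
```

Run as root on submit.org1.edu, with the token text saved in a temporary file, it could be used as: install_token /etc/condor/tokens.d flock_org2 < token.txt. Running it as root is what makes the resulting file root-owned; the helper itself does not change ownership.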
Finally, give the identity of the token permission to join the pool as an Access Point. Add the following line to the configuration files on
cm.org2.edu:
ALLOW_ADVERTISE_SCHEDD = $(ALLOW_ADVERTISE_SCHEDD) condor@xxxxxxxxxxxxxxx
Once everything is done, run condor_reconfig on both machines.
When the schedd at submit.org1.edu has jobs that can't be matched in its local pool (say, because the pool is full running other jobs), it will start advertising to the collector at cm.org2.edu and can start receiving matches for machines in that pool.
I hope that's enough information for you to get flocking working.
- Jaime