Hi Collin,
Thanks for the reply. FS_REMOTE_DIR does not seem to be recognized. Using CLAIMTOBE does allow one workstation to submit its job to the other workstation's schedd. I can now submit jobs from both workstations.
I am curious how condor_submit determines which schedd to send a job to. Or is that decided by the negotiator on the master? I think the preferable behavior (at least for us) would be for condor_submit to send the job to the schedd on the workstation the job was submitted from.
Thanks again!
Jim
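
P.S. For anyone reading the SchedLog excerpt quoted below: the "reason for authentication failure" string looks like a '|'-separated stack of SUBSYSTEM:code:message entries. Here is a minimal Python sketch that splits it apart. The layout is inferred from this one log line, not taken from the HTCondor documentation, and the names parse_auth_failure and AuthError are made up for illustration.

```python
from typing import List, NamedTuple


class AuthError(NamedTuple):
    """One entry of the error stack (hypothetical helper type)."""
    subsystem: str
    code: int
    message: str


def parse_auth_failure(reason: str) -> List[AuthError]:
    """Split a '|'-separated reason string into its entries.

    Assumes each entry looks like SUBSYSTEM:code:message, as in the
    SchedLog line from this thread.
    """
    entries = []
    for part in reason.split("|"):
        part = part.strip()
        if not part:
            continue
        # Split on the first two colons only; the message may contain more.
        subsystem, code, message = part.split(":", 2)
        entries.append(AuthError(subsystem, int(code), message.strip()))
    return entries


if __name__ == "__main__":
    # Reason string copied from the SchedLog excerpt in this thread.
    reason = (
        "AUTHENTICATE:1003:Failed to authenticate with any method|"
        "AUTHENTICATE:1004:Failed to authenticate using FS|"
        "FS:1004:Unable to lstat(/tmp/FS_Ox5tu50VK)"
    )
    for entry in parse_auth_failure(reason):
        print(f"{entry.subsystem} {entry.code}: {entry.message}")
```

The last entry is the most specific one: the schedd could not lstat a file under /tmp, which fits my reading that FS authentication depends on a file both sides can see.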
I do not know how condor_submit determines which schedd to
>If the FS issue is that /tmp isn't available to the Schedd you can change

On Wed, Jan 29, 2020 at 9:22 AM COULTER, JAMES A CTR USAF AFMC 96 SK/CCI
via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

> Hi,
>
> I have a requirement to configure HTCondor to submit jobs to Mac Sierra
> workstations. So far I have installed a condor master running Master,
> Collector, and Negotiator daemons on a RHEL 7 server. I have installed
> htcondor on two Mac Sierra workstations both running startd and schedd
> daemons. We created a condor user and home on our NFS file system that all
> machines can access.
>
> The problem I'm having is the workstations are both submitting jobs to the
> same schedd. Sometimes it's to workstation A, sometimes to workstation B.
> If workstation A submits its job to B's schedd (and vice versa) I get an
> authentication error.
>
> I have tried several different authentication methods, but I can't get any
> to work. If I leave SEC_DEFAULT_AUTHENTICATION = OPTIONAL, I get a
> Kerberos authentication failed error.
>
> Right now I am trying the pool password configuration I found in the FAQ:
> https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToEnablePoolPassword
> This setup results in FS authentication failing when workstation A submits
> its job to schedd on workstation B.
>
> Workstations A and B are using the same condor_config.local file.
> Here are the contents:
>
> CONDOR_HOST=master.example.com
> DAEMON_LIST=MASTER STARTD SCHEDD
> ALLOW_READ=*
> ALLOW_WRITE=*
> ALLOW_NEGOTIATOR = master.example.com
> CONDOR_IDS = 3055.8186
> CONDOR_ADMIN = root@$(FULL_HOSTNAME)
> ALL_DEBUG = D_FULLDEBUG
>
> #
> # From https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToEnablePoolPassword
> #
> SEC_PASSWORD_FILE = /etc/condor/condor_pool_password
> (NOTE: this file was created on the master and copied to both clients, owner root, mode 0600)
> SEC_DAEMON_INTEGRITY = REQUIRED
> SEC_DAEMON_AUTHENTICATION = REQUIRED
> SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
> SEC_NEGOTIATOR_INTEGRITY = REQUIRED
> SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED
> SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD
> SEC_CLIENT_AUTHENTICATION_METHODS = FS,PASSWORD
> ALLOW_DAEMON = condor_pool@*
>
> -----------------------------------------------------------------------------------
>
> Here are the errors found in SchedLog after workstation A tries to submit a
> job to schedd on workstation B:
>
> 01/29/20 10:28:56 (pid:14019) DC_AUTHENTICATE: authentication of <xxx.xxx.xxx.117:58966> did not result in a valid mapped user name, which is required for this command (1112 QMGMT_WRITE_CMD), so aborting.
> 01/29/20 10:28:56 (pid:14019) DC_AUTHENTICATE: reason for authentication failure: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to lstat(/tmp/FS_Ox5tu50VK)
>
> ---------------------------------------------------------------------------------
>
> This is what failure on a client looks like:
>
> coulter@albatross ~/condor>/opt/condor/bin/condor_submit -debug sleep.sub
> 01/29/20 10:28:59 Reading condor configuration from '/etc/condor/condor_config'
> 01/29/20 10:28:59 Enumerating interfaces: lo0 127.0.0.1 up
> 01/29/20 10:28:59 Enumerating interfaces: lo0 ::1 up
> 01/29/20 10:28:59 Enumerating interfaces: lo0 fe80::1 up
> 01/29/20 10:28:59 Enumerating interfaces: en0 xxx.xxx.xxx.117 up
> Submitting job(s)01/29/20 10:28:59 SharedPortClient: sent connection request to schedd at <xxx.xxx.xxx.235:9618> for shared port id 1964_6748_6
> 01/29/20 10:28:59 SECMAN: required authentication with schedd at <xxx.xxx.xxx.235:9618> failed, so aborting command QMGMT_WRITE_CMD.
>
> ERROR: Failed to connect to local queue manager
> AUTHENTICATE:1003:Failed to authenticate with any method
> AUTHENTICATE:1004:Failed to authenticate using FS
>
> ---------------------------------------------------------------------------------
>
> The way I read this is FS authentication is attempting to read a file on
> the schedd's local file system but because it isn't the submitter's local
> file system it fails. I don't see anything at all about Password
> authentication. I tried setting SEC_CLIENT_AUTHENTICATION_METHODS =
> PASSWORD but that results in AUTHENTICATE:1003:Failed to authenticate
> with any method.
>
> Any suggestions on what I can do? My customer has a grand total of 20 Mac
> Sierra workstations they want in the pool and we are on a dedicated network
> so security is not as high on the priority list as getting this working.
>
> Thanks,
>
> Jim

--
*Collin Mehring* | PE-JoSE - Software Engineer