Hello,

I am trying to run a Pegasus workflow for an experiment. To run the workflow, I tried to create a multi-machine HTCondor pool by following the instructions in the documentation. When I work through the commands on that page and reach the point where I run condor_status on the submit node, I get the following error:

Error: communication error
SECMAN:2007:Failed to end classad message.

I am very new to HTCondor, so any advice to help me get my multi-machine pool running would be greatly appreciated.

I am creating this multi-machine pool on CloudLab. Each node is an m510 machine running Ubuntu 22.04.2 LTS. The machines are all connected to the same network, and each node has a hostname of the form node{num}. I made node0 the central manager, node1 the submit node, and node2/node3 the execute nodes. The commands I ran to create the multi-machine pool were:

$ curl -fsSL https://get.htcondor.org | sudo GET_HTCONDOR_PASSWORD="$htcondor_password" /bin/bash -s -- --no-dry-run --central-manager node0
$ curl -fsSL https://get.htcondor.org | sudo GET_HTCONDOR_PASSWORD="$htcondor_password" /bin/bash -s -- --no-dry-run --submit node0
$ curl -fsSL https://get.htcondor.org | sudo GET_HTCONDOR_PASSWORD="$htcondor_password" /bin/bash -s -- --no-dry-run --execute node0

Thanks,
Vijay