One other question -
Is it valid to use
Requirements = (machine == "machine I am flocking to")
in my submit file to test flocking? That may be my problem. If not, is there some way to force a job to flock for the test?
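
A minimal sketch of the kind of test submit file in question. The host name, executable, and file names are placeholders rather than values from the actual setup, and it assumes FLOCK_TO on the submit machine already names the remote pool's central manager:

```
# Hypothetical flocking test; every value here is a placeholder
universe     = vanilla
executable   = /bin/hostname
output       = flock_test.out
error        = flock_test.err
log          = flock_test.log
# Match only one machine in the remote pool so the job cannot run locally
requirements = (Machine == "node01.remote-pool.example.edu")
queue
```

Machine is compared against the fully qualified host name the remote startd advertises, so a short name or an IP address would leave the job idle.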
Thanks for any help -
Don
FSU HPC
From: condor-users-bounces@xxxxxxxxxxx [condor-users-bounces@xxxxxxxxxxx] on behalf of Shrum, Donald C [DCShrum@xxxxxxxxxxxxx]
Sent: Saturday, June 09, 2012 5:16 PM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] flocking / CCB

I'm trying to get a test job to flock between FSU and USF here in Florida.
Our cluster is on a private network and only the central manager has a public IP, so I added the following to condor_config on the central manager -
PRIVATE_NETWORK_NAME = fsu-hpc-condor-private
PRIVATE_NETWORK_INTERFACE = 10.178.6.5
I added CCB_ADDRESS and the same PRIVATE_NETWORK_NAME to the processing nodes' condor_config.
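
A sketch of what those processing-node additions might look like. The CCB_ADDRESS value is an assumption, since the message does not give it; pointing it at the collector is the commonly documented form:

```
# Assumed processing-node condor_config additions (CCB_ADDRESS value not from the original message)
CCB_ADDRESS = $(COLLECTOR_HOST)
PRIVATE_NETWORK_NAME = fsu-hpc-condor-private
```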
As far as I can tell, the CCB server runs as part of the collector, so I don't need to enable it explicitly.
I must be missing something simple in the setup. I see errors that read -
06/09/12 16:39:05 CCBListener: failed to receive message from CCB server 10.178.6.5
I ran condor_reconfig on the processing nodes. Do I need to restart condor on all the nodes as a result of the change? The error message makes me think not.
Any pointers to debug this would be appreciated.
Thanks for the help.
Don
FSU HPC