I have an interesting problem. I believe I detailed the setup of my
organization's cluster/network in an earlier post, but I will repeat it
here:
Nine compute machines (running STARTD and SCHEDD) sit on a private subnet
and, by design, cannot be reached from the rest of my organization. They
are connected to a switch that is also connected to the primary and
secondary central managers (CMs). Each CM has two NICs: one with an IP
on the private subnet and one with an IP on my organization's public
network.
Problem: I tried submitting a job from my workstation (on my
organization's public network), but the CMs tell it to talk to the
STARTD at a private address, which the workstation cannot reach. I told
my boss about this and asked how he wanted to proceed, and he wants to
allow job submissions only from the CMs. This works, BUT he also wants
failover capability. I don't foresee this working well: the submit
machine is the CM itself, so it goes down when the CM goes down, even
though CM functionality fails over.
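To make the failure concrete, here is a rough sketch of the reachability
problem. The subnets and addresses are made up for illustration (they are
not our real ones), and it assumes there is no routing between the two
networks, so a host can only reach addresses on networks it has an
interface on:

```python
import ipaddress

# Hypothetical address ranges, for illustration only.
PRIVATE_SUBNET = ipaddress.ip_network("192.168.10.0/24")
PUBLIC_NETWORK = ipaddress.ip_network("10.20.0.0/16")
KNOWN_NETWORKS = (PRIVATE_SUBNET, PUBLIC_NETWORK)


def networks_of(addresses):
    """Return the known networks that a host's interface addresses belong to."""
    nets = set()
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        for net in KNOWN_NETWORKS:
            if ip in net:
                nets.add(net)
    return nets


def can_reach(source_addresses, target_address):
    """True if the source has an interface on the target's network.

    Assumes no routing between the private and public networks.
    """
    target = ipaddress.ip_address(target_address)
    return any(target in net for net in networks_of(source_addresses))


# Hypothetical hosts: one NIC for the workstation, two for a CM.
workstation = ["10.20.5.17"]
central_manager = ["10.20.0.2", "192.168.10.1"]
startd = "192.168.10.21"  # address the CM hands back to the submitter

print(can_reach(workstation, startd))      # False
print(can_reach(central_manager, startd))  # True
```

Only the dual-homed CMs can talk to the STARTD address, which is why
submission works from the CMs but not from my workstation.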
What kind of options do I have? My boss is adamant that the nodes stay
on a private subnet and that we have CM failover capability. I don't
think a separate submit machine that straddles the private and public
networks (like the current CMs) will work, because my boss wants no
single point of failure in the system.