If I understand the Condor manual correctly, high availability for
submit machines requires that there is only one submission point. If I
have 4 submit machines, any of which can be submitting jobs via
different users, is it possible to set up HA such that if any one (or
more than one) of these SCHEDDs goes down, one of the other SCHEDDs can
pick up the jobs?
The config macro settings do not seem to lend themselves to this, so
I am wondering if anyone can clarify whether HA for
SCHEDDs can support multiple submission points. I believe it would be a
limitation for us to have only one submit machine, because we are often
submitting a thousand or so jobs, and heap or memory usage on that one
machine could be a limiting factor.
Thanks for the help,
mike