Hi all,
This is more of a general than a technical question. We are
investigating and planning the deployment of Condor to about 8,000 to
10,000 machines in a mixed Linux/Windows environment.
The initial rollout will be to a number of Linux clusters to test the
technology, but eventually the rollout will extend across several
departments. I'm interested in how people have scaled up to handle
clusters in this range. My experience is limited to a single submit
host and 40 machines, so this is a bit of a jump.
In terms of architecture, what should we be thinking about right now?
Would people suggest a number of flocked Condor pools, each with a
submit host? Would a single submit host fall over trying to handle
the needs of a potentially large number of on-site users?
We're also interested in access control and authentication setups on
such a large scale. (Current authentication is via AD, which the Linux
clusters also authenticate against.) It would be nice for certain
departments to have priority on their own hardware, or to be able to
limit users to certain pools.
I really would be grateful for any pointers to documentation, contact
with administrators of other large Condor deployments, or just general
thoughts on setup strategy. I know this is a wide open question, and
all feedback would be gratefully received.
regards,
Dan
--
Dan Swan || dan.swan[at]gmail.com || http://scot1and.net/~dan
"Reality is that which, when you stop believing in it, doesn't go away."
(Philip K. Dick - How to Build a Universe)