On Wed, Apr 21, 2010 at 11:50 AM, Santanu Das
<santanu@xxxxxxxxxxxxxxxxx> wrote:
How powerful should a machine be (CPU cores, memory, etc.) to run a
Condor central manager for a ~400-core cluster? I'm interested in
virtualization, so I wanna get a rough idea of the hardware we
should go for to accommodate all the services/virtual machines and
have them run nicely.
Is this machine running condor_collector and condor_negotiator? Or is
it also running a condor_schedd daemon? If it's just condor_collector
and condor_negotiator: not very powerful at all, really. I kept a
500-slot farm alive and well with an old 2-CPU (single-core) Xeon box.
It had 4GB of RAM IIRC and ran 32-bit CentOS 4. It did have gigabit
fiber Ethernet. It kept up just fine. The only reason it got
swapped out for new hardware was that it was well out of
warranty, which apparently makes IT department heads nervous.
Obviously this older hardware wouldn't work well with virtualization;
the CPUs aren't virtualization-friendly. But it does show you don't
need a lot of juice if you're just running condor_collector and
condor_negotiator.
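For reference, a central manager that runs only the collector and
negotiator is typically set up through the DAEMON_LIST setting. This is
a minimal sketch, not the configuration from the farm described above;
putting it in a local config file is an assumption about your layout.

```
# Local Condor config on the central manager (illustrative sketch).
# Run only the master, collector, and negotiator here: no schedd and
# no startd, so the box neither queues jobs nor runs them.
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR
```

Running condor_config_val DAEMON_LIST on the machine shows which value
is actually in effect.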
Worth noting that my farm configuration was incredibly sensitive to
negotiation cycle times back then. I was seeing negotiation cycles of
under 2 minutes with one schedd in the system holding 40k jobs (and a
fairly heterogeneous job distribution, so not a lot of re-clustering
was happening in the negotiator).
Also, is there any recommendation on a virtualization platform as far
as Condor is concerned? Is VMware ESXi a good choice?
I currently have a few farms with central managers virtualized on Xen.
The VMs are all running CentOS 5.somethingorother. It's not as stable
as I was hoping for. If there's any appreciable amount of disk latency
between the VM manager program and the disk where the image lives,
Xen crashes and takes the images down with it. They restart, but it's
annoying. We were hosting the VM images on our NAS to allow easy,
nearly instant migration of VMs from machine to machine in case of
failure, but we get big latency spikes on our NAS thanks to load from
our farm, and the spikes would take out Xen. Had to move 'em to local
disk for now, which is much less useful.
Hope that helps.
- Ian