I have a cluster of 32 nodes connected by gigabit Ethernet. Each node has one
8-core i7 CPU and two Tesla C2050 GPU cards.
My jobs are of 3 types:
a. IO intensive
b. CPU intensive
c. GPU intensive
Most of them have hundreds of input files and can be done in one day, except
that GPU jobs may take two or three days.
I think that if I can combine these jobs according to their types so that
complementary jobs execute on the same machine, it would be more efficient.
For example, if a job is CPU intensive (only one thread), it would be better
to dispatch it to a machine that is running mostly IO-intensive and
GPU-intensive jobs.
I can assign 3 float values to each job to represent its IO, CPU and GPU
intensity, and use Condor to keep the load balanced across the cluster nodes.
Is this idea meaningful? If so, can Condor do this?
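
To make the idea concrete, here is a minimal sketch in Python of the kind of
balancing I have in mind. It is plain Python, not Condor configuration, and
everything in it is an assumption for illustration: the names Node,
peak_after and place, the node names, and the intensity values (floats
between 0 and 1) are all hypothetical. Each new job goes to the node whose
busiest resource would stay lowest after accepting it.

```python
from dataclasses import dataclass, field

# The three per-job intensity values described above.
RESOURCES = ("io", "cpu", "gpu")


@dataclass
class Node:
    """Hypothetical record of the accumulated load on one cluster node."""
    name: str
    load: dict = field(default_factory=lambda: {r: 0.0 for r in RESOURCES})
    jobs: list = field(default_factory=list)


def peak_after(node, job):
    """Highest per-resource load the node would carry after taking the job."""
    return max(node.load[r] + job[r] for r in RESOURCES)


def place(job_name, job, nodes):
    """Greedy placement: pick the node whose busiest resource stays lowest.

    Ties are broken by list order, because min() returns the first minimum.
    """
    best = min(nodes, key=lambda n: peak_after(n, job))
    for r in RESOURCES:
        best.load[r] += job[r]
    best.jobs.append(job_name)
    return best


# Made-up intensities, purely for illustration.
nodes = [Node("node01"), Node("node02")]
first = place("io_job", {"io": 0.8, "cpu": 0.1, "gpu": 0.0}, nodes)
second = place("cpu_job_1", {"io": 0.1, "cpu": 0.8, "gpu": 0.0}, nodes)
third = place("cpu_job_2", {"io": 0.1, "cpu": 0.8, "gpu": 0.0}, nodes)

for n in nodes:
    print(n.name, n.jobs)
```

In this example the second CPU-intensive job lands next to the IO-intensive
job rather than next to the first CPU-intensive job, which is the pairing I
described above. A real scheduler would also have to subtract the load again
when a job finishes, which this sketch does not do.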
Thanks!
Kyle Qian