[Condor-users] gang scheduling multiple CPUs on an SMP machine
- Date: Tue, 11 Jan 2005 07:50:40 -0500 (EST)
- From: Hahn Kim <hgk@xxxxxxxxx>
- Subject: [Condor-users] gang scheduling multiple CPUs on an SMP machine
Hello,
We have a user running a multi-threaded MPI application, i.e. each rank
itself is multi-threaded. Our cluster consists of dual-Xeon SMP machines
and we set NUM_CPUS to 2 in Condor.
The problem is that the MPI application uses an Intel math library that
only allows a single process to use the library in a multi-threaded
manner. However, Condor often allocates two processors on the same
machine to two ranks. When threads from both ranks attempt to access the
library, the application fails.
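The failure condition described above can be checked from a rank-to-host mapping. The following is a minimal Python sketch, not part of Condor; the function name `colocated_ranks` and the host names are hypothetical, and the mapping itself is an assumed input whose source is site-specific.

```python
from collections import defaultdict

def colocated_ranks(rank_hosts):
    """Return {host: [ranks]} for every host that runs more than one rank.

    rank_hosts maps an MPI rank number to the hostname it was placed on.
    This mapping is an assumed input, not something Condor provides here.
    """
    by_host = defaultdict(list)
    for rank, host in sorted(rank_hosts.items()):
        by_host[host].append(rank)
    # Only hosts with two or more ranks can trigger the library conflict.
    return {host: ranks for host, ranks in by_host.items() if len(ranks) > 1}

# Hypothetical placement: ranks 0 and 1 share one dual-CPU machine.
placement = {0: "node01", 1: "node01", 2: "node02"}
print(colocated_ranks(placement))
```

An empty result would mean that every rank has a machine to itself, which is the placement we are after.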
I found several references to "gang-matching" being a potential feature
that could be added to Condor. For example, "Condor on Dedicated
Clusters" from Condor Week 2000 contains a slide titled "Future
Directions: Parallel Scheduling" that states the following:
----------------------------
> Co-scheduling of multiple hosts
* ...
* Other jobs might require co-scheduling
* A multi-threaded application might want to claim multiple CPUs on a
single SMP machine
* Requires "gang-matching"
----------------------------
A "gang-matching" capability as described in this slide would solve our
problem by allocating both processors on a machine to a single rank.
However, I cannot find any mention of it in the Condor documentation. Has
gang-matching been added to Condor or is there a way to obtain similar
behavior? Thanks.
Hahn