[Condor-users] Getting closer with Parallel Universe on Dynamic slots
- Date: Fri, 25 Nov 2011 12:14:19 +0100
- From: Steffen Grunewald <Steffen.Grunewald@xxxxxxxxxx>
- Subject: [Condor-users] Getting closer with Parallel Universe on Dynamic slots
... but still no cigar.
The setup consists of five 4-core machines and several 2-core machines.
All of them have been configured as single, partitionable slots.
Preemption is forbidden completely.
The rank definitions are as follows:
RANK = 0
NEGOTIATOR_PRE_JOB_RANK = 1000000000 + 1000000000 * (TARGET.JobUniverse =?= 11) * (TotalCpus+TotalSlots) - 1000 * Memory
I'd expect this to favour big machines over small ones (for Parallel jobs),
and partially occupied ones over empty ones.
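
To make that expectation concrete, here is a minimal Python sketch of the arithmetic in the rank expression. The machine values (slot counts, memory sizes in MB) are hypothetical and not taken from the pool, and the sketch assumes that the `=?=` comparison contributes 1 or 0 to the product and that Memory shrinks as dynamic slots are carved off; neither assumption has been verified against how the negotiator actually evaluates the expression.

```python
PARALLEL_UNIVERSE = 11  # JobUniverse value of parallel-universe jobs


def pre_job_rank(job_universe, total_cpus, total_slots, memory_mb):
    """Mirror the arithmetic of the NEGOTIATOR_PRE_JOB_RANK expression.

    Assumption: the comparison against 11 is treated as 1 (true) or 0 (false).
    """
    is_parallel = 1 if job_universe == PARALLEL_UNIVERSE else 0
    return (1000000000
            + 1000000000 * is_parallel * (total_cpus + total_slots)
            - 1000 * memory_mb)


# Hypothetical machines, for illustration only.
empty_big = pre_job_rank(11, total_cpus=4, total_slots=1, memory_mb=8000)
partial_big = pre_job_rank(11, total_cpus=4, total_slots=1, memory_mb=7500)
empty_small = pre_job_rank(11, total_cpus=2, total_slots=1, memory_mb=4000)

print(empty_big, partial_big, empty_small)
```

Under these assumptions a partially occupied big machine ranks above an empty big one, which in turn ranks above a small one, matching the intent described above.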
What I see with the following submit file is quite different:
universe = parallel
initialdir = /home/steffeng/tests/mpi/
executable = /home/steffeng/tests/mpi/mpitest
arguments = $(Process) $(NODE)
output = out.$(NODE)
error = err.$(NODE)
log = log
notification = Never
on_exit_remove = (ExitBySignal == False) || ((ExitBySignal == True) && (ExitSignal != 11))
should_transfer_files = yes
when_to_transfer_output = on_exit
Requirements = ( TotalCpus == 4 )
request_memory = 500
machine_count = 10
(mpitest is the ubiquitous "MPI hello world" program trying to get rank and
size from MPI_COMM_WORLD)
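
As a side note on the on_exit_remove line, its boolean logic can be restated in a few lines of Python. This is only an illustrative sketch: the function name and arguments are hypothetical stand-ins for the ExitBySignal and ExitSignal job attributes, and signal 11 is SIGSEGV on Linux.

```python
def on_exit_remove(exit_by_signal, exit_signal=None):
    """Return True if the job should leave the queue when it exits.

    Same logic as the submit-file expression: remove the job unless
    it was killed by signal 11.
    """
    return (not exit_by_signal) or (exit_by_signal and exit_signal != 11)


print(on_exit_remove(False))     # normal exit
print(on_exit_remove(True, 11))  # killed by signal 11
print(on_exit_remove(True, 9))   # killed by another signal
```

In other words, only a job that dies from signal 11 stays in the queue.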
- if I leave the Requirements out, the 10 MPI nodes end up on the 5 big
machines (one per machine) plus 5 small ones
- with the Requirements set as above, each of the big machines runs
exactly two nodes, instead of the 4+4+2+0+0 distribution I expected
- not all out.* and err.* files get written (the pattern looks semi-random)
- all of them identify as "rank 0" of "size 1"
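
The gap between the expected and the observed placement can be illustrated with a small Python sketch. Both helper functions are hypothetical and only model the two distributions; they are not a description of how the Condor negotiator actually assigns dynamic slots.

```python
def fill_depth_first(nodes, machines, cores_per_machine):
    """Fill each machine completely before moving on to the next one."""
    placement = []
    for _ in range(machines):
        take = min(cores_per_machine, nodes)
        placement.append(take)
        nodes -= take
    return placement


def spread_evenly(nodes, machines):
    """Distribute the nodes as evenly as possible across all machines."""
    base, extra = divmod(nodes, machines)
    return [base + (1 if i < extra else 0) for i in range(machines)]


expected = fill_depth_first(10, machines=5, cores_per_machine=4)
observed = spread_evenly(10, machines=5)
print(expected)  # the packing the rank expression was meant to produce
print(observed)  # the packing seen with Requirements = ( TotalCpus == 4 )
```

The first result corresponds to 4+4+2+0+0, the second to two nodes on each of the five big machines.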
The Condor version is 7.6.0 (which should include the fixes for ticket 986
that went into 7.5.6).
How can I debug this?
Cheers,
Steffen
--
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * --------------------------------- * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7274,fax:7298}