[Condor-users] MPICH error using parallel universe
- Date: Thu, 7 Feb 2008 09:03:34 -0500
- From: Andrew Howard <ahoward@xxxxxxxxxx>
- Subject: [Condor-users] MPICH error using parallel universe
Hi everyone,
I'm working on getting MPI jobs running using the parallel universe,
but I seem to have hit a roadblock. Job execution is fine until it
reaches the mpirun command:
mpirun -machinefile machines -nolocal -v -np $_CONDOR_NPROCS $EXECUTABLE $@
At this point the job exits, and the output file contains:
running /var/condor/execute/dir_22980/cpi on 1 LINUX ch_p4 processors
Could not find enough machines for architecture LINUX
This appears to be an MPICH error, but I can't figure out why it's
happening. I've been able to execute mpirun on each of the nodes
directly without a problem. Any suggestions on what to try next?
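One way to narrow this down is to print, just before the mpirun line, how many hosts the machine file really offers. The sketch below is only a diagnostic under stated assumptions: that "machines" holds one host per line, optionally with a ":n" CPU suffix; that -nolocal makes mpirun skip the host it is started on; and that entries are spelled the same way `hostname` reports the local name (a short name versus a fully qualified one would be miscounted). The helper name count_usable_hosts is hypothetical, not part of Condor or MPICH.

```shell
#!/bin/sh
# Diagnostic sketch (hypothetical helper, not part of Condor or MPICH).
# Assumption: the machine file lists one host per line, optionally "host:n".

# count_usable_hosts FILE LOCALHOST
# Prints how many entries in FILE name a host other than LOCALHOST,
# i.e. the hosts that would remain if the local machine is excluded.
count_usable_hosts() {
    awk -v me="$2" '
        NF == 0      { next }               # skip blank lines
        $1 ~ /^#/    { next }               # skip comment lines
        {
            split($1, f, ":")               # drop an optional ":n" suffix
            if (f[1] != me) n++
        }
        END { print n + 0 }
    ' "$1"
}

machinefile=${1:-machines}
if [ -r "$machinefile" ]; then
    usable=$(count_usable_hosts "$machinefile" "$(hostname)")
    echo "requested processes: ${_CONDOR_NPROCS:-unset}"
    echo "hosts other than $(hostname) in $machinefile: $usable"
else
    echo "machine file '$machinefile' is not readable from $(pwd)"
fi
```

If the second number comes out smaller than the requested process count, the machine file contents (or the -nolocal flag) would be the first thing to look at; if the file is not readable at all, the working directory of the job is the likelier suspect.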
---
Andrew Howard
System Administrator
Rosen Center for Advanced Computing
Purdue University
ahoward@xxxxxxxxxxxxxxx