[Condor-users] MPI problem
- Date: Fri, 10 Feb 2006 17:53:07 -0500
- From: Bruno Goncalves <bgoncalves@xxxxxxxxx>
- Subject: [Condor-users] MPI problem
Hi all,
I'm trying to get LAM/MPI working under Condor, but I'm running into some issues. The program is a simple MPI hello world. It runs fine directly under LAM:
[bgoncalves@underdark temp]$ lamboot hostfile.txt
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
[bgoncalves@underdark temp]$ mpirun -np 5 ./hello.x
Hello World! I am 0 of 5
Hello World! I am 2 of 5
Hello World! I am 4 of 5
Hello World! I am 1 of 5
Hello World! I am 3 of 5
[bgoncalves@underdark temp]$ lamhalt
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
[bgoncalves@underdark temp]$
but when I submit it to Condor using:
universe = parallel
executable = lamscript
arguments = /home/bgoncalves/progs/temp/hello.x
Output = paralle.out.$(CLUSTER).$(NODE)
machine_count=5
queue 1
all I get in the "Output" files is:
error 0 chirp putting identity keys back
and the Condor email says:
Here are the machines that ran your MPI job.
They are listed in the order they were started
in, which is the same as MPI_Comm_rank.
Machine Name                  Result
----------------------------  -------------------------------
pumpkin110.physics.emory.edu  exited normally with status 255
pumpkin108.physics.emory.edu  was removed by the user
pumpkin207.physics.emory.edu  was removed by the user
pumpkin109.physics.emory.edu  exited normally with status 255
pumpkin205.physics.emory.edu  was removed by the user
Have a nice day.
What am I doing wrong?
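For reference, a sketch of the same submit file extended to also capture stderr and the job log, which may show why two of the nodes exit with status 255. The `Error` and `Log` file names here are hypothetical placeholders, not files from the run above; everything else is unchanged from the submit file shown earlier:

```
universe = parallel
executable = lamscript
arguments = /home/bgoncalves/progs/temp/hello.x
Output = paralle.out.$(CLUSTER).$(NODE)
# Hypothetical name: one stderr file per node
Error = paralle.err.$(CLUSTER).$(NODE)
# Hypothetical name: a single job event log for the whole cluster
Log = paralle.log.$(CLUSTER)
machine_count=5
queue 1
```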
Thanks!
Bruno
--
*******************************************
Bruno Miguel Tavares Goncalves, MS
PhD Candidate
Emory University
Department of Physics
Office No. N117-C
400 Dowman Drive
Atlanta, Georgia 30322
Homepage: www.bgoncalves.com
Email: bgoncalves@xxxxxxxxx
Phone: (404) 712-2441
Fax: (404) 727-0873
*******************************************