Dear all, and especially Dr. Mark Calleja, author of mp2script,

I am trying to use mp2script to run MPI jobs on a Condor pool. I have two
dedicated quad-core machines, mpi0 and mpi1; mpi0 is also my scheduler.
I do not use any shared folder or the like between them. I can run parallel
jobs, but my simple MPI jobs do not run. I am using Mark Calleja's mp2script,
and I am not sure whether I am using it incorrectly or whether there is an
error in the script itself.
In particular, the end of my error file shows:
___________________________________________________
+ hostname=mpi0
+ pwd
+ currentDir=/home/condor/execute/dir_6717
+ whoami
+ user=condor
+ echo hellow.exe mpi0 4446 condor /home/condor/execute/dir_6717
+ /usr/local/condor/libexec/condor_chirp put -mode cwa - /home/condor/spool/cluster41.proc0.subproc0/contact
+ [ 0 -ne 0 ]
+ [ hellow.exe -eq 0 ]
[: 1: hellow.exe: bad number
+ EXECUTABLE=hellow.exe
+ shift
+ chmod +x hellow.exe
+ MPDIR=/usr/local/mpich2
+ PATH=/usr/local/mpich2/bin:.:/usr/local/condor/bin:/sbin:/bin:/usr/sbin:/usr/bin
+ export PATH
+ export SCRATCH_LOC=loclocloc
/home/condor/execute/dir_6717/condor_exec.exe: 39: cannot create ~/loclocloc: Directory nonexistent
+ echo /home/condor/execute/dir_6717
+ trap finalize TERM
+ [ hellow.exe -ne 0 ]
[: 1: hellow.exe: bad number
+ [ hellow.exe -eq 0 ]
[: 1: hellow.exe: bad number
+ exit 0
___________________________________________________
I do not know what loclocloc is, and I am also confused about the meaning of
"[: 1: hellow.exe: bad number".

I have attached all of the related files, including my C program and the
corresponding parts of my log files.

Regards,
Arash

P.S. You can find more detail in another thread on this mailing list, titled:
“mpich2 error "'.../condor_exec.exe' with arguments hellow.exe: No such file
or directory"”
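
To narrow down the "bad number" question: as far as I can tell, the shell prints this kind of message when a numeric operator such as -eq or -ne is given a string that is not an integer, which is what "[ hellow.exe -eq 0 ]" in the trace does. The sketch below is only my own illustration and is not taken from mp2script; the helper names is_integer and classify are hypothetical. It shows one way a script could check that a value is a non-negative integer before comparing it numerically. What I still cannot tell is why the script ends up comparing the executable name with 0 in the first place.

```sh
#!/bin/sh
# Hypothetical helpers for illustration only; they are NOT part of mp2script.

# is_integer ARG: succeeds only if ARG is a non-empty string of digits,
# i.e. something that is safe to pass to -eq / -ne.
is_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
        *)           return 0 ;;
    esac
}

# classify ARG: reports how ARG relates to 0 without ever handing a
# non-numeric string to the numeric test operators.
classify() {
    if is_integer "$1"; then
        if [ "$1" -eq 0 ]; then
            echo "zero"
        else
            echo "nonzero"
        fi
    else
        echo "not-a-number"
    fi
}

classify 0            # numeric argument equal to 0
classify 4            # numeric argument different from 0
classify hellow.exe   # the value that appears in my trace
```

With a guard like this, a value such as hellow.exe is reported as non-numeric instead of being passed to -eq.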
Attachment:
simple_mpi.rar
Description: Binary data