Hi,

I have MPICH2 and Condor 6.8 (set up to run parallel jobs) on
Linux. Suppose we want to submit an MPI job to Condor. In the job specification, is it enough to
specify the MPI executable (Option 1)? Or do we need to create a shell script wrapper that calls
the MPI executable with the mpiexec command, and specify that shell script as
the executable (Option 2)?

Option 1
*******
Universe = parallel
executable = cpi
output = cpi$(NODE).out
error = cpi$(NODE).error
Log = cpi.log
machine_count = 4
should_transfer_files = yes
when_to_transfer_output = on_exit
queue

Option 2
*******
Universe = parallel
executable = jobfile.sh
output = cpi$(NODE).out
error = cpi$(NODE).error
Log = cpi.log
machine_count = 4
should_transfer_files = yes
when_to_transfer_output = on_exit
queue

jobfile.sh
********
#!/bin/sh
mpiexec -np 2 cpi

When I ran Option 1, the job ran on only a couple of nodes
and then became idle, with this error in the log file: “UserPolicy Error: No
signal/exit codes in job ad!”

When I ran Option 2, the job failed, complaining that the
mpd.conf file is not available, even though this file is in the path. I also tried
attaching it to the job, but nothing worked.

Could you please let me know how to submit parallel jobs that use MPICH2 to
Condor?

Thanks,
Senthil