Hi Mark,
Thanks for letting me know about the fix, I appreciate it.
I added those three lines and updated the mpd.conf file
(MPD_PORT_RANGE=50001:59999). Please find the modified mpd.py attached. It
looks like I didn't add them properly, as I am getting this error:
mpdboot_machine2 (handle_mpd_output 388): from mpd on machine1, invalid port info:
mpdboot error : 255
Could you please let me know whether mpd.py has the correct fix for the port range?
Thanks,
Senthil
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mark Calleja
Sent: Tuesday, April 03, 2007 3:21 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] MPICH2 wrapper script (mpich2script) for
parallel universe
Hi Senthil,
The fix for this was provided by Ralph Butler at MTSU. It involves
editing mpd.py and adding three lines, so a diff between the new and
original files gives (using v1.0.5p3 of MPICH2):
141d140
< 'MPD_PORT_RANGE' : 0,
150,151d148
< if self.parmdb['MPD_PORT_RANGE']:
<     os.environ['MPICH_PORT_RANGE'] = self.parmdb['MPD_PORT_RANGE']
After making this change, you will want to add the following in your
~/.mpd.conf file on all hosts:
MPD_PORT_RANGE=50001:59999
This works in my tests.
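
For reference, the effect of the three added lines can be sketched outside mpd.py as follows. This is only an illustration: `parmdb` here is a plain dict standing in for mpd's parameter database, and `apply_port_range` is a hypothetical helper name that does not exist in MPICH2.

```python
import os

def apply_port_range(parmdb, environ=None):
    """Copy MPD_PORT_RANGE from the parameter dict into MPICH_PORT_RANGE.

    Mirrors the 'if' block added to mpd.py. 'parmdb' is a plain dict here,
    which is an assumption of this sketch, not mpd's real data structure.
    """
    if environ is None:
        environ = os.environ
    if parmdb['MPD_PORT_RANGE']:
        environ['MPICH_PORT_RANGE'] = parmdb['MPD_PORT_RANGE']
    return environ

# Default added by the first line of the patch: 0 means "no range configured".
defaults = {'MPD_PORT_RANGE': 0}

# Value as it would be read from ~/.mpd.conf.
configured = dict(defaults, MPD_PORT_RANGE='50001:59999')

# Plain dicts are passed in so the sketch does not touch the real environment.
env_unset = apply_port_range(defaults, environ={})
env_set = apply_port_range(configured, environ={})
```

With the default of 0 nothing is exported, so the environment variable is only set on hosts whose .mpd.conf actually contains the MPD_PORT_RANGE line.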
Regards,
Mark
Natarajan, Senthil wrote:
Hi Mark,
Thanks for your mp2script.
I was wondering whether you know how to set the port range for the mpds
started on other machines.
In your script I added this:
export MPICH_PORT_RANGE=50001:59999
so the local mpd starts in the specified port range, but the mpds
started through mpdboot on remote machines use random ports. How can I
start the remote mpds in the specified port range as well? Because of the
random port numbers, the firewall blocks the connection.
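
When comparing the ports mpd actually uses against what the firewall allows, a small check along these lines may help. It is a sketch that only parses the "low:high" format shown above and does not query mpd itself; `parse_port_range` and `in_range` are hypothetical helper names.

```python
def parse_port_range(spec):
    """Parse a 'low:high' string (the format used above) into two ints."""
    low_text, high_text = spec.split(':')
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError('invalid port range: %r' % spec)
    return low, high

def in_range(port, spec):
    """Return True if 'port' lies inside the inclusive range 'spec'."""
    low, high = parse_port_range(spec)
    return low <= port <= high
```

For example, `in_range(port, '50001:59999')` can be applied to each listening port reported on a remote host to see which connections the firewall would block.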
Thanks,
Senthil
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mark Calleja
Sent: Wednesday, March 28, 2007 4:49 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] MPICH2 wrapper script (mpich2script) for
parallel universe
Hi Nkwebi,
I've modified my version of mp2script which now works with the "cpi"
example code that MPICH2 builds when run over multiple SMP machines.
Fancy testing it and reporting any feedback? One thing to note: there
seems to be a bug with the current version of MPICH2 (v1.0.5p3), at
least certainly when using the ch3:nemesis device, maybe even the
ch3:ssm. The fix requires changing the following line in mpiexec.py
(at around line 789):
Change this line:
msgToMPD['ifhns'][loRange] = ifhn
to this:
if ifhn: msgToMPD['ifhns'][loRange] = ifhn
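
The effect of the guard can be seen in a standalone sketch. The dictionary below only imitates the names used in the one-line fix; `record_ifhn` is a hypothetical helper, and the sample keys and address are made up for illustration.

```python
def record_ifhn(msgToMPD, loRange, ifhn):
    """Store the interface hostname only when a non-empty one was given."""
    if ifhn:
        msgToMPD['ifhns'][loRange] = ifhn

msgToMPD = {'ifhns': {}}
record_ifhn(msgToMPD, 0, '')          # empty value: nothing is stored
record_ifhn(msgToMPD, 1, '10.0.0.2')  # illustrative address: stored
```

Without the guard, the first call would have stored an empty string under key 0, which is presumably the kind of entry the fix is meant to keep out of the message.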
Thanks to Ralph Butler at MTSU for pointing this out, without which I
couldn't get mp2script to work.
Cheers,
Mark
Nkwebi Peace Motlogelwa wrote:
Thanks, Mark, for the script. I just tried it and it works fine if the
parallel program is executed on a single dedicated node. The script
starts mpd on the master node only (rank == 0, i.e. $_CONDOR_PROCNO == 0),
so if one has many dedicated execute nodes, it does not start mpd on the
other nodes. I will try to modify it to start mpd on all dedicated
execute nodes. So far I have tried using mpdboot, but it does not seem
straightforward to get the ring of mpds working.
regards..
On 3/16/07, Mark Calleja <M.Calleja@xxxxxxxxxxxxxxx> wrote:
Hi Nkwebi,
I don't know if you still need this, but you can get my copy of
mp2script at:
http://www.escience.cam.ac.uk/~mcal00/condor/mp2script.asc
Copy and paste it, and rename it as mp2script. A couple of points you
should bear in mind: I had to put a .mpd.conf file in the home directory
of the user running condor (I use dedicated condor user accounts), but I
also had to set the env var MPD_CONF_FILE in the script, otherwise mpd
failed to find the file. I also load LD_LIBRARY_PATH with the compiler
libs I used to build mpich2 (I used ifort/icc 9.1). This script works
fine with the "cpi" example that gets built by mpich2 in
/path/to/mpich2/distro/examples.
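
The two environment settings described above can be sketched as follows. This is not taken from mp2script (which is a shell script); `build_mpd_env` is a hypothetical helper, and both paths in the example call are placeholders that must be replaced with the real locations on your hosts.

```python
import os

def build_mpd_env(conf_file, compiler_lib_dir, base_env=None):
    """Return a copy of the environment with the two settings mpd needs.

    conf_file:        location of .mpd.conf for the user running condor
    compiler_lib_dir: directory holding the compiler runtime libraries
    """
    env = dict(os.environ if base_env is None else base_env)
    # Tell mpd explicitly where its configuration file lives.
    env['MPD_CONF_FILE'] = conf_file
    # Prepend the compiler libraries, keeping any existing search path.
    existing = env.get('LD_LIBRARY_PATH', '')
    if existing:
        env['LD_LIBRARY_PATH'] = compiler_lib_dir + os.pathsep + existing
    else:
        env['LD_LIBRARY_PATH'] = compiler_lib_dir
    return env

# Placeholder paths, for illustration only.
env = build_mpd_env('/home/condor/.mpd.conf', '/opt/intel/lib',
                    base_env={'LD_LIBRARY_PATH': '/usr/lib'})
```

The returned dictionary could then be passed as the `env` argument of `subprocess.Popen` when launching mpd from a Python wrapper.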
Cheers,
Mark
Nkwebi Peace Motlogelwa wrote:
> Hi all... I need a working MPICH2 wrapper script for condor's
> parallel universe... I use condor-6.8.4, but it comes with wrapper
> scripts for LAM and MPICH1 only. I tried to modify the
> mpich1script, but not winning so far... anybody using condor
> and mpich2 and willing to share their wrapper scripts?... Pls help..
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at either
https://lists.cs.wisc.edu/archive/condor-users/
http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR