Hi Mark,

Thanks for letting me know the fix, I appreciate it.

I added those three lines and updated the mpd.conf file
(MPD_PORT_RANGE=50001:59999). Please find the modified mpd.py attached.
It seems I didn't add them properly, because I am getting this error:

mpdboot_machine2 (handle_mpd_output 388): from mpd on machine1, invalid port info:
mpdboot error : 255

Could you please let me know whether mpd.py has the correct fix for the
port range?

Thanks,
Senthil

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mark Calleja
Sent: Tuesday, April 03, 2007 3:21 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] MPICH2 wrapper script (mpich2script) for parallel universe

Hi Senthil,

The fix for this was provided by Ralph Butler at MTSU. It involves
editing mpd.py and adding three lines, so a diff between the new and
original files gives (using v1.0.5p3 of MPICH2):

141d140
< 'MPD_PORT_RANGE' : 0,
150,151d148
< if self.parmdb['MPD_PORT_RANGE']:
<     os.environ['MPICH_PORT_RANGE'] = self.parmdb['MPD_PORT_RANGE']

After making this change, you will want to add the following in your
~/.mpd.conf file on all hosts:

MPD_PORT_RANGE=50001:59999

This works in my tests.

Regards,
Mark

Natarajan, Senthil wrote:
> Hi Mark,
> Thanks for your mp2script.
>
> I was wondering do you know how to set the port range for mpd to start
> on other machines.
>
> In your script I added this,
> export MPICH_PORT_RANGE=50001:59999
>
> so the local mpd starts in the specified port range, but the mpd started
> through mpdboot on remote machines are using random ports. How to start
> the remote mpd also in the above specified port range. Because of the
> random port number the firewall blocks the connection.
>
> Thanks,
> Senthil
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mark Calleja
> Sent: Wednesday, March 28, 2007 4:49 AM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] MPICH2 wrapper script (mpich2script) for
> parallel universe
>
> Hi Nkwebi,
>
> I've modified my version of mp2script which now works with the "cpi"
> example code that MPICH2 builds when run over multiple SMP machines.
> Fancy testing it and reporting any feedback? One thing to note: there
> seems to be a bug with the current version of MPICH2 (v1.0.5p3), at
> least certainly when using the ch3:nemesis device, maybe even the
> ch3:ssm. The fix requires changing the following line in mpiexec.py (at
> around line 789):
>
> Change this line:
>     msgToMPD['ifhns'][loRange] = ifhn
> to this:
>     if ifhn: msgToMPD['ifhns'][loRange] = ifhn
>
> Thanks to Ralph Butler at MTSU for pointing this out, without which I
> couldn't get mp2script to work.
>
> Cheers,
> Mark
>
>
> Nkwebi Peace Motlogelwa wrote:
>
>> Thanks Mark for the script...I just tried it and it works fine if the
>> parallel program is executed on a single dedicated node.. The script
>> starts mpd on the master node (rank==0 or $_CONDOR_PROCNO ==0) only,
>> and if one has many dedicated execute nodes, the script does not start
>> mpd on the other nodes.. Will try to modify it to get it to start mpd
>> on all dedicated execute nodes..so far tried using mpdboot, but it
>> seems not as straight forward to get the ring of mpd's working..
>>
>> regards..
>>
>> On 3/16/07, *Mark Calleja* <M.Calleja@xxxxxxxxxxxxxxx> wrote:
>>
>>     Hi Nkwebi,
>>
>>     I don't know if you still need this, but you can get my copy of
>>     mp2script at:
>>
>>     http://www.escience.cam.ac.uk/~mcal00/condor/mp2script.asc
>>
>>     Copy and paste it, and rename it as mp2script.
>>     A couple of points you should bear in mind: I had to put a .mpd.conf
>>     file in the home directory of the user running condor (I use
>>     dedicated condor user accounts), but I also had to set the env var
>>     MPD_CONF_FILE in the script, otherwise mpd failed to find the file.
>>     I also load LD_LIBRARY_PATH with the compiler libs I used to build
>>     mpich2 (I used ifort/icc 9.1). This script works fine with the "cpi"
>>     example that gets built by mpich2 in /path/to/mpich2/distro/examples.
>>
>>     Cheers,
>>     Mark
>>
>>     Nkwebi Peace Motlogelwa wrote:
>>     > Hi all... I need a working MPICH2 wrapper script for condor's
>>     > parallel universe...I use condor-6.8.4, but it comes with wrapper
>>     > scripts for LAM and MPICH1 only.. I tried to modify the
>>     > mpich1script, but not winning so far... anybody using condor
>>     > and mpich2 and willing to share their wrapper scripts?...Pls
>>     > help..
>>
>>     _______________________________________________
>>     Condor-users mailing list
>>     To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
>>     with a subject: Unsubscribe
>>     You can also unsubscribe by visiting
>>     https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>>     The archives can be found at either
>>     https://lists.cs.wisc.edu/archive/condor-users/
>>     http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR
Attachment:
mpd.py
Description: mpd.py
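
The two halves of the port-range fix discussed in this thread (the MPD_PORT_RANGE entry in the conf file, and the copy into MPICH_PORT_RANGE) can be sanity-checked outside mpd with a rough sketch like the one below. This is not code from MPICH2: the helper names read_conf, parse_port_range and export_port_range are made up for illustration, it assumes the conf file holds plain KEY=value lines, and it is written for Python 3 even though the mpd.py of that era is Python 2.

```python
import os


def read_conf(text):
    """Parse KEY=value lines into a dict, skipping blanks and '#' comments.

    Assumption: the conf file is a flat list of KEY=value lines.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def parse_port_range(value):
    """Return (low, high) for a 'low:high' string, or raise ValueError."""
    low_text, sep, high_text = value.partition(":")
    if not sep:
        raise ValueError("expected low:high, got %r" % value)
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError("port range out of bounds: %r" % value)
    return low, high


def export_port_range(params, environ):
    """Mimic the three-line patch: copy MPD_PORT_RANGE to MPICH_PORT_RANGE.

    'environ' is any dict-like object; pass os.environ to affect the
    real environment, or a plain dict to experiment safely.
    """
    port_range = params.get("MPD_PORT_RANGE", 0)  # 0 mirrors the patch default
    if port_range:
        parse_port_range(port_range)  # fail early on a malformed value
        environ["MPICH_PORT_RANGE"] = port_range
    return environ


if __name__ == "__main__":
    sample = "# hypothetical conf contents\nMPD_PORT_RANGE=50001:59999\n"
    print(export_port_range(read_conf(sample), {}))
```

To check a real file, read the contents of the conf file that mpd is pointed at (for example via MPD_CONF_FILE) and pass them to read_conf; passing a plain dict instead of os.environ keeps the experiment from touching the running environment.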