Hi Yifei Li,

can you actually reach the Slurm node from your submit machine? AFAIS the IP is within the private 172.16.0.0/12 network block. It might be that you are trying to reach the Slurm node on its private interface from outside?
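Both points can be checked quickly from the submit machine. The following is only a minimal Python sketch, assuming the address 172.18.34.19 and the port 10022 that appear in the log quoted below; the helper names `in_private_block` and `can_connect` are made up for illustration and are not part of HTCondor:

```python
import ipaddress
import socket

# The private block mentioned above (172.16.0.0 - 172.31.255.255).
PRIVATE_BLOCK = ipaddress.ip_network("172.16.0.0/12")


def in_private_block(addr: str) -> bool:
    """Return True if addr lies inside 172.16.0.0/12."""
    return ipaddress.ip_address(addr) in PRIVATE_BLOCK


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Address taken from the error message in the thread (hypothetical usage).
print(in_private_block("172.18.34.19"))  # True
```

Calling `can_connect("172.18.34.19", 10022)` on the submit machine would then show whether the SSH port is reachable at all. If it returns False, the problem is more likely in the network path than in HTCondor.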

Cheers,
  Thomas

On 22/08/2023 15.36, Yifei Li wrote:
Thanks for your reply! However, this command does not work. I am reading the source code of condor_remote_cluster. If you have any other suggestions for how to specify the port, please let me know.

(base) liyifei@ubuntu:~$ condor_remote_cluster -a cse12232396@xxxxxxxxxxxx:10022 slurm
Enter the password to copy the ssh keys to cse12232396@xxxxxxxxxxxx:10022:
ssh: Could not resolve hostname 172.18.34.19:10022: Name or service not known

Yifei Li

------------------ Original ------------------
From: "Jaime Frey via HTCondor-users"<htcondor-users@xxxxxxxxxxx>;
Date: Tue, Aug 22, 2023 09:22 PM
To: "htcondor-users"<htcondor-users@xxxxxxxxxxx>;
Cc: "Jaime Frey"<jfrey@xxxxxxxxxxx>;
Subject: Re: [HTCondor-users] Remote cluster test failed when using condor_remote_cluster command

I don't know why the alternate port number in ~/.ssh/config would work with --add but not --test. You can include the alternate port number in the hostname, like so:

condor_remote_cluster -a cse12232396@xxxxxxxxxxxx:12345 slurm
condor_remote_cluster -a cse12232396@xxxxxxxxxxxx:12345
grid_resource = batch slurm cse12232396@xxxxxxxxxxxx:12345

 - Jaime

> On Aug 22, 2023, at 6:50 AM, Yifei Li <12232396@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Thank you so much!!
> I have installed the remote cluster successfully. However, our remote cluster's ssh port is not 22. How can I set the ssh port for the remote cluster? I have added the remote ssh info to ~/.ssh/config, which means I can use the ssh command without specifying the port. However, it does not work when using condor_remote_cluster -t (it works for condor_remote_cluster -add). Here is the log showing the failed test.
>
> (base) liyifei@ubuntu:~$ condor_remote_cluster -t cse12232396@xxxxxxxxxxxx
> Testing ssh to cse12232396@xxxxxxxxxxxxxxxxxxxxx!
> Testing remote submission...Passed!
> Submission and log files for this job are in /home/liyifei/bosco-test/boscotest.aYtWz
> Waiting for jobmanager to accept job...Passed
> Checking for submission to remote slurm cluster (could take ~30 seconds)...Failed
> Showing last 5 lines of logs:
> 08/22/23 19:39:27 [1938276] Error starting 172.18.34.19 GAHP: Agent pid 1938292\nssh: connect to host 172.18.34.19 port 22: Connection refused\nAgent pid 1938292 killed\n
> 08/22/23 19:39:27 [1938276] resource cse12232396@xxxxxxxxxxxx is now down
> 08/22/23 19:39:27 [1938276] (6.0) doEvaluateState called: gmState GM_INIT, remoteState 0
> 08/22/23 19:39:27 [1938276] Gahp Server (pid=1938286) exited with status 255 unexpectedly
> 08/22/23 19:39:31 [1938276] (6.0) doEvaluateState called: gmState GM_CLEAR_REQUEST, remoteState 0
>
> Yifei Li
>
> ------------------ Original ------------------
> From: "Tim Theisen"<tim@xxxxxxxxxxx>;
> Date: Tue, Aug 22, 2023 07:26 PM
> To: "htcondor-users"<htcondor-users@xxxxxxxxxxx>; "Yifei Li"<12232396@xxxxxxxxxxxxxxxxxxx>;
> Subject: Re: [HTCondor-users] Remote cluster test failed when using condor_remote_cluster command
>
> When I checked this morning, the file server is back online.
> ...Tim
>
> On 8/21/23 22:19, Tim Theisen via HTCondor-users wrote:
>> I have confirmed that the file server is currently not available. I will report back when it is operational.
>> ...Tim
>>
>> On 8/21/23 20:45, Yifei Li wrote:
>>> Thanks for your reply!
>>> I am trying to use condor_remote_cluster under a regular account. But it seems that there is a network error while downloading the installation file. Is the file server shut down? I downloaded it successfully several days ago. Could you check it for me? Thank you!
>>>
>>> ***Log***
>>> liyifei@ubuntu:~$ condor_remote_cluster --add cse12232396@1**** slurm
>>> Enter the password to copy the ssh keys to cse12232396@xxxxxxxxxxxx:
>>> Downloading release build for cse12232396@****..............................................................................................................................curl: (28) Failed to connect to research.cs.wisc.edu port 443: Connection timed out
>>> Failure
>>> Failed to download release build.
>>> Unable to download and prepare files for remote installation.
>>> Download URL: https://research.cs.wisc.edu/htcondor/tarball/10.x/10.7.0/release/condor-10.7.0-x86_64_AlmaLinux8-stripped.tar.gz
>>> Aborting installation to cse12232396@***.
>>>
>>> Yifei Li
>>>
>>> ------------------ Original ------------------
>>> From: "Jaime Frey via HTCondor-users"<htcondor-users@xxxxxxxxxxx>;
>>> Date: Tue, Aug 22, 2023 05:24 AM
>>> To: "htcondor-users"<htcondor-users@xxxxxxxxxxx>;
>>> Cc: "Jaime Frey"<jfrey@xxxxxxxxxxx>;
>>> Subject: Re: [HTCondor-users] Remote cluster test failed when using condor_remote_cluster command
>>>
>>> The condor_remote_cluster command has to be run under the regular user account under which you will be submitting your workflow jobs. You don't run it as the root user.
>>>
>>> You can use condor_remote_cluster to access two different clusters simultaneously for your workflows. One thing to keep in mind is that each submit file must name the cluster that that job should be run on, like so:
>>>
>>> grid_resource = batch slurm cluster1.foo.edu
>>>
>>> If you're using DAGMan, you can use the VARS command to set the cluster to use for a whole set of nodes in the DAG.
>>>
>>>  - Jaime
>>>
>>>> On Aug 19, 2023, at 2:22 AM, Yifei Li <12232396@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Dear HTCondor development Team,
>>>>     I can access two campus clusters, one of which is LSF based and the other Slurm based. Since I am not an administrator of these clusters and I still want to use them to execute one workflow simultaneously, I think I can use condor_remote_cluster to achieve my goal. First question: Can I use the two clusters through HTCondor to execute a workflow simultaneously?
>>>>     So far, I have made some effort toward this goal. I installed HTCondor (MiniCondor) on my PC workstation in the same local area network as the campus clusters. I tried to use the condor_remote_cluster command to add the LSF cluster and the Slurm cluster. I added them successfully and they are shown in the remote cluster list. However, when I test using the "condor_remote_cluster -t" command, the task can't be dispatched to the remote cluster. There is an idle task in condor_q.
>>>> Could you provide some suggestions to help me set up my environment? Is it possible for me to achieve my goals without root access to the clusters? Looking forward to your reply.
>>>>
>>>> ****Log from my PC workstation****
>>>> root@ubuntu:~/bosco-test/boscotest.p3SGb# condor_remote_cluster -t cse-liyf@xxxxxxxxxxxx
>>>> Testing ssh to cse-liyf@xxxxxxxxxxxxxxxxxxxxx!
>>>> Testing remote submission...Passed!
>>>> Submission and log files for this job are in /root/bosco-test/boscotest.2DBlK
>>>> Waiting for jobmanager to accept job...Passed
>>>> Checking for submission to remote lsf cluster (could take ~30 seconds)...grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory
>>>> grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory
>>>> grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory
>>>> grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory
>>>> grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory
>>>> Then failed.
>>>>
>>>> Yifei Li
>>>>
>>>> _______________________________________________
>>>> HTCondor-users mailing list
>>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>>>> subject: Unsubscribe
>>>> You can also unsubscribe by visiting
>>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>>>
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/htcondor-users/
>>
>> --
>> Tim Theisen (he, him, his)
>> Release Manager
>> HTCondor & Open Science Grid
>> Center for High Throughput Computing
>> Department of Computer Sciences
>> University of Wisconsin - Madison
>> 4261 Computer Sciences and Statistics
>> 1210 W Dayton St
>> Madison, WI 53706-1685
>> +1 608 265 5736