Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
It was the firewall. Now I see them. It was the default firewall on Red Hat Enterprise Linux 5.1. I did not know that Red Hat had one built in, and thought I only had the campus firewall to deal with.
Thanks
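For anyone who hits the same "No route to host" (errno 113) message while ping and ssh still work: a host firewall that rejects the connection is one possible cause, and it can be checked from the execute machine without involving Condor. The following is only a minimal sketch, assuming Python 3 is available there. The helper name check_tcp_port and the 3-second default timeout are illustrative assumptions, not anything provided by Condor.

```python
import errno
import socket


def check_tcp_port(host, port, timeout=3.0):
    """Attempt a plain TCP connect to host:port and return (ok, reason)."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer at all: packets may be silently dropped along the way.
        return False, "timed out"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same error the Condor logs report as errno 113 on Linux.
            return False, "no route to host (a firewall reject rule can cause this)"
        if exc.errno == errno.ECONNREFUSED:
            # Host answered, but nothing is listening on that port.
            return False, "connection refused"
        return False, "failed: {}".format(exc)
```

Calling check_tcp_port("144.167.99.210", 9618) from the execute machine (the collector address from the logs in this thread) should return (True, "connected") once the firewall on the central manager allows TCP port 9618.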
----- Original Message -----
From: Matthew Farrellee <matt@xxxxxxxxxx>
Date: Friday, November 20, 2009 12:45 am
Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>

> You should ensure you have connectivity to the ports Condor is
> using. The Collector will be on 9618, which might be blocked by
> a firewall.
>
> Best,
>
> matt
>
> Charles Embry wrote:
> > I already have that set in the condor_config files of the machines.
> > 144.167.99.210 is the IP of the central manager's network interface
> > card that is connected. It is the only NIC connected on the machine,
> > and it can open a web browser to the internet, and ssh to and ping
> > other machines on the same router. But in Condor the machines will
> > not connect to each other. I run condor_master on both machines and
> > they can never connect. :(
> >
> > ----- Original Message -----
> > From: hailong.yang1115 <hailong.yang1115@xxxxxxxxx>
> > Date: Thursday, November 19, 2009 9:09 pm
> > Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
> > To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> >
> >> Hi Charles,
> >>
> >> You can try adding the following:
> >>
> >> NETWORK_INTERFACE=your specific network interface
> >>
> >> to the configuration file to see if it works.
> >>
> >> Good luck!
> >>
> >> -Hailong
> >>
> >> 2009-11-20
> >>
> >> ***********************************************
> >> * Hailong Yang, PhD. Candidate
> >> * Sino-German Joint Software Institute,
> >> * School of Computer Science&Engineering, Beihang University
> >> * Phone: (86-010)82315908
> >> * Email: hailong.yang1115@xxxxxxxxx
> >> * Address: G413, New Main Building in Beihang University,
> >> * No.37 XueYuan Road,HaiDian District,
> >> * Beijing,P.R.China,100191
> >> ***********************************************
> >>
> >> From: Charles Embry
> >> Sent: 2009-11-20 05:29:53
> >> To: condor-users
> >> Cc:
> >> Subject: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
> >>
> >> The Condor pool that I am trying to set up is on the same server
> >> rack/router, and the machines can ping and ssh each other. But in
> >> Condor they don't seem to be communicating: condor_status never
> >> shows the execute machine that I am trying to add to the central
> >> manager (which is also a submit and execute machine). The machines
> >> are all Sun Microsystems Sun Fire servers. They all have 4 NICs
> >> (network interface cards). We are only using one (we have no need
> >> at this time to use all of them), and the other three on each
> >> machine are not hooked up to anything.
> >>
> >> On the execute machine I get this error in the log files:
> >>
> >> Master log__________
> >>
> >> 11/16 17:07:18 DaemonCore: Command Socket at <144.167.99.201:49652>
> >> 11/16 17:07:18 Started DaemonCore process "/root/Desktop/condor-7.2.4/sbin/condor_startd", pid and pgroup = 27436
> >> 11/16 17:07:23 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113). Will keep trying for 20 total seconds (20 to go).
> >>
> >> 11/16 17:07:44 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113).
> >>
> >> StartLog__________
> >> 11/19 15:48:58 slot1: State change: IS_OWNER is false
> >> 11/19 15:48:58 slot1: Changing state: Owner -> Unclaimed
> >> 11/19 15:49:23 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113).
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
> >> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
> >> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
> >> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Failed to create security session to <144.167.99.210:9618> with TCP.|SECMAN:2003:TCP connection to <144.167.99.210:9618> failed.
> >>
> >> The condor_collector daemon is using port 9618 on the central
> >> manager, and that is the port that the execute machine is trying to
> >> connect to. Why do the machines not connect in Condor (No route to
> >> host??) when they can ping and ssh each other?
> >> Do I need to set something to make Condor use the only network
> >> interface that is connected? Or is it the port that is being used
> >> by the collector on the central manager?
> >>
> >> Thanks for the much needed help.
> >>
> >> _______________________________________________
> >> Condor-users mailing list
> >> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> >> subject: Unsubscribe
> >> You can also unsubscribe by visiting
> >> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >>
> >> The archives can be found at:
> >> https://lists.cs.wisc.edu/archive/condor-users/