Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
- Date: Fri, 20 Nov 2009 19:53:39 -0600
- From: Charles Embry <csembry@xxxxxxxx>
- Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
It was the firewall. Now I see them. It was the default firewall on Red Hat Enterprise Linux 5.1. I did not know that Red Hat had one built in, and thought I only had the campus firewall to deal with.
Thanks
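
For anyone hitting the same symptom: ping and ssh working while Condor reports "No route to host" on port 9618 is consistent with a host firewall rejecting that one port, since a firewall can reject a TCP connection with an ICMP "host prohibited" reply that the connecting side reports as errno 113. Below is a minimal sketch of a check that can be run from the execute machine. It only tests whether a TCP connection to the collector port can be opened; the function name can_connect and the 5 second default timeout are illustrative choices, not part of Condor.

```python
import socket
from typing import Tuple


def can_connect(host: str, port: int, timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt one TCP connection and report the outcome.

    Returns (True, "connected") on success, otherwise (False, <error text>),
    where the error text is whatever the operating system reported
    (for example "No route to host" or "Connection refused").
    """
    try:
        # create_connection resolves the address and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False, "{}: {}".format(type(exc).__name__, exc)
```

Calling can_connect("144.167.99.210", 9618) from the execute machine, with the central manager address taken from the logs in this thread, should return True once the firewall on the central manager allows the collector port. If it still fails while ssh to the same address works, the port is being filtered somewhere between the two machines.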
----- Original Message -----
From: Matthew Farrellee <matt@xxxxxxxxxx>
Date: Friday, November 20, 2009 12:45 am
Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> You should ensure you have connectivity to the ports Condor is
> using. The Collector will be on 9618, which might be blocked by
> a firewall.
>
> Best,
>
>
> matt
>
>
> Charles Embry wrote:
> > I already have that set in the condor_config files of the machines.
> > 144.167.99.210 is the IP of the central manager's network interface
> > card that is connected. It is the only NIC connected on the machine,
> > and it can open a web browser to the internet, and ssh to and ping
> > other machines on the same router. But in Condor the machines will
> > not connect to each other. I run condor_master on both machines and
> > they can never connect. :(
> >
> > ----- Original Message -----
> > From: hailong.yang1115 <hailong.yang1115@xxxxxxxxx>
> > Date: Thursday, November 19, 2009 9:09 pm
> > Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
> > To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> >
> >> Hi Charles,
> >>
> >> You can try adding the following:
> >>
> >> NETWORK_INTERFACE=your specific network interface
> >>
> >> to the configuration file to see if it works.
> >>
> >> Good luck!
> >>
> >> -Hailong
> >>
> >> 2009-11-20
> >>
> >> ***********************************************
> >> * Hailong Yang, PhD Candidate
> >> * Sino-German Joint Software Institute,
> >> * School of Computer Science & Engineering, Beihang University
> >> * Phone: (86-010)82315908
> >> * Email: hailong.yang1115@xxxxxxxxx
> >> * Address: G413, New Main Building in Beihang University,
> >> *          No.37 XueYuan Road, HaiDian District,
> >> *          Beijing, P.R.China, 100191
> >> ***********************************************
> >>
> >> From: Charles Embry
> >> Sent: 2009-11-20 05:29:53
> >> To: condor-users
> >> Cc:
> >> Subject: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
> >>
> >> The Condor pool that I am trying to set up is on the same server
> >> rack/router, and the machines can ping and ssh to each other. But in
> >> Condor they don't seem to be communicating: condor_status never
> >> shows the execute machine that I am trying to add to the central
> >> manager (which is also a submit and execute machine). The machines
> >> are all Sun Microsystems Sun Fire servers. They each have 4 NICs
> >> (network interface cards). We are only using one (we have no need at
> >> this time to use all of them), and the other three on each machine
> >> are not hooked up to anything.
> >>
> >> On the execute machine I get this error in the log files:
> >>
> >> Master log__________
> >>
> >> 11/16 17:07:18 DaemonCore: Command Socket at <144.167.99.201:49652>
> >> 11/16 17:07:18 Started DaemonCore process "/root/Desktop/condor-7.2.4/sbin/condor_startd", pid and pgroup = 27436
> >> 11/16 17:07:23 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113). Will keep trying for 20 total seconds (20 to go).
> >>
> >> 11/16 17:07:44 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113).
> >>
> >> StartLog__________
> >> 11/19 15:48:58 slot1: State change: IS_OWNER is false
> >> 11/19 15:48:58 slot1: Changing state: Owner -> Unclaimed
> >> 11/19 15:49:23 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113).
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
> >> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
> >> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
> >> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
> >> 11/19 15:49:23 ERROR: SECMAN:2004:Failed to create security session to <144.167.99.210:9618> with TCP.|SECMAN:2003:TCP connection to <144.167.99.210:9618> failed.
> >
> >
> >> The condor_collector daemon is using port 9618 on the central
> >> manager, and that is the port the execute machine is trying to
> >> connect to. Why do the machines not connect in Condor ("No route
> >> to host") when they can ping and ssh to each other? Do I need to
> >> set something to make Condor use the only network interface that
> >> is connected? Or is the problem the port being used by the
> >> collector on the central manager?
> >>
> >> Thanks for the much needed help.
> >>
> >> _______________________________________________
> >> Condor-users mailing list
> >> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> >> subject: Unsubscribe
> >> You can also unsubscribe by visiting
> >> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >>
> >> The archives can be found at:
> >> https://lists.cs.wisc.edu/archive/condor-users/