Okay, thanks. I'll wait to hear from you before I look into it further.
Cheers,
-zach
> -----Original Message-----
> From: Brian Bockelman [mailto:bbockelm@xxxxxxxxxxx]
> Sent: Tuesday, March 28, 2017 1:05 PM
> To: Zach Miller <zmiller@xxxxxxxxxxx>
> Cc: Condor Developers <htcondor-devel@xxxxxxxxxxx>
> Subject: Re: [HTCondor-devel] Fwd: [Osg-gfactory-support] About IPv6 tests
> in ITB pool
>
> Actually, hold on there ...
>
> No one is able to confirm (yet) that they actually upgraded the
> condor_startd version to one that supports IPv6 as was suggested (grumble
> grumble). Let me get precise versions of all involved components (CCB,
> schedd, startd) to avoid setting you off on a goose chase (domestic or
> otherwise).
>
>
> Brian
>
>
> On Mar 28, 2017, at 12:55 PM, Zach Miller <zmiller@xxxxxxxxxxx> wrote:
>
> Huh. Although I am familiar with the security side of things, I
> have to admit I have no experience with IPv6. I will need to investigate,
> probably with Todd Miller's help. Thanks for the report and I will get
> back to you.
>
>
> Cheers,
> -zach
>
>
>
>
> -----Original Message-----
> From: HTCondor-devel [mailto:htcondor-devel-bounces@xxxxxxxxxxx] On Behalf Of Brian Bockelman
> Sent: Tuesday, March 28, 2017 12:44 PM
> To: Condor Developers <htcondor-devel@xxxxxxxxxxx>
> Subject: [HTCondor-devel] Fwd: [Osg-gfactory-support] About IPv6 tests in ITB pool
>
> Hi HTCondor folk,
>
> The claim from the CMS pilot operators is that the following does not
> match IPv6 addresses:
>
> ALLOW_DAEMON=*
>
> (They've had to explicitly list each worker node's IP address to move
> forward in testing...)
>
> Can someone confirm / deny that fact?
>
> Additionally, can someone look at the CCB log [2] below? It seems the
> connection reversal from the startd back to the schedd is attempting to
> go over IPv4, despite this being an IPv6-only host. MyAddress as sent by
> the CCB contains both IPv4 and IPv6 addresses; IPv4 appears to be
> selected. Thoughts?
>
> Thanks,
>
> Brian
>
>
>
> Begin forwarded message:
>
> From: Diego Davila Foyo <diego.davila@xxxxxxx>
>
> Subject: RE: [Osg-gfactory-support] About IPv6 tests in ITB pool
>
> Date: March 28, 2017 at 7:30:24 AM CDT
>
> To: Edgar M Fajardo Hernandez <emfajardohernandez@xxxxxxxxxxxxxxxx>
>
> Cc: Jeffrey Michael Dost <jdost@xxxxxxxx>,
> "bbockelm@xxxxxxxxxxx" <bbockelm@xxxxxxxxxxx>,
> Marian Zvada <Marian.Zvada@xxxxxxx>,
> "Farrukh Aftab Khan" <farrukh.aftab.khan@xxxxxxx>,
> "emfajard@xxxxxxxx" <emfajard@xxxxxxxx>,
> "osg-gfactory-support@xxxxxxxxxxxxxxxx" <osg-gfactory-support@xxxxxxxxxxxxxxxx>,
> Todor Trendafilov Ivanov <todor.trendafilov.ivanov@xxxxxxx>,
> Andrea Sciaba <Andrea.Sciaba@xxxxxxx>,
> Duncan Rand <duncan.rand@xxxxxxxxxxxxxx>,
> Marco Mascheroni <marco.mascheroni@xxxxxxx>,
> Raul Cardoso Lopes <raul.cardoso.lopes@xxxxxxx>
>
>
> Thank you Edgar, you were right about ALLOW_WRITE, but setting
> ALLOW_DAEMON = $(ALLOW_DAEMON),pilot04@cms didn't work. I had to add
> Brunel's IPv6 address explicitly to ALLOW_DAEMON to get it to work.
>
> After setting ALLOW_DAEMON = $(ALLOW_DAEMON),2001:630:10:f001::19a0
> in both the CCB and the Collector, I started to see glideins connecting
> back to the collector. I had to apply a special setting for HA to prevent
> the negotiator from going to the Fermi Central Manager whenever I set
> PREFER_IPV4=False.
>
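Since each worker node's address currently has to be appended to ALLOW_DAEMON by hand, a small helper can at least validate the literals before they go into the configuration. This is only a sketch: the function name `allow_daemon_line` is hypothetical, and the output simply mirrors the `ALLOW_DAEMON = $(ALLOW_DAEMON),<address>` form quoted above.

```python
import ipaddress


def allow_daemon_line(addresses):
    """Build an ALLOW_DAEMON line that appends validated IP literals.

    Hypothetical helper: it mirrors the setting quoted in the message above
    and raises ValueError for anything that is not an IPv4/IPv6 literal.
    """
    validated = [str(ipaddress.ip_address(a)) for a in addresses]
    return "ALLOW_DAEMON = $(ALLOW_DAEMON)," + ",".join(validated)


if __name__ == "__main__":
    print(allow_daemon_line(["2001:630:10:f001::19a0"]))
```

Entries such as pilot04@cms are rejected on purpose, because only address literals are known to work in this test setup.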
>
> I have set up one schedd for IPv6 and sent some test jobs. The
> negotiation process goes well now, but I see a problem with the claim
> process. In the logs I can see the following:
> Schedd [1]
> CCB [2]
> Startd [3]
>
> What I find strange is that the Startd is trying to connect to the
> Schedd (188.184.94.50) using IPv4. We couldn't find any reference to the
> IPv6 address of the schedd within the logs. Any thoughts?
>
> Regards,
>
> Diego
>
>
>
>
> [1]
> 03/28/17 11:33:34 Timed out requesting claim glidein_3722464_389738448@xxxxxxxxxxxxxxxxxxxxxxxx <127.0.0.1:21711>#1490692382#1#... for ddavila after REQUEST_CLAIM_TIMEOUT=240 seconds.
> 03/28/17 11:33:34 Match record (glidein_3722464_389738448@wn-a3-18-00.brunel.ac.uk <127.0.0.1:21711>#1490692382#1#... for ddavila, 171.2) deleted
> 03/28/17 11:33:34 Canceling request for claim glidein_3722464_389738448@xxxxxxxxxxxxxxxxxxxxxxxx <127.0.0.1:21711>#1490692382#1#... for ddavila 171.2
> 03/28/17 11:33:34 SECMAN: resuming command 442 REQUEST_CLAIM to startd glidein_3722464_389738448@xxxxxxxxxxxxxxxxxxxxxxxx <127.0.0.1:21711>#1490692382#1#... for ddavila from TCP port -1 (non-blocking).
> 03/28/17 11:33:34 SECMAN: TCP connection to startd glidein_3722464_389738448@xxxxxxxxxxxxxxxxxxxxxxxx <127.0.0.1:21711>#1490692382#1#... for ddavila failed.
> 03/28/17 11:33:34 Failed to send REQUEST_CLAIM to startd glidein_3722464_389738448@xxxxxxxxxxxxxxxxxxxxxxxx <127.0.0.1:21711>#1490692382#1#... for ddavila: SECMAN:2003:TCP connection to startd glidein_3722464_389738448@xxxxxxxxxxxxxxxxxxxxxxxx <127.0.0.1:21711>#1490692382#1#... for ddavila failed.|CEDAR:6007:operation was canceled
> 03/28/17 11:33:34 CLOSE TCP <[2001:1458:201:e4::100:62c]:16101> fd=17
>
> [2]
> 03/28/17 11:33:35 CCB: received request id 19416 from SCHEDD <188.184.94.50:4080?addrs=188.184.94.50-4080+[2001-1458-201-e4--100-62c]-4080&noUDP&sock=23745_4d36_179> on <[2001:1458:201:e4::100:62c]:40045> for target ccbid 17198 (registered as STARTD <127.0.0.1:21711?addrs=[2001-630-10-f001--19a0]-21711+127.0.0.1-21711&noUDP> on <[2001:630:10:f001::19a0]:8346>)
> 03/28/17 11:33:35 Address rewriting: refused for attribute MyAddress (MyAddress = "<188.184.94.50:4080?addrs=188.184.94.50-4080+[2001-1458-201-e4--100-62c]-4080&noUDP&sock=23745_4d36_179>"): the address isn't my default address. (Default: <188.185.81.179:9644?addrs=[2001-1458-d00-2--100-1ad]-9644+188.185.81.179-9644>, found in ad: <188.184.94.50:4080?addrs=188.184.94.50-4080+[2001-1458-201-e4--100-62c]-4080&noUDP&sock=23745_4d36_179>)
> 03/28/17 11:33:35 encrypting secret
> 03/28/17 11:33:35 condor_write(fd=22 STARTD <127.0.0.1:21711?addrs=[2001-630-10-f001--19a0]-21711+127.0.0.1-21711&noUDP> on <[2001:630:10:f001::19a0]:8346>,,size=408,timeout=1,flags=0,non_blocking=0)
> 03/28/17 11:34:31 condor_read(fd=22 STARTD <127.0.0.1:21711?addrs=[2001-630-10-f001--19a0]-21711+127.0.0.1-21711&noUDP> on <[2001:630:10:f001::19a0]:8346>,,size=21,timeout=1,flags=0,non_blocking=1)
> 03/28/17 11:34:31 condor_read(fd=22 STARTD <127.0.0.1:21711?addrs=[2001-630-10-f001--19a0]-21711+127.0.0.1-21711&noUDP> on <[2001:630:10:f001::19a0]:8346>,,size=263,timeout=1,flags=0,non_blocking=1)
> 03/28/17 11:34:31 encrypting secret
> 03/28/17 11:34:31 CCB: received error from target daemon STARTD <127.0.0.1:21711?addrs=[2001-630-10-f001--19a0]-21711+127.0.0.1-21711&noUDP> on <[2001:630:10:f001::19a0]:8346> with ccbid 17198 for request 19415 from (client which has gone away): failed to connect
> 03/28/17 11:34:31 CCB: client for request 19415 to target daemon STARTD <127.0.0.1:21711?addrs=[2001-630-10-f001--19a0]-21711+127.0.0.1-21711&noUDP> on <[2001:630:10:f001::19a0]:8346> with ccbid 17198 disappeared before receiving error details.
> 03/28/17 11:35:02 CollectorAd : Updating ... "< Personal Condor at vocms0803.cern.ch@xxxxxxxxxxxxxxxxx >"
> 03/28/17 11:35:02 Trying to update collector <[2001:1458:201:e4::100:535]:9618>
> 03/28/17 11:35:02 Attempting to send update via UDP to collector vocms0807.cern.ch <[2001:1458:201:e4::100:535]:9618>
> 03/28/17 11:35:02 Guess address string for host = <[2001:1458:201:e4::100:535]:9618>, port = 0
> 03/28/17 11:35:02 it was sinful string. ip = 2001:1458:201:e4::100:535, port = 9618
> 03/28/17 11:35:02 _condorOutMsg MTU changed from default to 60000
> 03/28/17 11:35:02 SECMAN: command 19 UPDATE_COLLECTOR_AD to collector vocms0807.cern.ch:9618 from UDP port 32109 (blocking, raw).
> 03/28/17 11:35:02 SECMAN: no cached key for {<[2001:1458:201:e4::100:535]:9618>,<19>}.
> 03/28/17 11:35:02 SECMAN: Security Policy:
>
>
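To check which addresses a daemon actually advertises, the `addrs=` parameter of the sinful strings in the CCB log [2] can be decoded. The sketch below infers the encoding purely from those log lines (entries joined by `+`, `:` written as `-`, IPv6 hosts in brackets); it is not HTCondor's own parser, and the function names are hypothetical.

```python
import ipaddress


def parse_sinful_addrs(sinful):
    """Return (ip, port) pairs from the addrs= parameter of a sinful string.

    Assumption, taken only from the log lines in this thread: entries are
    joined by '+', ':' is written as '-', and IPv6 hosts are wrapped in [...].
    """
    query = sinful.strip("<>").partition("?")[2]
    # Flags such as noUDP carry no '=', so they map to an empty value.
    params = dict(p.partition("=")[::2] for p in query.split("&"))
    pairs = []
    for entry in filter(None, params.get("addrs", "").split("+")):
        host, _, port = entry.rpartition("-")
        if host.startswith("["):
            host = host[1:-1].replace("-", ":")
        pairs.append((str(ipaddress.ip_address(host)), int(port)))
    return pairs


def ipv6_only(pairs):
    """Keep only the IPv6 entries of a parsed address list."""
    return [p for p in pairs if ipaddress.ip_address(p[0]).version == 6]


if __name__ == "__main__":
    schedd = ("<188.184.94.50:4080?addrs=188.184.94.50-4080+"
              "[2001-1458-201-e4--100-62c]-4080&noUDP&sock=23745_4d36_179>")
    print(ipv6_only(parse_sinful_addrs(schedd)))
```

Run against the schedd's MyAddress from the log, this shows the IPv6 address is present in `addrs=` even though the primary address of the sinful string is IPv4.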
> [3]
> 03/28/17 09:44:49 (pid:3625703) attempt to connect to <131.225.205.29:9668> failed: Network is unreachable (connect errno = 101).
> 03/28/17 09:44:49 (pid:3625703) ERROR: SECMAN:2003:TCP connection to collector cmssrv215.fnal.gov:9668 failed.
> 03/28/17 09:44:49 (pid:3625703) Failed to start non-blocking update to <131.225.205.29:9668>.
> 03/28/17 09:48:26 (pid:3625703) attempt to connect to <188.184.94.50:4080> failed: Network is unreachable (connect errno = 101). Will keep trying for 300 total seconds (300 to go).
>
> 03/28/17 09:49:03 (pid:3625703) attempt to connect to <188.184.94.50:4080> failed: Network is unreachable (connect errno = 101).
>
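A quick way to confirm that every failed connection in the Startd log [3] targets an IPv4 address is to scan the log for the "attempt to connect" messages and classify each target by address family. This is a rough sketch with a hypothetical function name; the regular expression is written against the message wording shown in this thread only and assumes the targets are IP literals, not hostnames.

```python
import ipaddress
import re

# Matches the failure message as it appears in the startd log quoted above.
CONNECT_FAIL = re.compile(
    r"attempt to connect to <(\[[^\]]+\]|[^:>]+):(\d+)> failed: "
    r"(.+?) \(connect errno = (\d+)\)"
)


def failed_connects(log_text):
    """Return (family, ip, port, errno) for each failed connect attempt."""
    # Collapse all whitespace so mail-wrapped log lines still match.
    flat = " ".join(log_text.split())
    results = []
    for host, port, _reason, err in CONNECT_FAIL.findall(flat):
        ip = ipaddress.ip_address(host.strip("[]"))
        results.append(("IPv%d" % ip.version, str(ip), int(port), int(err)))
    return results


if __name__ == "__main__":
    sample = """03/28/17 09:48:26 (pid:3625703) attempt to connect to
    <188.184.94.50:4080> failed: Network is unreachable (connect
    errno = 101)."""
    for row in failed_connects(sample):
        print(row)
```

If every row reports IPv4 with errno 101 on an IPv6-only host, that is consistent with the startd selecting the IPv4 address for the reversed connection.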