Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

Re: [Condor-users] Flocking problem for 7.0.5->7.2.2 submission. New security issue?

Date: Fri, 24 Apr 2009 10:33:39 -0500
From: Dan Bradley <dan@xxxxxxxxxxxx>
Subject: Re: [Condor-users] Flocking problem for 7.0.5->7.2.2 submission. New security issue?

Mark,

I'm guessing that you are not setting BIND_ALL_INTERFACES.

Starting in 7.1.1, BIND_ALL_INTERFACES is True by default. This means that setting NETWORK_INTERFACE without also setting BIND_ALL_INTERFACES=False just has the effect of controlling which interface Condor advertises, not which one it actually binds to (it binds to all of them and will therefore use whichever one the OS chooses in a particular case).

So I recommend setting BIND_ALL_INTERFACES=False and seeing if this addresses your problem.
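
As a minimal sketch, the relevant lines in the execute host's condor_config.local might look like the following. The address is the private one quoted in your message; the exact file and any surrounding settings are assumptions about your setup:

```
# Interface Condor should advertise (private RFC 1918 address from the report)
NETWORK_INTERFACE = 172.24.116.4

# Bind only to the interface above instead of all interfaces
# (the default became True in 7.1.1)
BIND_ALL_INTERFACES = False
```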


--Dan

Mark Calleja wrote:

Hi All,
(Apologies if you receive multiple copies of this post. The camgrid-users mailing list appears to be blocking another of my email addresses.)
We currently run several pools (all Linux) with v7.0.5 and are looking to upgrade piecemeal to v7.2.2. Encouraged by the entry in section 8.2 of the v7.2.2 manual, namely "We believe that Condor 7.2.x and 7.0.x are wire-compatible, and can be freely mixed between computers in a Condor pool.", we've been testing upgrading some machines. However, we're seeing jobs getting rejected when the schedd is running 7.0.5 and the startd is running 7.2.2. No other changes have been made, i.e. the configuration files have remained the same. Before I paste in the relevant parts of the log files, a bit of background: many of our machines have multiple IP addresses but Condor is forced to operate using a specific address, selected by the NETWORK_INTERFACE value in a machine's condor_config.local file. This address is always a "private" (RFC 1918) address in the range 172.24.xxx.xxx.
Here's an example. The submit host has IP address 172.24.252.25 only, whereas the execute host has two addresses: 131.111.xxx.xxx (which should *not* be used by Condor) and 172.24.116.4 (which should). So, here's the SchedLog from the submit host for when both submit and execute host are running 7.0.5 (job completes correctly):
4/20 17:45:08 Using config source: /etc/condor/condor_config
4/20 17:45:08 Using local config sources:
4/20 17:45:08    /usr/local/condor/local/condor_config.local
4/20 17:45:08    /usr/local/condor/local/condor_config.flocking
4/20 17:45:08 DaemonCore: Command Socket at <172.24.252.25:13743>
4/20 17:45:08 Initializing a VANILLA shadow for job 8.0
4/20 17:45:08 (8.0) (3799): Request to run on <172.24.116.4:9692> was ACCEPTED
4/20 17:45:09 (8.0) (3799): ZKM: setting default map to (null)
4/20 17:45:09 (8.0) (3799): Job 8.0 terminated: exited with status 0
4/20 17:45:09 (8.0) (3799): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 100
Now the corresponding relevant snippet for when the execute host has been upgraded to 7.2.2 (job fails as file transfer does not take place):
4/18 06:19:52 Using config source: /etc/condor/condor_config
4/18 06:19:52 Using local config sources:
4/18 06:19:52    /usr/local/condor/local/condor_config.local
4/18 06:19:52    /usr/local/condor/local/condor_config.flocking
4/18 06:19:52 DaemonCore: Command Socket at <172.24.252.25:14228>
4/18 06:19:52 Initializing a VANILLA shadow for job 6.0
4/18 06:19:52 (6.0) (3719): Request to run on <172.24.116.4:9668> was ACCEPTED
4/18 06:19:52 (6.0) (3719): DaemonCore: PERMISSION DENIED to unknown user from host <131.111.xxx.xxx:9633> for command 61000 (FILETRANS_UPLOAD), access level WRITE
4/18 06:19:52 (6.0) (3719): ERROR "Error from starter on XXXX.escience.cam.ac.uk: Failed to transfer files" at line 649 in file pseudo_ops.C
It would appear that in 7.2.2 Condor's trying to make use of an interface on the execute host that's not the one nominated in NETWORK_INTERFACE (in this case it's the canonical, globally routable address). Is there any reason why this has changed from 7.0.5? And is there any way of getting 7.2.2 to conform with the desired 7.0.5 behaviour?
Best regards,
Mark

