
Re: [Condor-users] Standard Universe Jobs - File Permission Errors

Date: Tue, 24 Jul 2012 11:20:39 -0500
From: jhowes@xxxxxxxxxxxxxxxx
Subject: Re: [Condor-users] Standard Universe Jobs - File Permission Errors

If you have not already done so, check which file causes the error, and whether the failing operation is a read or a write.
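
One way to narrow this down is to run a small probe as a job on each execute machine. The following is only a sketch: `probe_directory` is a made-up helper name, and the path you pass it would be your own NFS-mounted directory. It reports the effective UID and GID the process runs under and actually tries to create a file, since `os.access` alone can be misleading on network filesystems.

```python
import os
import tempfile


def probe_directory(path):
    """Describe what the current process can do in `path` (hypothetical helper)."""
    result = {
        "euid": os.geteuid(),   # account the process is really running as
        "egid": os.getegid(),
        "exists": os.path.isdir(path),
        "readable": os.access(path, os.R_OK | os.X_OK),
        "writable": False,
        "error": None,
    }
    try:
        # Create and immediately remove a scratch file to test real write access.
        with tempfile.NamedTemporaryFile(dir=path, prefix="probe_"):
            pass
        result["writable"] = True
    except OSError as exc:
        # Keep the OS error text so read and write failures can be told apart.
        result["error"] = "%s (errno %s)" % (exc.strerror, exc.errno)
    return result
```

Printing the returned dictionary from a job that lands on each machine should show whether the directory is visible at all, whether the write fails, and under which numeric IDs the job is running.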

Have you checked the numeric UIDs for the user accounts on each machine? Is it possible they are not in sync? The same goes for group IDs.


Just a couple things I would check for.
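
As a rough sketch of the UID/GID check, assuming you collect the numeric IDs for the same username on every machine (for example by running `local_ids` on each one), something like the following would flag any that differ. The function names and the host names in the usage note are invented for illustration.

```python
import pwd


def local_ids(username):
    """Look up the UID and primary GID of `username` on this machine."""
    entry = pwd.getpwnam(username)
    return {"uid": entry.pw_uid, "gid": entry.pw_gid}


def find_mismatches(ids_by_host):
    """Given {hostname: {"uid": ..., "gid": ...}}, return the fields that differ."""
    mismatched = []
    for field in ("uid", "gid"):
        # Collect the distinct values seen for this field across all hosts.
        values = {ids[field] for ids in ids_by_host.values()}
        if len(values) > 1:
            mismatched.append(field)
    return mismatched
```

For instance, `find_mismatches({"nodeA": {"uid": 500, "gid": 500}, "nodeB": {"uid": 501, "gid": 500}})` returns `["uid"]`, which would mean the account has a different UID on the two (hypothetical) hosts, so NFS would treat it as a different owner.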


On 2012-07-24 10:41, Paul Browne wrote:

Hi, 

Thanks for replying.

These are all Linux nodes, all running Condor.x86_64.7.8.1.rhel4 on
Scientific Linux.

The problem is that Vanilla Universe jobs running on a different
execute machine than submit machine can't write to a network mounted
directory, despite this directory being present in the same place on
all machines. It must be a permissions issue of some kind, but I can't
work it out.

Regards,
Paul Browne

On 24 July 2012 15:33, <jhowes@xxxxxxxxxxxxxxxx> wrote:

Are your nodes running Linux or Windows?

On 2012-07-23 12:18, Paul Browne wrote:

Apologies, I am referring to network problems running Vanilla universe
jobs, not Standard universe.

Standard universe jobs run fine, but can't be used for our needs.

Regards,
Paul Browne

On 23 July 2012 18:16, Paul Browne <pb337@xxxxxxxxxxxxxxxx> wrote:

We have a small Condor 7.8.1 x86_64 pool of one central manager
(also a submit & execute machine) & two submit/execute machines.

When a standard universe job is submitted from one machine that
requires I/O access to directories which are NFS mounted in the same
place on each machine, the jobs will not run or produce output due
to file permission errors.

So a job submitted on one machine will only run on the machine it
was submitted from, & will not run on any other machine. I have
tried to read the admin manual about how to resolve this issue
without making network mounted directories world-writeable (which
would certainly work), but haven't made progress.

Might anyone have ideas about how our pool configuration might be
resolved to allow Condor jobs which are submitted from one machine
to execute on another machine, when they need I/O access to
directories which have been NFS mounted in the same places on all
machines in the pool?

This is a major problem, reducing our capability for time-sensitive
computations by (at present) a full two thirds, so any help would be
very, very welcome.

Kind regards,
Paul Browne

 

--
__________________________________
Mr. Paul Browne
School of Physics & Astronomy,
University of St Andrews,
North Haugh, St Andrews,
Fife, KY16 9SS,
Scotland, UK

t:  +44 (0)1334 46 3152
e:  pb337@xxxxxxxxxxxxxxxx
__________________________________




_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/




