Re: [HTCondor-users] issues getting with condor
- Date: Wed, 29 May 2013 16:44:13 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] issues getting with condor
On 5/29/2013 3:31 PM, Dunn, George Jr wrote:
Hi all,
I have installed condor from source, the tarball, and the repo (I am on
CentOS 6) all with similar results.
There are two questions I have at this point.
1) When people say that the daemons must be started as root, does this
mean that they should all show up as running as root?
By default, the ps command shows effective uid of the process, not the
"real" uid. When you start the HTCondor daemons as root, they try to
spend 99% of their time running as user "condor" and only switch back to
effective user root when they need to do something as root (this is
defensive programming). So even if you start the condor_master as user
root, typically "ps" will show it running as user "condor".
To verify that the daemons really have root access (e.g. that
condor_master was started as root, as required for HTCondor to run jobs
as the submitting user), you could do "ps axo pid,ruid,cmd" to display
the real uid (ruid) for each process -- an ruid of 0 is root.
Or, since RealUid also appears in the master ClassAd, you could do
condor_status -master -l | grep RealUid
and verify that RealUid is 0 for all machines.
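The two checks above could be scripted roughly as follows. This is only a sketch: it assumes a Linux procps "ps" and the HTCondor command-line tools on the PATH, and the helper names non_root_condor and all_masters_root are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print condor_* processes whose real uid is not 0.
# Expects "pid ruid cmd..." lines on stdin, one process per line.
non_root_condor() {
    awk '$3 ~ /condor_/ && $2 != 0 { print $1, $2, $3 }'
}

# Hypothetical helper: succeed only if at least one "RealUid = N" line
# is seen on stdin and every N is 0.
all_masters_root() {
    awk '$1 == "RealUid" { seen = 1; if ($3 != 0) bad = 1 }
         END { exit (bad || !seen) }'
}

# Usage on a live pool (not run here):
#   ps ax -o pid= -o ruid= -o cmd= | non_root_condor
#   condor_status -master -l | all_masters_root && echo "all masters have RealUid 0"
```

Any line printed by non_root_condor is an HTCondor process whose real uid is not root, which would suggest the master was not started as root on that machine.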
2) If so, and that is not the case (i.e. all but condor_procd are running as
the user condor), is this why I am getting
Failed to open '/home/<user>/condor-test/simple.out' as standard output:
Permission denied (errno 13)
when I try the example here:
http://research.cs.wisc.edu/htcondor/tutorials/intl-grid-school-3/submit_first.html
Or here:
http://spinningmatt.wordpress.com/2010/07/26/getting-started-installing-a-single-node-condor-pool/
I saw an earlier mailing list question from 2007 that seems to address
this issue (hence question 1)
https://lists.cs.wisc.edu/archive/htcondor-users/2007-April/msg00175.shtml
It also mentions the UID_DOMAIN name matching but at this point this
node has a resolvable FQDN that is set as the hostname and is the only
node in the pool and has manager, submit, and execute roles.
Can anyone please help? I REALLY want to use this! :)
Did you configure slot users in your config file with
SLOT<N>_USER? I am guessing you did not, but if you did, then the slot
user specified must have write access to the job's output location.
Assuming you did not configure slot users, try setting
TRUST_UID_DOMAIN = True
SOFT_UID_DOMAIN = True
in your condor_config file(s) then do a condor_reconfig.
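To confirm that the daemons picked up the new values, something like the following might help. It is a sketch that assumes condor_config_val and condor_reconfig are on the PATH; the function name show_uid_settings is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the configured value of each UID-related
# setting, one "NAME = value" line per setting.
show_uid_settings() {
    for name in UID_DOMAIN TRUST_UID_DOMAIN SOFT_UID_DOMAIN; do
        printf '%s = %s\n' "$name" "$(condor_config_val "$name")"
    done
}

# Usage on the pool node, after editing condor_config (not run here):
#   condor_reconfig && show_uid_settings
```

If either of the two new settings does not come back as True, the edit probably went into a config file that HTCondor is not reading.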
regards,
Todd