Hi,
I had the same problem as Jeffrey in the same scenario (see mails below). I tried to fix it using:
    condor_store_cred -c add
After that, when submitting a job from the slave, the following problem occurred:
    No credential stored for Tom@SLAVE
But
    condor_store_cred add
complains:
    make sure your HOSTALLOW_WRITE setting includes this host.
This surprises me, since HOSTALLOW_WRITE and HOSTALLOW_CONFIG are set to * on all machines.
Does anyone have a hint?
Best regards,
Tom Paschenda
When your job tries to start, it probably uses a shared pool password to authenticate against the credd. Did you set the shared pool password on all machines?
    condor_store_cred -c add
Mike
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jeffrey Stephen
Sent: 08 March 2007 06:40
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] jobs don't run when using condor_credd
Hi,
I am trying to set up condor_credd on Windows XP. I have a central manager machine (nes30700) and one submit/execute (i.e. slave) machine (nes15300). The slave machine is configured to always run jobs:
=================================================================
> condor_status

Name          OpSys    Arch   State      Activity  LoadAv  Mem   ActvtyTime

vm1@NES30700. WINNT51  INTEL  Owner      Idle       0.040  1023  0+00:05:15
vm2@NES30700. WINNT51  INTEL  Owner      Idle       0.000  1023  0+00:05:16
nes15300.land WINNT51  INTEL  Unclaimed  Idle      -0.010  1022  0+00:09:55
=================================================================
To run jobs I had to use "condor_store_cred" to set my password. I did this on both the central manager and the slave machine. (Is that correct?)
Once that was done, I could successfully run a test program using condor_submit.
I want to use a shared filesystem, so I tried to set up condor_credd. I did the following:
1. copied the example file (etc/condor_config.local.credd) into condor_config.local in the condor main directory on both the central manager and the slave machines;
2. added the following lines to the condor_config file (on both the central manager and the slave machines):
    STARTER_ALLOW_RUNAS_OWNER = True
    CREDD_HOST = nes30700.lands.resnet.qg
    CREDD_CACHE_LOCALLY = True
    SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
3. modified the condor_config file (on both the central manager and the slave machines):
    COLLECTOR_NAME = QCCCE_condor
where "QCCCE_condor" is the name of my condor pool
4. started condor on both the central manager and the slave machines (using net start condor)
The condor_master, condor_collector, condor_credd, condor_negotiator, condor_schedd and condor_startd daemons started on both machines. I thought condor_negotiator and condor_collector were only supposed to run on the central manager machine, but they were running on both the central manager and the slave machine.
5. added "run_as_owner = true" to the job config file
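For reference, a minimal sketch of what such a job config (submit description) file might look like with that line added. The executable and the output, error and log file names below are placeholders chosen for illustration, not the ones from my actual test job:

```
# Sketch of a submit description file; all file names are hypothetical.
universe     = vanilla
executable   = test.bat
output       = test.out
error        = test.err
log          = test.log

# Ask Condor to run the job under the submitting user's account
# (the starter must also have STARTER_ALLOW_RUNAS_OWNER = True, as in step 2).
run_as_owner = true

queue
```

A file like this would be submitted with condor_submit in the usual way.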