Thanks, Coop, for your quick reply. However, the problem remains in
6.9.5 even after taking the steps you describe (I just re-verified this to
make sure).

I have managed to get vanilla RUN_AS_OWNER jobs working with 6.9.5 by using
CREDD_HOST=$(CONDOR_HOST) (i.e. without the port setting) on both the master
and the execute node. But the real prize for me is to be able to run
vm-universe jobs with RUN_AS_OWNER, and I still cannot make this work with a
shared filesystem.

The vm_gahp log below seems to indicate that even with:

- run_as_owner = true specified in the job file,
- VM_UNIV_NOBODY_USER set in the config file to a user with a home directory, and
- ALLOW_USERS set to the same user in the config_vmgapp.vmware file,

the VM process is launched with system credentials (SYSTEM@NT AUTHORITY),
which are insufficient to access the shared virtual machine files. I have
confirmed that these files *are* visible to a vanilla job run on the same
execute node with RUN_AS_OWNER = true.

Maybe these are the perils of
running a pre-release development version...

Malcolm

1/8 11:41:03 ******************************************************
1/8 11:41:03 ** condor_vm-gahp.exe (CONDOR_VM_GAHP) STARTING UP
1/8 11:41:03 ** C:\condor\bin\condor_vm-gahp.exe
1/8 11:41:03 ** $CondorVersion: 6.9.5 Nov 28 2007 $
1/8 11:41:03 ** $CondorPlatform: INTEL-WINNT50 $
1/8 11:41:03 ** PID = 904
1/8 11:41:03 ** Log last touched 1/8 11:34:11
1/8 11:41:03 ******************************************************
1/8 11:41:03 Using config source: C:\condor\condor_config
1/8 11:41:03 Using local config sources:
1/8 11:41:03    C:\condor/condor_config.local
1/8 11:41:03 DaemonCore: Command Socket at <192.168.199.190:4756>
1/8 11:41:03 VMGAHP[904]: VM-GAHP initialized with run-mode 1
1/8 11:41:03 VMGAHP[904]: Initialize Uids: caller=SYSTEM@NT AUTHORITY, job user=SYSTEM@NT AUTHORITY
1/8 11:41:03 VMGAHP[904]: Starting worker : C:\condor/bin/condor_vm-gahp.exe -f -t -M 2
1/8 11:41:03 VMGAHP[904]: Worker pid=1588
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ******************************************************
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ** condor_vm-gahp.exe (CONDOR_VM_GAHP) STARTING UP
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ** C:\condor\bin\condor_vm-gahp.exe
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ** $CondorVersion: 6.9.5 Nov 28 2007 $
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ** $CondorPlatform: INTEL-WINNT50 $
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ** PID = 1588
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ** Log last touched time unavailable (No error)
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 ******************************************************
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 Using config source: C:\condor\condor_config
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 Using local config sources:
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03    C:\condor/condor_config.local
1/8 11:41:03 VMGAHP[904]: Worker[1588]: 1/8 11:41:03 DaemonCore: Command Socket at <192.168.199.190:4759>
1/8 11:41:03 VMGAHP[904]: Worker[1588]: VM-GAHP initialized with run-mode 2
1/8 11:41:03 VMGAHP[904]: Worker[1588]: Initialize Uids: caller=SYSTEM@NT AUTHORITY, job user=SYSTEM@NT AUTHORITY
1/8 11:41:24 condor_read(): timeout reading 5 bytes from <192.168.199.190:4752>.
1/8 11:41:24 IO: Failed to read packet header
1/8 11:41:27 VMGAHP[904]: Worker[1588]: Warning: creating filesystem with (nonstandard) Joliet extensions
1/8 11:41:27 VMGAHP[904]: Worker[1588]: but without (standard) Rock Ridge extensions.
1/8 11:41:27 VMGAHP[904]: Worker[1588]: It is highly recommended to add Rock Ridge
1/8 11:41:28 VMGAHP[904]: Worker[1588]: File(\\xxxx\xxxx\xxx\condor1\VM\vm_test\vm_test-000001.vmdk) can't be read
1/8 11:41:28 VMGAHP[904]: Worker[1588]: file(\\xxxx\xxxx\xxx\condor1\VM\vm_test\vm_test-000001.vmdk) in a vmx file cannot be read
1/8 11:41:34 VMGAHP[904]: EOF reached on DaemonCore pipe 65541
1/8 11:41:34 VMGAHP[904]: VM GAHP Worker stderr buffer closed, exiting...

From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Thompson, Cooper

Starting simply: you need to run the “condor_store_cred -c add” command,
and then restart Condor (using “net stop condor && net start condor”)
before LOCAL_CREDD=<name>:<port> will appear in the ClassAd. I believe
a condor_reconfig or a partial restart is not sufficient.

The stored password is not removed when uninstalling Condor, so if
you ran the condor_store_cred command without restarting Condor, and then
rolled back to 6.8.8, that may have caused it to work.

Coop

From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Malcolm Wilkins

I am trying to set up a small Condor pool with one submit-only/master node
(Vista) and one submit/execute node (XP). I have been trying (unsuccessfully)
to get the RUN_AS_OWNER feature working, so that submitted jobs run under the
credentials of the submitter. The jobs remain idle in the queue and do not run:
condor_q -analyze indicates the problem may be that the job requires the
execute node to advertise “LOCAL_CREDD = <hostname of CREDD host>:9620”.

Specifying CREDD_HOST=$(CONDOR_HOST):$(CREDD_PORT) in the condor_config on the
execute node (the default) *does not* work as expected: instead of
“LOCAL_CREDD = <hostname of CREDD host>:9620” being displayed in response to
condor_status -long, no information is displayed at all. (However, specifying
CREDD_HOST=$(CONDOR_HOST) does cause “LOCAL_CREDD = <hostname of CREDD host>”
to be displayed.) I then tried reverting to 6.8.8, and this version *does*
display the full information, i.e. “LOCAL_CREDD = <hostname of CREDD host>:9620”,
for the execute node.

Has anyone else come across such a problem, and is there a workaround?
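
For anyone comparing versions the same way, the check described above (does the execute node's ad from condor_status -long contain LOCAL_CREDD, and with or without the port?) can be scripted. The following is only a minimal sketch: the helper name find_local_credd and the sample host master.example.org are hypothetical, the output is assumed to have been captured to a string already, and since the exact quoting of the attribute value is an assumption, both quoted and unquoted forms are accepted.

```python
import re

# Matches lines such as:  LOCAL_CREDD = "host:9620"   or   LOCAL_CREDD = host
# (Whether the value is quoted in the real output is an assumption.)
_LOCAL_CREDD_RE = re.compile(r'\s*LOCAL_CREDD\s*=\s*"?([^"]*)"?\s*$', re.IGNORECASE)


def find_local_credd(status_long_output):
    """Return every LOCAL_CREDD value found in captured `condor_status -long` text."""
    values = []
    for line in status_long_output.splitlines():
        match = _LOCAL_CREDD_RE.match(line)
        if match:
            values.append(match.group(1).strip())
    return values


def has_port(value):
    """True if the advertised value ends in :<digits>, e.g. ':9620'."""
    return re.search(r":\d+$", value) is not None


# Hypothetical sample text standing in for real condor_status -long output.
sample = 'Name = "execnode"\nLOCAL_CREDD = "master.example.org:9620"\n'

for value in find_local_credd(sample):
    print(value, "(with port)" if has_port(value) else "(no port)")
```

An empty result corresponds to the 6.9.5 behaviour with CREDD_HOST=$(CONDOR_HOST):$(CREDD_PORT) described above, and a value without a port corresponds to the CREDD_HOST=$(CONDOR_HOST) case.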