[Condor-users] 6.8.0 and NFS Problem
Hi,
I have a strange problem with 6.8.0 and NFS:
The central manager runs 6.8.0 on Linux and is also the NFS server.
The submit and execute machine runs 6.8.0 on Linux and is an NFS client of
the above machine.
In the global config I have:
USE_NFS = True
and
FILESYSTEM_DOMAIN = a.b.c
while the machines are called serv.a.b.c and cli1.a.b.c
Since both machines have two network cards, I added each machine's own
IP address to its local config in a
NETWORK_INTERFACE = 1.2.3.4
statement.
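Put together, the relevant settings look roughly like this (a sketch only;
1.2.3.4 is a placeholder, and each machine's local config carries its own
address):

```
# global config, shared by both machines
USE_NFS = True
FILESYSTEM_DOMAIN = a.b.c

# local config on each machine (placeholder address)
NETWORK_INTERFACE = 1.2.3.4
```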
condor_submit complains when I submit a job on the NFS client whose log
file is on NFS:
~> condor_submit ls.job
Submitting job(s)
WARNING: Log file /home/vetter/test.log is on NFS.
This could cause log file corruption and is _not_ recommended.
.
Logging submit event(s).
1 job(s) submitted to cluster 43.
When I submit the job on the NFS server, it works well.
The warning from condor_submit also results in condor_run not working:
~> condor_run hostname
Condor does not have write permission to this directory.
If I cd to /tmp, it works:
/tmp> condor_submit ~/ls.job
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 44.
Questions:
Is there anything wrong with my setup? I have 6.7.6 on different
machines, with a different central manager, that use this same NFS server
without problems.
Has anything changed in Condor since 6.7.6 regarding chown-ing files to and
from condor? I found this in the SchedLog of the client (I changed numbers
to names):
8/3 10:04:25 (pid:29713) Error: Unable to chown
'/home/condor/hosts/dc09/spool/cluster44.proc0.subproc0'
from condor to vetter.magic
8/3 10:04:25 (pid:29713) (44.0) Failed to chown
/home/condor/hosts/dc09/spool/cluster44.proc0.subproc0 from condor
to vetter.magic. Job may run into permissions problems when it starts.
8/3 10:04:25 (pid:29713) Error: Unable to chown
'/home/condor/hosts/dc09/spool/cluster44.proc0.subproc0.tmp' from
condor to vetter.magic
8/3 10:04:25 (pid:29713) (44.0) Failed to chown
/home/condor/hosts/dc09/spool/cluster44.proc0.subproc0.tmp from condor to
vetter.magic. Job may run into permissions problems when it starts.
8/3 10:04:25 (pid:29566) Starting add_shadow_birthdate(44.0)
8/3 10:04:25 (pid:29566) Started shadow for job 44.0 on
"<132.187.47.29:18672>", (shadow pid = 29714)
8/3 10:04:25 (pid:29566) Shadow pid 29714 for job 44.0 exited with status
100
8/3 10:04:25 (pid:29566) match (<1.2.3.4:18672>#1154544604#60) out
of jobs (cluster id 44); relinquishing
8/3 10:04:25 (pid:29566) Sent RELEASE_CLAIM to startd on
<132.187.47.29:18672>
8/3 10:04:25 (pid:29566) Match record (<1.2.3.4:18672>, 44, -1)
deleted
8/3 10:04:25 (pid:29722) Error: Unable to chown
'/home/condor/hosts/dc09/spool/cluster44.proc0.subproc0'
from vetter to condor.condor
8/3 10:04:25 (pid:29722) (44.0) Failed to chown
/home/condor/hosts/dc09/spool/cluster44.proc0.subproc0 from vetter to
condor.condor. User may run into permissions problems when
fetching sandbox.
--
Andreas Vetter Tel: +49 (0)931 888-5890
Fakultaet fuer Physik und Astronomie Fax: +49 (0)931 888-5508
Universitaet Wuerzburg