Genie,
Is your condor execute directory on NFS with root squashing? The following line is what makes me guess that it might be:
01/13 06:32:30 get_file(): Failed to open file /home/condor/execute/dir_22496/condor_exec.exe, errno = 13: Permission denied.
If EXECUTE is on an NFS mount with root squashing, then it needs to be world-writable.
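One way to check both conditions on an execute machine is sketched below in Python. It is only a rough illustration, not part of Condor: the helper names describe_execute_dir and filesystem_type are made up for this example, and the mount lookup assumes a Linux-style /proc/mounts table. It reports whether a directory is world-writable and which filesystem type it sits on.

```python
import os
import stat


def describe_execute_dir(path):
    """Return the owner uid and permission details of a directory."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    return {
        "owner_uid": st.st_uid,
        "mode": oct(mode),
        # True when users other than the owner and group may write here.
        "world_writable": bool(mode & stat.S_IWOTH),
        # Sticky bit: only a file's owner may delete it (as on /tmp).
        "sticky": bool(mode & stat.S_ISVTX),
    }


def filesystem_type(path, mounts_file="/proc/mounts"):
    """Best-effort filesystem type for path (assumes /proc/mounts format)."""
    real = os.path.realpath(path)
    best_mount, best_type = "", None
    with open(mounts_file) as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) < 3:
                continue
            mount_point, fstype = fields[1], fields[2]
            prefix = mount_point.rstrip("/") + "/"
            if real == mount_point or real.startswith(prefix):
                # Prefer the longest matching mount point; on a tie the
                # later entry wins, since it was mounted on top.
                if len(mount_point) >= len(best_mount):
                    best_mount, best_type = mount_point, fstype
    return best_type
```

If filesystem_type("/home/condor/execute") reports nfs or nfs4 and describe_execute_dir shows world_writable as False, that would be consistent with the guess above. Whether root squashing is actually enabled is decided by the export options on the NFS server, which this sketch does not inspect.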
--Dan
Genie Jhang wrote:
Hello, again.
Thanks to all of you, I succeeded in getting Condor running and connecting all the machines in our lab.
But when I finally tried to submit jobs to the machines, I found that none of them except the central manager work!
So I dug through the log files.
Here's the log.
----------------------------------------------------------------------------------------------------------------------------------
01/13 06:32:30 ******************************************************
01/13 06:32:30 ** condor_starter (CONDOR_STARTER) STARTING UP
01/13 06:32:30 ** /condor/sbin/condor_starter
01/13 06:32:30 ** SubsystemInfo: name=STARTER type=STARTER(8) class=DAEMON(1)
01/13 06:32:30 ** Configuration: subsystem:STARTER local:<NONE> class:DAEMON
01/13 06:32:30 ** $CondorVersion: 7.4.1 Dec 17 2009 BuildID: 204351 $
01/13 06:32:30 ** $CondorPlatform: I386-LINUX_RHEL3 $
01/13 06:32:30 ** PID = 22496
01/13 06:32:30 ** Log last touched time unavailable (No such file or directory)
01/13 06:32:30 ******************************************************
01/13 06:32:30 Using config source: /condor/etc/condor_config
01/13 06:32:30 Using local config sources:
01/13 06:32:30    /home/condor/condor_config.local
01/13 06:32:30 DaemonCore: Command Socket at <192.168.0.105:33714>
01/13 06:32:30 Done setting resource limits
01/13 06:32:30 Communicating with shadow <192.168.0.109:55237>
01/13 06:32:30 Submitting machine is "pheko09"
01/13 06:32:30 setting the orig job name in starter
01/13 06:32:30 setting the orig job iwd in starter
01/13 06:32:30 get_file(): Failed to open file /home/condor/execute/dir_22496/condor_exec.exe, errno = 13: Permission denied.
01/13 06:32:30 get_file(): consumed 28023 bytes of file transmission
01/13 06:32:30 DoDownload: consuming rest of transfer and failing after encountering the following error: STARTER at 192.168.0.105 failed to write to file /home/condor/execute/dir_22496/condor_exec.exe: (errno 13) Permission denied
01/13 06:32:30 WARNING: File /home/condor/execute/dir_22496/condor_exec.exe can not be accessed by Quill file transfer tracking.
01/13 06:32:30 File transfer failed (status=0).
01/13 06:32:30 ERROR "Failed to transfer files" at line 1882 in file jic_shadow.cpp
01/13 06:32:30 ShutdownFast all jobs.
------------------------------------------------------------------------------------------------------------------------------------
What on earth is the problem?
I set ALLOW_WRITE = * in the condor_config file on all the machines.
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/