[Condor-users] condor_transfer_files gives "Permission denied" with Condor 7.6.1 while it worked with Condor 7.4.4



Hi,

I'm evaluating Condor 7.6.1 to decide whether I should upgrade our
pool to the new version (from 7.4.4). However, I found an issue which
will probably prevent me from doing so, as it affects a function
that is essential for my pool. I just wanted to report it in case you
are not aware of the problem:

It happens on Windows XP x64 and Windows Server 2003 R2 (other
Windows versions may be affected as well, but I have only tested on
those two platforms): condor_transfer_data always comes back with a
"permission denied" message. The short message I always get when I
invoke the tool is:

DCSchedd::receiveJobSandbox:7003:File transfer failed for target job
3.0: SCHEDD at 10.2.4.60 failed to send file(s) to <10.2.4.60:3473>:
error reading from
C:\Condor/spool\3\0\cluster3.proc0.subproc0\dipole.out.log: permission
denied; TOOL failed to receive file(s) from <10.2.4.60:3345>
ERROR: Failed to spool job file

I've checked that the permissions on the folder the tool complains
about are OK. I also tried the exact same job with identical
configuration file entries on the same machine with Condor 7.4.4, and
it worked without problems. Whatever I try with Condor 7.6.1, I always
get this message. At the same time, I see the following in the
condor_schedd logfile:

07/12/11 15:26:46 Perm::GetAclInformation failed with error 122
07/12/11 15:26:46 DoUpload: (Condor error code 13, subcode 1) SCHEDD
at 10.2.4.60 failed to send file(s) to <10.2.4.60:3567>: error reading
from C:\Condor/spool\3\0\cluster3.proc0.subproc0\dipole.out.log:
permission denied; TOOL failed to receive file(s) from
<10.2.4.60:3537>
07/12/11 15:26:46 generalJobFilesWorkerThread(): failed to transfer
files for job 3.0
07/12/11 15:26:46 condor_write(): Socket closed when trying to write
13 bytes to <10.2.4.60:3567>, fd is 1048
07/12/11 15:26:46 Buf::write(): condor_write() failed
07/12/11 15:26:46 Return from HandleReq <spoolJobFiles> (handler:
0.047s, sec: 0.296s)
07/12/11 15:26:46 Return from Handler
<DaemonCore::HandleReqSocketHandler> 0.3430s
07/12/11 15:26:46 DaemonCore: fake thread 16 exited with status 0,
invoking reaper 6 <transferJobFilesReaper>
07/12/11 15:26:46 ERROR - Staging of job files failed!
07/12/11 15:26:46 DaemonCore: return from reaper for pid 16

When I try the same with Condor 7.4.4 (using the same debug level) I
can see in the condor_schedd logfile:

07/12 16:14:13 DoUpload: send file dipole.out.log
07/12 16:14:13 Calling Perm::userInAce() for CST\FelixWolfheimer
07/12 16:14:13 perm::UserInAce: Checking CST\FelixWolfheimer
07/12 16:14:13 Calling Perm::userInAce() for CST\FelixWolfheimer
07/12 16:14:13 perm::UserInAce: Checking BUILTIN\Administrators
07/12 16:14:13 in perm::userInLocalGroup() looking at group 'Administrators'
07/12 16:14:13 Calling Perm::userInAce() for CST\FelixWolfheimer
07/12 16:14:13 perm::UserInAce: Checking \Everyone
07/12 16:14:13 FILETRANSFER: outgoing file_command is 1 for dipole.out.log
07/12 16:14:13 ReliSock::put_file_with_permissions(): going to send
permissions 0
07/12 16:14:13 put_file: going to send from filename
C:\Condor/spool\cluster1.proc0.subproc0\dipole.out.log
07/12 16:14:13 put_file: Found file size 8203
07/12 16:14:13 put_file: sending 8203 bytes
07/12 16:14:13 ReliSock: put_file: sent 8203 bytes

It seems to me that the "Perm::GetAclInformation failed with error
122" is somehow the critical point. Condor 7.4.4 does not seem to
perform this step. Does anyone have any ideas? It would be great
to be able to use Condor 7.6, as it has some really nice features.
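
For reference, here is a minimal sketch for decoding the number in the
"failed with error N" line. It assumes the number logged by
Perm::GetAclInformation is a raw Win32 system error code, which the log
itself does not confirm. Under that assumption, 122 would be
ERROR_INSUFFICIENT_BUFFER (a buffer passed to a system call was too
small) rather than ERROR_ACCESS_DENIED (5), which might mean the
"permission denied" text is only a side effect. The function name
explain_failures and the small lookup table are illustrative and not
part of Condor.

```python
import re

# Hand-picked subset of Win32 system error codes (from winerror.h).
# Assumption: the schedd log prints the raw Win32 error value.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED",
    122: "ERROR_INSUFFICIENT_BUFFER",
}

# Matches log fragments such as:
#   Perm::GetAclInformation failed with error 122
FAILED_CALL = re.compile(r"(?P<call>[\w:]+) failed with error (?P<code>\d+)")


def explain_failures(log_text):
    """Return (call, code, symbolic name) for each 'failed with error N' match."""
    results = []
    for match in FAILED_CALL.finditer(log_text):
        code = int(match.group("code"))
        name = WIN32_ERRORS.get(code, "unknown")
        results.append((match.group("call"), code, name))
    return results


if __name__ == "__main__":
    sample = "07/12/11 15:26:46 Perm::GetAclInformation failed with error 122\n"
    for call, code, name in explain_failures(sample):
        print(f"{call}: error {code} ({name})")
```

Running this over the condor_schedd logfile text would list each failing
call next to its symbolic name. Codes missing from the table are
reported as "unknown".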