
[Condor-users] Regarding failed to copy files back to spool directory.



Hi,
    When the Starter sends files back to the spool directory after
condor_vacate_job, it began transferring about 19 files, but partway
through it failed to transfer one file because of a permission problem.
The job was then put on hold and the transfer stopped.

Why can't Condor skip that particular file and continue transferring the
remaining files to the spool, so that we don't lose all of the files? Is
this a bug?
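
To make the requested behaviour concrete, here is a minimal Python sketch
of "skip the file that fails and carry on". It is not Condor code and does
not use any Condor API; the function name transfer_skipping_failures and
its arguments are hypothetical, and a plain local copy stands in for the
Starter-to-Shadow transfer.

```python
import os
import shutil
import tempfile


def transfer_skipping_failures(src_dir, dest_dir, names):
    """Copy each named file from src_dir to dest_dir.

    A file that cannot be read or written is recorded and skipped
    instead of aborting the whole transfer.
    Returns (copied, skipped); skipped maps file name -> error text.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied, skipped = [], {}
    for name in names:
        try:
            shutil.copy2(os.path.join(src_dir, name),
                         os.path.join(dest_dir, name))
            copied.append(name)
        except OSError as exc:
            # Covers PermissionError (errno 13) as well as missing files.
            skipped[name] = str(exc)
    return copied, skipped


if __name__ == "__main__":
    src = tempfile.mkdtemp()
    dst = tempfile.mkdtemp()
    for name in ("vmware.log", "nvram"):
        with open(os.path.join(src, name), "w") as handle:
            handle.write("data")
    # "missing.vmx" does not exist, so it stands in for the failing file.
    done, failed = transfer_skipping_failures(
        src, dst, ["vmware.log", "missing.vmx", "nvram"])
    print("copied:", done)
    print("skipped:", failed)
```

With this approach the caller still learns which files failed, so the job
could be put on hold afterwards without losing the files that did copy.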

The relevant part of the StarterLog follows.

5/13 19:27:22 Sending changed file vmware.log, t: 1242223031,
1242205712, s: 1010653, 3034
5/13 19:27:22 Sending changed file vmware-0.log, t: 1242205712,
1242205713, s: 3034, 568994
5/13 19:27:22 Sending changed file vmj9mEQU_condor.vmx, t: 1242205909,
1242205656, s: 1062, 1062
5/13 19:27:22 Skipping file vmj9mEQU_condor.vmsd, t:
1242205675==1242205675, s: 563==563
5/13 19:27:22 Sending changed file windowsxp-000001-s004.vmdk, t:
1242222793, 1242205718, s: 32571392, 31064064
5/13 19:27:23 Sending new file vmj9mEQU_condor.vmem, time==1242223043,
size==567279616
5/13 19:27:23 Skipping file vmj9mEQU_condor-Snapshot1.vmsn, t:
1242205675==1242205675, s: 8941==8941
5/13 19:27:23 Skipping file vmj9mEQU_condor-disk2-vmware.vmdk, t:
1242205637==1242205637, s: 589824==589824
5/13 19:27:23 Sending new file vmj9mEQU_condor.vmss, time==1242222795,
size==53493
5/13 19:27:23 Sending changed file nvram, t: 1242205909, 1242205656, s:
8664, 8664
5/13 19:27:23 Sending new file vmware-1.log, time==1242205713,
size==568994
5/13 19:27:23 Sending new file vmj9mEQU_condor-swapdisk-vmware.vmdk,
time==1242222792, size==13303808
5/13 19:27:23 Sending changed file windowsxp-000001-s001.vmdk, t:
1242222793, 1242205712, s: 343605248, 215613440
5/13 19:27:23 Sending changed file windowsxp-000001-s005.vmdk, t:
1242222794, 1242205712, s: 65536, 65536
5/13 19:27:23 Sending changed file windowsxp-000001-s003.vmdk, t:
1242222793, 1242205675, s: 246546432, 97517568
5/13 19:27:23 Skipping file windowsxp.vmx, t: 1242205636==1242205636, s:
1085==1085
5/13 19:27:23 Sending changed file windowsxp-000001.vmdk, t: 1242206245,
1242205712, s: 476, 476
5/13 19:27:23 Sending changed file windowsxp-000001-s002.vmdk, t:
1242222793, 1242205656, s: 147914752, 101908480
5/13 19:27:23 Sending changed file
vmj9mEQU_condor-disk2-vmware-000001.vmdk, t: 1242222793, 1242205675, s:
24903680, 589824
5/13 19:27:23 FileTransfer::UploadFiles: sent
TransKey=1#4a0a8e775db1bdb2655d5c35
5/13 19:27:23 entering FileTransfer::Upload
5/13 19:27:23 entering FileTransfer::DoUpload
5/13 19:27:23 DoUpload: send file vmware.log
5/13 19:27:23 Received GoAhead from peer to
send /vmexecute/execute/t17-1d-10/dir_12815/vmware.log.
5/13 19:27:23 Sending GoAhead for 192.168.10.7 to
receive /vmexecute/execute/t17-1d-10/dir_12815/vmware.log and all
further files.
5/13 19:27:23 ReliSock::put_file_with_permissions(): going to send
permissions 100644
5/13 19:27:23 put_file: going to send from
filename /vmexecute/execute/t17-1d-10/dir_12815/vmware.log
5/13 19:27:23 put_file: Found file size 1010653
5/13 19:27:23 put_file: sending 1010653 bytes
5/13 19:27:23 ReliSock: put_file: sent 1010653 bytes
5/13 19:27:23 DoUpload: send file vmware-0.log
5/13 19:27:24 Received GoAhead from peer to
send /vmexecute/execute/t17-1d-10/dir_12815/vmware-0.log.
5/13 19:27:24 ReliSock::put_file_with_permissions(): going to send
permissions 100644
5/13 19:27:24 put_file: going to send from
filename /vmexecute/execute/t17-1d-10/dir_12815/vmware-0.log
5/13 19:27:24 put_file: Found file size 3034
5/13 19:27:24 put_file: sending 3034 bytes
5/13 19:27:24 ReliSock: put_file: sent 3034 bytes
5/13 19:27:24 DoUpload: send file vmj9mEQU_condor.vmx
5/13 19:27:24 Received GoAhead from peer to
send /vmexecute/execute/t17-1d-10/dir_12815/vmj9mEQU_condor.vmx.
5/13 19:27:24 ReliSock::put_file_with_permissions(): going to send
permissions 100770
5/13 19:27:24 ReliSock: put_file: Failed to open
file /vmexecute/execute/t17-1d-10/dir_12815/vmj9mEQU_condor.vmx, errno =
13.
5/13 19:27:24 DoUpload: exiting at 2381
5/13 19:27:24 DoUpload: (Condor error code 13, subcode 13) STARTER at
192.168.10.109 failed to send file(s) to <192.168.10.7:9709>: error
reading from /vmexecute/execute/t17-1d-10/dir_12815/vmj9mEQU_condor.vmx:
(errno 13) Permission denied; SHADOW failed to receive file(s) from
<192.168.10.109:9610>
5/13 19:27:24 JIC::transferOutput() failed, waiting for job lease to
expire or for a reconnect attempt
5/13 19:27:24 Got SIGQUIT.  Performing fast shutdown.
5/13 19:27:24 ShutdownFast all jobs.
5/13 19:27:24 Got ShutdownFast when no jobs running.
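
The log reports errno 13 (Permission denied) when opening the .vmx file.
As a rough way to spot such files before vacating, the following Python
sketch lists the regular files in a directory that the current user
cannot read. The helper name unreadable_files is hypothetical, and the
result is only meaningful when the script runs as the same user as the
Starter, which is an assumption here.

```python
import errno
import os


def unreadable_files(directory):
    """Return the sorted names of regular files in `directory`
    that the current user is not permitted to read."""
    result = []
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False) and not os.access(
                entry.path, os.R_OK):
            result.append(entry.name)
    return sorted(result)


if __name__ == "__main__":
    # Show the symbolic name and message for the permission error.
    print(errno.errorcode[errno.EACCES], os.strerror(errno.EACCES))
    # Check the current directory; pass the job's execute directory instead.
    print(unreadable_files("."))
```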

by
Johnson


