Re: [HTCondor-users] Downloading big files is interrupted
- Date: Thu, 9 Jan 2014 10:42:53 -0600
- From: Daniel Forrest <dan.forrest@xxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Downloading big files is interrupted
On Thu, Jan 09, 2014 at 08:42:00AM +0000, Leon Thielen wrote:
> Hi,
> we are running HTCondor version 8.1.2.
> Condor master and client are Windows 7 machines.
> Submit host is the master.
> All the files reside on a Linux machine. Linux and Windows are connected via Samba.
>
> transfer_input_files for big files is interrupted after only part of the file has been read. A job with a small input file (15,294,016 bytes) works.
> Running with a bigger file (9,195,290,624) we get
> get_file(): ERROR: received 605356032 bytes, expected 9195290624!
> Running job with an even bigger file we get
> get_file(): ERROR: received 902758400 bytes, expected 30967531520!
>
> StarterLog.slot1_1 :
>
> 01/08/14 09:59:26 setting the orig job name in starter
> 01/08/14 09:59:26 setting the orig job iwd in starter
> 01/08/14 09:59:26 Chirp config summary: IO false, Updates false, Delayed updates true.
> 01/08/14 09:59:26 Initialized IO Proxy.
> 01/08/14 09:59:26 Setting resource limits not implemented!
> 01/08/14 10:00:13 condor_read(): timeout reading 65536 bytes from <10.10.20.209:55472>.
> 01/08/14 10:00:13 ReliSock::get_bytes_nobuffer: Failed to receive file.
> 01/08/14 10:00:13 get_file(): ERROR: received 902758400 bytes, expected 30967531520!
> 01/08/14 10:00:14 DoDownload: STARTER at 10.10.20.65 failed to receive file C:\condor\execute\dir_2636\reference-big.zip
> 01/08/14 10:00:14 File transfer failed (status=0).
> 01/08/14 10:00:14 ERROR "Failed to transfer files" at line 2120 in file c:\condor\execute\dir_27920\userdir\src\condor_starter.v6.1\jic_shadow.cpp
> 01/08/14 10:00:14 ShutdownFast all jobs.
> 01/08/14 10:00:14 condor_read() failed: recv(fd=1064) returned -1, errno = 10054 , reading 5 bytes from <10.10.20.209:55479>.
> 01/08/14 10:00:14 IO: Failed to read packet header
>
> Can somebody help me to solve this issue?
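It looks like the file size is being truncated to a 32-bit number. In hex, the received byte count is essentially the expected size with the leading digit (marked with ^) dropped: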
$ printf "%09x\n" 605356032 9195290624
024150000
224150000
^
$ printf "%09x\n" 902758400 30967531520
035cf0000
735cf0800
^
This would appear to be an internal HTCondor limitation.
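
The same arithmetic can be checked in a few lines of Python. This is only a sketch of the comparison above, not HTCondor code: low32() is a hypothetical helper, and rounding down to whole 65536-byte blocks is an assumption taken from the "timeout reading 65536 bytes" line in the StarterLog.

```python
# Hypothetical helper (not part of HTCondor): keep only the low 32 bits of a size.
def low32(n: int) -> int:
    return n & 0xFFFFFFFF

# Assumption: the receiver counts whole 65536-byte reads, as suggested by the
# StarterLog line "condor_read(): timeout reading 65536 bytes".
CHUNK = 65536

# (received, expected) byte counts copied from the two get_file() errors.
cases = [
    (605356032, 9195290624),
    (902758400, 30967531520),
]

results = []
for received, expected in cases:
    wrapped = low32(expected)                  # expected size modulo 2**32
    whole_chunks = wrapped - wrapped % CHUNK   # rounded down to full 64 KiB reads
    results.append((received, wrapped, whole_chunks))
    print(f"expected={expected:#011x} low32={wrapped:#010x} received={received:#010x}")
```

In both cases the received count equals the truncated size rounded down to a whole 65536-byte block. In the second case the truncated size is 2048 bytes larger than what arrived, which would fit a final read that timed out waiting for a full block, though that part is a guess.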
--
Dan