HTCondor Project List Archives




Re: [Condor-devel] file transfer type



On Mon, Jan 29, 2007 at 04:02:49PM -0600, Jiansheng Huang wrote:
> 
> Hi,
> 
> I work on the new version of quill (quillpp as we call it). After the
> schema meeting with Todd and Greg, we decided to add a column to the
> transfers table to record the transfer type. But I am not sure what data
> type is the best to use for the column? a string or a number? Would anyone
> with expertise on file transferring throw in some suggestions here?

i wasn't at the meeting, so i can't really comment on the intended meaning for
the "transfer_type."  i do have some other suggestions, which maybe you already
discussed, or are possibly overkill, but updating the schema is (hopefully)
not something you do often so you may wish to include these extra fields now
even if they are not being used.


> here is the schema for the transfer table:

comments below:


> CREATE TABLE  transfers (
> globaljobid  	varchar(4000),
> src_name  	varchar(4000),
> src_host  	varchar(4000),
> src_port	integer,
> src_path 	varchar(4000),
> dst_name  	varchar(4000),
> dst_host  	varchar(4000),
> dst_port        integer,
> dst_path  	varchar(4000),

how about adding two strings, src_protocol, and dst_protocol?  sure, we
don't really support 3rd party transfers at the moment, but stork does,
and it's not inconceivable that we'd use stork for file transfer.  and
it's even more likely that we'd support sending local files (src_protocol ==
"file") to remote execution nodes via a number of different protocols, e.g.
(dst_protocol == "cedar"), (dst_protocol == "gsiftp"), etc.

furthermore, if you support some of these methods, you'll need a credential.
ftp has a username and password, gsiftp has a certificate, etc.  maybe you
should add a src_credential and dst_credential BLOB or whatever that each
protocol could use to store its authentication information in whatever
format it needs.
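
to make the shape of that concrete, here is a rough python sketch of the two
suggestions above (a protocol string and an opaque credential blob per side).
TransferEndpoint and to_row are made-up names for illustration only, nothing
in quill or condor is called that, and the host names are placeholders.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TransferEndpoint:
    """One side of a transfer (hypothetical helper, not part of quill)."""
    protocol: str                       # e.g. "file", "cedar", "gsiftp"
    name: str
    host: str
    port: Optional[int]                 # None when the protocol has no port
    path: str
    credential: Optional[bytes] = None  # opaque blob; format is up to the protocol


def to_row(src: TransferEndpoint, dst: TransferEndpoint) -> dict:
    """Flatten both endpoints into src_*/dst_* column values."""
    row = {}
    for prefix, endpoint in (("src", src), ("dst", dst)):
        for field in fields(endpoint):
            row[f"{prefix}_{field.name}"] = getattr(endpoint, field.name)
    return row


# a local file sent to an execute node over gsiftp (placeholder values)
example = to_row(
    TransferEndpoint("file", "submit", "submit.example.org", None, "/home/u/in.dat"),
    TransferEndpoint("gsiftp", "exec", "exec.example.org", 2811, "/scratch/in.dat",
                     credential=b"...proxy bytes..."),
)
```

keeping the credential as raw bytes means the table doesn't have to know
anything about how a given protocol encodes its authentication data.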


> transfer_size_bytes   numeric(38),
> elapsed  	numeric(38),
> src_daemon      varchar(30),
> dst_daemon  	varchar(30),

no comment, really...


> checksum    	varchar(32),

IMHO, that is too small.  md5 already fills all 32 characters when written
out in human-readable hexadecimal.  sha1 and ripemd-160 need 40, and future
checksums are/will be bigger.
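
a quick way to check how wide the column would have to be is to ask python's
hashlib for the digest sizes.  this is just a sketch; hex_width is a made-up
helper name, and the list of algorithms is my own pick, not something the
schema mandates.

```python
import hashlib


def hex_width(algorithm: str) -> int:
    """Number of characters a digest needs when written as hexadecimal."""
    # each digest byte becomes two hex characters
    return hashlib.new(algorithm).digest_size * 2


widths = {name: hex_width(name) for name in ("md5", "sha1", "sha256", "sha512")}

for name, width in widths.items():
    fits = "fits" if width <= 32 else "does NOT fit"
    print(f"{name}: {width} hex chars, {fits} in varchar(32)")
```

so anything past md5 overflows varchar(32); sizing the column for the widest
digest you can imagine storing is cheap insurance.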


> tranfer_time	timestamp(3) with time zone,
> last_modified	timestamp(3) with time zone,
> transfer_type   ?

unix file permissions?  uid/gid/username of owner?

also, when transferring x509 credentials, they are not actually transferred but
delegated over the wire.  you might wish to have a "delegation_protocol_id"
which is 0 (undefined) in the case of normal files which require no delegation,
but otherwise contains the version of the protocol used to delegate the proxy
(of which there is currently only 1, but could change as proxy files change,
 e.g. with extended attributes like VOMS has).

there's also the flag for whether or not encryption is required for this file.
it assumes you can establish a secure channel if needed, so it's just a yes/no
flag, and doesn't include any crypto keys.
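
here is a small sketch of what those extra fields might look like as columns.
it uses python's sqlite3 with an in-memory database purely so it can be run
anywhere; that is an assumption for the demo, not the database quill uses, and
every column name except delegation_protocol_id is invented for illustration.

```python
import sqlite3

# hypothetical extra columns: (name, declaration)
EXTRA_COLUMNS = [
    ("file_mode", "INTEGER"),                                 # unix permission bits
    ("owner_uid", "INTEGER"),
    ("owner_gid", "INTEGER"),
    ("owner_name", "TEXT"),
    ("delegation_protocol_id", "INTEGER NOT NULL DEFAULT 0"), # 0 = no delegation
    ("encryption_required", "INTEGER NOT NULL DEFAULT 0"),    # yes/no flag, no keys
]

conn = sqlite3.connect(":memory:")
# cut-down stand-in for the real transfers table
conn.execute("CREATE TABLE transfers (globaljobid TEXT, src_path TEXT, dst_path TEXT)")
for name, declaration in EXTRA_COLUMNS:
    conn.execute(f"ALTER TABLE transfers ADD COLUMN {name} {declaration}")

# an ordinary file: nothing delegated, encryption not requested
conn.execute(
    "INSERT INTO transfers (globaljobid, src_path, dst_path, file_mode) "
    "VALUES (?, ?, ?, ?)",
    ("job#1", "/tmp/in.dat", "/scratch/in.dat", 0o644),
)
row = conn.execute(
    "SELECT delegation_protocol_id, encryption_required, file_mode FROM transfers"
).fetchone()
```

defaulting both delegation_protocol_id and the encryption flag to 0 means rows
for plain files need no special handling.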


like i said, i don't know if you have been over this or not, but those are my
suggestions off the top of my head.  i know in condor we had to go back and
add a bunch of those, so you might want to support them too, or at least leave
yourself the necessary fields should you wish to do so in the future.


cheers,
-zach (who has apparently spent too much time in file_transfer.C)