[HTCondor-users] Issues with transferring files from URLs
- Date: Mon, 03 Nov 2014 14:51:13 +0000
- From: Brian Candler <b.candler@xxxxxxxxx>
- Subject: [HTCondor-users] Issues with transferring files from URLs
A couple of questions, maybe bugs.
(1) Running a personal condor (8.0.7), given the following submit file
(foo.sub) with the input from a URL:
universe = vanilla
executable = /bin/sh
arguments = "'-c' 'cat condor_submit.html'"
transfer_executable = no
transfer_input_files = http://research.cs.wisc.edu/htcondor/manual/current/condor_submit.html
output = foo.out
error = foo.err
queue
I find that the file is not transferred at all. foo.err says:
cat: condor_submit.html: No such file or directory
The documentation says:
"For vanilla and vm universe jobs only, a file may be specified by
giving a URL, instead of a file name. The implementation for URL
transfers requires both configuration and available plug-in."
but these are indeed present (/etc/condor/condor_config has
FILETRANSFER_PLUGINS which includes /usr/lib/condor/libexec/curl_plugin)
WORKAROUND: I was able to make it work by setting "should_transfer_files
= yes".
However, is this right? Surely a URL should always be fetched,
regardless of whether or not you are in the same filesystem domain,
since URLs don't appear in the filesystem anyway?
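For anyone hitting the same problem, this is the submit file above with the workaround applied (the only change is the added should_transfer_files line):

```
universe = vanilla
executable = /bin/sh
arguments = "'-c' 'cat condor_submit.html'"
transfer_executable = no
transfer_input_files = http://research.cs.wisc.edu/htcondor/manual/current/condor_submit.html
should_transfer_files = yes
output = foo.out
error = foo.err
queue
```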
(2) Given an unimplemented URL scheme (like "https"), I found a
difference between my test personal condor node and my production condor
node. The former would leave the job idle because of a classAd matching
condition which was never true:
1 ( TARGET.HasFileTransfer &&
stringListMember("https",HasFileTransferPluginMethods) )
but the latter puts the job into a "held" (H) state, saying
Hold reason: Error from slot1@xxxxxxxxxxxxxxxx: STARTER at 192.168.6.42
failed to receive file /var/lib/condor/execute/dir_24716/xxxx.xxxx:
FILETRANSFER:1:FILETRANSFER: plugin for type https not found!
(Aside: if a plugin for https is not present, wouldn't it be better to
abort the job rather than put it into a 'held' state indefinitely, as
this isn't a condition which is likely to fix itself?)
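As a stop-gap (my own suggestion, not anything from the manual), a periodic_remove expression in the submit file can at least stop such jobs from sitting held forever. JobStatus == 5 means "held", and EnteredCurrentStatus is the time the job entered its current state:

```
# Hypothetical mitigation: remove the job once it has been held
# for more than an hour (JobStatus 5 = Held)
periodic_remove = (JobStatus == 5) && ((time() - EnteredCurrentStatus) > 3600)
```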
Anyway, I managed to drill down to the difference: the behaviour
depends on whether you set
should_transfer_files = if_needed
or
should_transfer_files = yes
I cannot find this behaviour documented anywhere. Looking at
http://research.cs.wisc.edu/htcondor/manual/current/2_5Submitting_Job.html#SECTION00354000000000000000
it says the default value is "should_transfer_files = if_needed" and
this will enable the file transfer mechanism if the machines are in
different filesystem domains. This implies to me that if the machines
are in different filesystem domains this should behave the same as
"should_transfer_files = yes", but actually the generated requirements
expressions are different in these two cases.
You can reproduce it with the following test case:
---- bar.sub ----
universe = vanilla
executable = /bin/sh
arguments = "-c true"
transfer_executable = no
transfer_input_files = https://example.net/
should_transfer_files = if_needed
#should_transfer_files = yes
error = bar.err
requirements = ( TARGET.Machine == "nonexistent" )
queue
--------
Submit with "condor_submit bar.sub", then use "condor_q -analyze <job-id>"
[Results with should_transfer_files = if_needed]
The Requirements expression for your job is:
( ( TARGET.Machine == "nonexistent" ) ) && ( TARGET.Arch == "X86_64" ) &&
( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) &&
( TARGET.Memory >= RequestMemory ) &&
( ( TARGET.HasFileTransfer ) ||
  ( TARGET.FileSystemDomain == MY.FileSystemDomain ) )
[Results with should_transfer_files = yes]
The Requirements expression for your job is:
( ( TARGET.Machine == "nonexistent" ) ) && ( TARGET.Arch == "X86_64" ) &&
( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) &&
( TARGET.Memory >= RequestMemory ) &&
( TARGET.HasFileTransfer &&
  stringListMember("https",HasFileTransferPluginMethods) )
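To make the difference concrete, here is a toy re-implementation of those two clauses in Python (plain Python, not HTCondor's ClassAd evaluator; the machine attribute values are invented for illustration). It shows why "if_needed" still matches a machine whose curl plugin doesn't speak https (the job then fails at transfer time and goes held), while "yes" never matches and the job sits idle:

```python
# Toy illustration of the two generated Requirements clauses.
# All attribute values below are made up for the example.

def matches_if_needed(machine, job):
    # Clause generated for should_transfer_files = if_needed:
    # ( TARGET.HasFileTransfer ) ||
    # ( TARGET.FileSystemDomain == MY.FileSystemDomain )
    return (machine["HasFileTransfer"]
            or machine["FileSystemDomain"] == job["FileSystemDomain"])

def matches_yes(machine, job):
    # Clause generated for should_transfer_files = yes:
    # TARGET.HasFileTransfer &&
    # stringListMember("https",HasFileTransferPluginMethods)
    return (machine["HasFileTransfer"]
            and "https" in machine["HasFileTransferPluginMethods"].split(","))

# A machine whose plugin speaks only file/http/ftp, in the same
# filesystem domain as the submit node:
machine = {
    "HasFileTransfer": True,
    "HasFileTransferPluginMethods": "file,http,ftp",
    "FileSystemDomain": "example.net",
}
job = {"FileSystemDomain": "example.net"}

print(matches_if_needed(machine, job))  # True  - job matches, then goes held
print(matches_yes(machine, job))        # False - job stays idle instead
```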
This also seems like a bug to me, as I was expecting
should_transfer_files = (yes|if_needed) to behave the same when the
nodes are in different filesystem domains. If the difference is
intentional, I think it should be documented accordingly.
To me the correct behaviour would be something like this:
- if any input or output file is a URL, then add
stringListMember("<scheme>",HasFileTransferPluginMethods) to the
requirements
- if should_transfer_files = yes, then add ( TARGET.HasFileTransfer )
- if should_transfer_files = if_needed, then add
( ( TARGET.HasFileTransfer ) ||
( TARGET.FileSystemDomain == MY.FileSystemDomain ) )
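As a sketch of what I mean (hypothetical code in Python, not HTCondor source), the transfer-related part of the requirements expression would be built like this:

```python
# Hypothetical sketch of the requirements-generation rule proposed above.
# url_schemes lists the URL schemes found in transfer_input_files /
# transfer_output_files (e.g. ["https"]).

def build_transfer_clause(should_transfer_files, url_schemes):
    clauses = []
    # Any URL scheme in use needs a matching plugin on the execute node,
    # regardless of the should_transfer_files setting.
    for scheme in url_schemes:
        clauses.append(
            'stringListMember("%s",HasFileTransferPluginMethods)' % scheme)
    if should_transfer_files == "yes":
        clauses.append("( TARGET.HasFileTransfer )")
    elif should_transfer_files == "if_needed":
        clauses.append("( ( TARGET.HasFileTransfer ) || "
                       "( TARGET.FileSystemDomain == MY.FileSystemDomain ) )")
    return " && ".join(clauses)

print(build_transfer_clause("if_needed", ["https"]))
```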
Regards,
Brian Candler.