Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] Local storage on a condor computer node.

Date: Fri, 13 Jan 2023 11:53:45 -0600 (CST)
From: Todd L Miller <tlmiller@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Local storage on a condor computer node.

> Is there a way for condor to take advantage of that? Mostly people are
> using nfs for their data files so far.

Shared filesystems can very quickly become bottlenecks for clusters, so past a certain size, it's frequently necessary to copy jobs' input files to local storage (and their output files from local storage).

> But is there a practical way to transfer very large data files through
> condor and use them from local storage on the remote condor node?

Unfortunately, the answer really is "it depends." While Condor's file transfer is rather efficient (as a protocol), whether or not it makes sense for a job to access its input and output files via the local filesystem really depends on the properties of the job.

If the job only accesses (reads or writes) a small fraction of the data in its (large) files, it will be more efficient to access those files via NFS (or another shared filesystem that does block-level transfers).

If a job reads a substantial fraction of the data in its (large) input files -- and doesn't modify those files -- it will frequently be more efficient to access those files via some sort of horizontally-scaling caching system; we use squid as an HTTP cache here. This obviously works better if the (large) input files are used by multiple jobs.

For other (small) files, we generally recommend using HTCondor file transfer, as this is efficient enough and keeps load off of the shared filesystem that would be better used for files in the first case.
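
The three cases above can be sketched as a small decision helper. This is only an illustration of the rule of thumb, not anything HTCondor provides: the names (JobFiles, suggest_access_method) and the cutoffs (LARGE_FILE_GB, SMALL_FRACTION) are hypothetical, and where "large" and "small fraction" actually fall depends on your pool.

```python
from dataclasses import dataclass


@dataclass
class JobFiles:
    """Hypothetical description of how one job uses its data files."""
    size_gb: float            # total size of the files in question
    fraction_accessed: float  # share of the bytes the job touches, 0.0 to 1.0
    modified: bool            # True if the job writes to these files


# Assumed cutoffs for illustration only; tune them for your own cluster.
LARGE_FILE_GB = 1.0
SMALL_FRACTION = 0.1


def suggest_access_method(files: JobFiles) -> str:
    """Map a job's file usage onto the three cases described above."""
    if files.size_gb < LARGE_FILE_GB:
        # Small files: HTCondor file transfer keeps load off the shared filesystem.
        return "htcondor-file-transfer"
    if files.fraction_accessed <= SMALL_FRACTION:
        # Large files, sparsely accessed: block-level shared filesystem such as NFS.
        return "shared-filesystem"
    if not files.modified:
        # Large, mostly-read, unmodified inputs: a scalable HTTP cache such as squid.
        return "http-cache"
    # Large files that are heavily read and also modified are not covered above.
    return "no-guidance"


if __name__ == "__main__":
    print(suggest_access_method(JobFiles(size_gb=50, fraction_accessed=0.9, modified=False)))
```

The ordering of the checks mirrors the order of the cases; a real site would probably also weigh how many jobs share the same input, since the cache pays off most when files are reused.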


	For a user-facing explanation, see

https://chtc.cs.wisc.edu/uw-research-computing/file-availability.html

- ToddM

PS: in the preceding, I mentioned efficiency in a few different places, which may be a little deceptive; what we care about is not absolute efficiency, but job throughput. For instance, it's usually easier to add more squid caches than it is to add more NFS servers; the caches may be less efficient in an absolute sense, but since you can add more caches, the overall throughput of the system goes up.
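
To make the throughput point concrete, here is a back-of-the-envelope sketch. The per-server rates and server counts are made-up numbers chosen only for illustration, and the model assumes load spreads perfectly evenly across servers, which real systems only approximate.

```python
def aggregate_throughput(per_server_mb_s: float, servers: int) -> float:
    """Idealized total throughput, assuming load is spread evenly across servers."""
    return per_server_mb_s * servers


# Hypothetical figures: one fast NFS server versus four slower squid caches.
nfs_total = aggregate_throughput(per_server_mb_s=1000.0, servers=1)
squid_total = aggregate_throughput(per_server_mb_s=400.0, servers=4)

print(f"NFS:   {nfs_total:.0f} MB/s")
print(f"squid: {squid_total:.0f} MB/s")
```

Under these assumed numbers each cache delivers less than the NFS server does, but four of them together deliver more, which is the sense in which throughput matters more than per-server efficiency.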

Follow-Ups:
- Re: [HTCondor-users] Local storage on a condor computer node.
  - From: Todd L Miller

References:
- [HTCondor-users] Local storage on a condor computer node.
  - From: Amy Bush
