Re: [HTCondor-users] HTCondor with Fscache

On Fri, Nov 21, 2025 at 3:34 PM Bert DeKnuydt <Bert.DeKnuydt@xxxxxxxxxxxxxxxx> wrote:

On 2025/11/21 5:02 am, gagan tiwari wrote:
> Hi Guys,
> Anyone has any ideas / advice on this? Plz let me know
> Thanks,
> Gagan

Hi Gagan,

Just my random thoughts ...

At ESAT we've been using fscache (NFS client-side caching) for many years now,
and indeed, the data 'transfer' (not in the HTCondor sense) for the first job to
access a larger data set takes significantly longer, as it primes the cache. So it
would make sense to raise the RANK of that compute node to attract similar jobs.

However, I've always found this impractical, for several reasons:

- HTCondor has zero knowledge of what data a job using NFS has really
   read. One of the big disadvantages of using NFS...

- One could somehow collect mounts and try to make some sense out of them, but ...
   that doesn't work. A job that crashes immediately might have mounted, but never
   primed the cache. The cache has its own policy for evicting stuff; the cache
   does not (easily) reveal exactly which files are in there... Some jobs mount
   stuff from all over the place for just a small config file. Some jobs
   access only a fraction of a bigger data set, which means it doesn't end up in
   the cache at all...

- So the only way to infer this knowledge is to use information from the job
   description; that could be done relatively easily, I think, but would
   again be hit and miss...

   In that case, I'd have the user specify some parameter in the JDF, like
   Needs_NFS_XYZZY = True. The machine would then run such a job and set
   Has_seen_NFS_XYZZY = <seconds ago>. And then modify the RANK expression
   so that machines with a low value for Has_seen_NFS_XYZZY attract jobs
   with Needs_NFS_XYZZY.

   But again, at least in our situation, it's not worth the effort. Especially
   as near-identical jobs would be 'pipelined' to certain machines anyhow.
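
To make the Needs_NFS_XYZZY / Has_seen_NFS_XYZZY idea a bit more concrete, here
is a rough Python model of the preference logic only. It is not HTCondor
configuration and does not use any HTCondor API; the ads are plain dicts, and
the function names and the 'horizon' cut-off are made up for illustration.

```python
def cache_rank(job_ad, machine_ad, horizon=86400):
    """Score a machine for a job; higher means the cache is more likely warm.

    job_ad and machine_ad are plain dicts standing in for ClassAds.
    horizon (seconds) is a hypothetical cut-off after which the cache
    is assumed to be cold again.
    """
    # Jobs that do not declare the data set get no cache preference.
    if not job_ad.get("Needs_NFS_XYZZY", False):
        return 0
    seconds_ago = machine_ad.get("Has_seen_NFS_XYZZY")
    # Machines that never ran such a job get no bonus either.
    if seconds_ago is None or seconds_ago < 0:
        return 0
    # A low "seconds ago" value gives a high score.
    return max(0, horizon - seconds_ago)


def pick_machine(job_ad, machines):
    """Return the machine ad with the highest cache_rank for this job."""
    return max(machines, key=lambda m: cache_rank(job_ad, m))


if __name__ == "__main__":
    machines = [
        {"Name": "node-a", "Has_seen_NFS_XYZZY": 5000},
        {"Name": "node-b", "Has_seen_NFS_XYZZY": 60},
        {"Name": "node-c"},  # never saw the data set
    ]
    job = {"Needs_NFS_XYZZY": True}
    print(pick_machine(job, machines)["Name"])
```

Note that this inherits all the caveats above: the score only says that a job
claiming to need the data ran there recently, not that the data is still cached.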

But if you only deal with a very well known set of jobs, it might be different...
You could even make a dummy job just to prime the cache, if nothing else can interfere.
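
Such a dummy job could be as simple as reading every file once. A minimal
sketch, assuming that reading a file end to end over the NFS mount is enough
for the cache to pick it up (which depends on the cache policy and size); the
function name and the example path are hypothetical.

```python
import os
import sys


def prime_cache(root, chunk_size=1 << 20):
    """Read every regular file under root once and return the bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read in chunks and discard the data; only the
                    # side effect of pulling it through the mount matters.
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Skip unreadable files instead of failing the whole job.
                continue
    return total


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python prime_cache.py /path/to/nfs/dataset
    print(prime_cache(sys.argv[1]))
```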

Cheers, B.