On 2025/11/21 5:02 am, gagan tiwari wrote:
> Hi Guys,
> Anyone has any ideas / advice on this? Plz let me know
> Thanks,
> Gagan
Hi Gagan,
Just my random thoughts ...
At ESAT we've been using fscache (NFS client-side caching) for many years now.
And indeed, the data 'transfer' (not in the HTCondor sense) for the first job to access
larger data sets takes significantly longer, as it primes the cache. So it would
make sense to improve the RANK of the compute node to attract similar jobs.
However, I've always found this impractical, for several reasons:
- HTCondor has zero knowledge of what data a job using NFS has really
  read. One of the big disadvantages of using NFS...
- One could somehow collect mounts and try to make some sense out of it, but ...
  it doesn't. A job that crashes immediately might have mounted, but never primed
  the cache. The cache has its own policy to get stuff out; the cache does
  not (easily) reveal exactly what files are in there... Some jobs mount
  stuff from all over the place for just a small config file. Some jobs
  access only a fraction of bigger data sets, which means it doesn't end up in the
  cache at all...
- So the only way to infer this knowledge is using information from the job
  description; that could be done relatively easily I think, but would
  again be hit and miss...
  In that case, I'd have the user specify some parameter in the JDF, like
  Needs_NFS_XYZZY = True; the machine would then run such a job and set
  Has_seen_NFS_XYZZY = <seconds ago>. And then modify the RANK expression
  so that machines with a low value for Has_seen_NFS_XYZZY attract jobs
  with Needs_NFS_XYZZY.
  But again, at least in our situation, it's not worth the effort. Especially
  as near-identical jobs would be 'pipelined' to certain machines anyhow.
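
To make the Needs_NFS_XYZZY / Has_seen_NFS_XYZZY idea a bit more concrete, here is a
small Python model of the scoring logic. It is only a sketch and not HTCondor
configuration: the function names, the one-day horizon and the linear decay are all
assumptions made up for illustration. In a real pool the same logic would have to be
written as a ClassAd expression.

```python
from typing import Dict, Optional


def cache_rank(job_needs: bool, seconds_since_seen: Optional[float],
               horizon: float = 86400.0) -> float:
    """Return a preference score in [0, 1]; higher means a warmer cache.

    job_needs          -- models the job attribute Needs_NFS_XYZZY
    seconds_since_seen -- models the machine attribute Has_seen_NFS_XYZZY
                          (None if the machine never ran such a job)
    horizon            -- hypothetical age after which the cache is
                          assumed to be cold again
    """
    if not job_needs or seconds_since_seen is None:
        return 0.0
    # Linear decay: 1.0 for "just seen", 0.0 at or beyond the horizon.
    return max(0.0, 1.0 - seconds_since_seen / horizon)


def pick_machine(job_needs: bool,
                 machines: Dict[str, Optional[float]]) -> str:
    """Pick the machine name with the highest score (first one wins ties)."""
    return max(machines, key=lambda name: cache_rank(job_needs, machines[name]))
```

Note that the score only reflects "a similar job ran here recently". As said above, it
cannot tell whether the cache was really primed or has been emptied since.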
But if you only deal with a very well known set of jobs, it might be different...
You could even make a dummy job just to prime the cache, if nothing else can interfere.
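
Such a dummy job could be as simple as reading every file of the data set once. A
minimal Python sketch, assuming the data set lives under a single directory on the
NFS mount and that a plain read through the mount is enough to fill the cache. The
function name and the 1 MiB chunk size are arbitrary choices for the example:

```python
import os


def prime_cache(root: str, chunk_size: int = 1 << 20) -> int:
    """Read every regular file under root; return the total bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read in chunks so large files do not end up in memory.
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Skip files that cannot be opened or read.
                continue
    return total
```

The job would call prime_cache() on the directory of the data set and exit. Whether the
data then stays cached long enough for the real jobs still depends on the cache's own
policy.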
Cheers, B.