
Re: [HTCondor-users] Deployment architecture advice



On 2017-04-13 19:10, Ivo Cavalcante wrote:

1. Our software used to prepare the datasets to be processed directly on a shared filesystem (NAS), which
took a long time. So we switched to using the workstations' local disks for generating the
datasets, which greatly reduced the time spent. OTOH, we then had to move this data to a place where
the execution nodes could see it, and decided to use their local disks as well, since the shared NAS could
become a bottleneck again.

At one point we were preparing the search dataset on an SSD to deal with the first bottleneck. If you're copying it to the worker nodes afterwards anyway, there is no reason to prepare it on a network share.

I tried various ways to push it out to the worker nodes once ready, but so far I have failed to come up with a good way to make a node advertise "I have the complete and up-to-date dataset and can run jobs that need it". :(
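
For illustration only, here is a rough sketch of one direction this could take, not something I have running. It assumes the copy step writes a small marker file as its very last action, containing a version string, and that a script like this is run periodically on each node (for example as a startd cron job) so that the "Attr = value" lines it prints end up in the machine ad. The marker name COMPLETE, the path /scratch/dataset, and the attribute names HasSearchDataset and SearchDatasetVersion are all made up for the example.

```python
import sys
from pathlib import Path

# Hypothetical marker file; the copy step is assumed to write it last,
# so its presence implies the dataset directory is complete.
MARKER_NAME = "COMPLETE"


def dataset_attributes(dataset_dir, wanted_version):
    """Return ClassAd-style "Attr = value" lines describing the local dataset."""
    marker = Path(dataset_dir) / MARKER_NAME
    try:
        local_version = marker.read_text().strip()
    except (OSError, ValueError):
        # Missing directory, missing marker, or unreadable content:
        # treat all of these as "no usable dataset here".
        local_version = ""

    # Ready only if a marker exists and it names the version we want.
    ready = bool(local_version) and local_version == wanted_version

    lines = ["HasSearchDataset = " + ("True" if ready else "False")]
    if local_version:
        # Assumes the version string contains no double quotes.
        lines.append('SearchDatasetVersion = "%s"' % local_version)
    return lines


if __name__ == "__main__":
    # Usage: script.py <dataset_dir> <wanted_version> (both hypothetical)
    dataset_dir = sys.argv[1] if len(sys.argv) > 1 else "/scratch/dataset"
    wanted = sys.argv[2] if len(sys.argv) > 2 else ""
    print("\n".join(dataset_attributes(dataset_dir, wanted)))
```

Jobs could then ask for the attribute in their requirements expression. The weak spot is the same as before: something still has to tell every node which version is current, and there is a window between the dataset changing and the next run of the script.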

FWIW
Dimitri