Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] Deployment architecture advice

Date: Fri, 14 Apr 2017 10:45:38 -0500
From: Dimitri Maziuk <dmaziuk@xxxxxxxxxxxxx>
Subject: Re: [HTCondor-users] Deployment architecture advice

On 2017-04-13 19:10, Ivo Cavalcante wrote:

1. Our software used to prepare the datasets to be processed directly on the shared filesystem (NAS), which
took a long time. So we switched to using the workstations' local disks for generating the
datasets, which greatly reduced the time spent. OTOH, we then had to move this data to a place where
the execution nodes could see it, and decided to use their local disks as well, since the shared NAS could
become a bottleneck again.

At one point we were preparing the search dataset on an SSD to deal with the first bottleneck. If you're copying it to worker nodes afterwards, there is no reason to prepare it on a network share.

I tried various ways to push it out to worker nodes once ready, and so far I have failed to come up with a good way to make a node advertise "I have the complete and up-to-date dataset and can run jobs that need it". :(


FWIW
Dimitri
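
For readers wondering what "a node advertising that it has the dataset" could look like in practice, here is a minimal sketch, not something proposed in the message above. It assumes the script is run periodically on each worker as a startd cron job (the STARTD_CRON_* configuration knobs), which merges "Attr = value" lines printed on stdout into the machine ClassAd. The marker file name COMPLETE, the attribute names HasSearchDataset and SearchDatasetVersion, and the convention that whatever pushes the dataset writes the marker last are all hypothetical.

```python
import os
import sys


def dataset_classad(dataset_dir):
    """Return ClassAd attribute lines describing the local dataset copy.

    Assumption: the process that pushes the dataset writes a file named
    COMPLETE (hypothetical) containing a version string as its very last
    step, so a half-finished copy never has a marker.
    """
    marker = os.path.join(dataset_dir, "COMPLETE")
    try:
        with open(marker) as fh:
            version = fh.read().strip()
    except OSError:
        version = ""

    if not version:
        # No marker, or an empty one: do not claim to have the dataset.
        return ["HasSearchDataset = False"]

    # Attribute names are illustrative, not standard HTCondor attributes.
    return [
        "HasSearchDataset = True",
        'SearchDatasetVersion = "%s"' % version,
    ]


if __name__ == "__main__":
    # Usage (hypothetical): dataset_ad.py /path/to/local/dataset
    if len(sys.argv) == 2:
        print("\n".join(dataset_classad(sys.argv[1])))
```

Under these assumptions, a job that needs a particular dataset version could match on the advertised attribute in its submit file, for example with a requirements expression comparing SearchDatasetVersion to the wanted version string. The advertised state is only as fresh as the cron period, so a node can keep claiming an old version for a short while after a new one is published; that gap may be part of why this is hard to get right.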

References:
- [HTCondor-users] Deployment architecture advice
  - From: Ivo Cavalcante

