I have to run a simulation about a thousand times with
different seeds. The simulation executable and data total about 100MB. This
sounds like a job for DAGMan & Stork, because this 100MB collection of
files needs to get copied around reliably, and some large output files need to
be transferred back to the originating machine just as reliably. My question: do Stork and/or DAGMan do anything intelligent about avoiding recopying files? The input files are identical for all thousand
runs; only the seed varies. But I would like to have Condor manage each run
individually. So do all the data and the executable get copied around a
thousand times and get cleaned up after each run? If the thousand runs are children of
the Stork job that transfers the files into place, does everything just work, with no
extraneous recopying of input data?

Thanks,
Thomas Rowe
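P.S. For concreteness, here is roughly the DAG structure I have in mind. The file
names, node names, and seed argument are only illustrative; in practice the .dag
file would be generated by a small script:

  # mydag.dag -- one Stork stage-in node, a thousand run nodes
  DATA   stage_in  stage_in.stork        # Stork job that places the 100MB of input
  JOB    run0001   sim.submit
  VARS   run0001   seed="1"
  JOB    run0002   sim.submit
  VARS   run0002   seed="2"
  # ... one JOB/VARS pair for each of the remaining runs ...
  PARENT stage_in CHILD run0001 run0002  # ... with all thousand run nodes listed

  # sim.submit -- only the seed-related lines shown; every run reads the
  # pre-staged input, and only the seed differs between runs
  executable = simulation
  arguments  = --seed $(seed)
  queue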