[HTCondor-users] Image processing with HTCondor

Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

I’m pretty new to Condor and I’m trying to understand the best approach for our application. We have a need to process thousands of images through a series of algorithms to do things like feature extraction. These algorithms can be and have been represented in a DAG like this:

# DVF.DAG
JOB ProcessingArea PA.condor
JOB EDMS0 EDMS0.condor
JOB QVT QVT.condor
JOB MMlnD MMLND.condor
JOB PCFF PCFF.condor
JOB EDMS1 EDMS1.condor
JOB DLP DLP.condor

PARENT ProcessingArea CHILD EDMS0 EDMS1 QVT MMlnD PCFF
PARENT QVT MMlnD PCFF CHILD DLP

DOT dvf.dot

See attached for the diagram. I’ve been doing a lot of reading lately to figure out the best (or a good enough) approach for our application. One change I’ll be making is to use the VARS syntax so that the DAG needs only a single submit file, since every algorithm is implemented in the same executable and only one or two command-line arguments vary between algorithms.
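A minimal sketch of that single-submit-file idea, written as a small Python generator so the DAG text can be produced programmatically. The node names and dependencies come from the DAG above; the shared submit file name `algo.condor` and the macro name `algorithm` are hypothetical and not taken from the real setup.

```python
# Node names are taken from the DAG in this post.
NODES = ["ProcessingArea", "EDMS0", "QVT", "MMlnD", "PCFF", "EDMS1", "DLP"]

# Hypothetical name for the one submit file shared by every node.
SUBMIT_FILE = "algo.condor"


def render_dag(nodes=NODES, submit_file=SUBMIT_FILE):
    """Return DAG text in which every node uses the same submit file.

    Each node gets a VARS line that defines a macro (hypothetically named
    "algorithm") which the submit file can reference as $(algorithm).
    """
    lines = []
    for node in nodes:
        lines.append(f"JOB {node} {submit_file}")
        lines.append(f'VARS {node} algorithm="{node}"')
    # Dependencies are unchanged from the original DAG.
    lines.append("PARENT ProcessingArea CHILD EDMS0 EDMS1 QVT MMlnD PCFF")
    lines.append("PARENT QVT MMlnD PCFF CHILD DLP")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dag(), end="")
```

The shared submit file would then pick up the per-node value through `$(algorithm)`, for example in its arguments line. Which arguments actually vary is not stated here, so the macro is only a placeholder.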

We need to run these seven algorithms over each image, and the images are all in separate directories, so I’m trying to figure out how others approach this. I thought I’d be able to use something like the flexible queue command to iterate over each image, but my reading through the mailing list archive explained why this isn’t supported with DAGMan. At this point, the only approach I’ve come up with is to write a script that creates a unique DAG file for each image and then either submit each DAG file individually or wrap all of the individual DAGs into a “master” DAG as SUBDAGs.
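The "master" DAG option could be sketched roughly as follows. This assumes each image directory already contains its own generated DAG file named `DVF.DAG`; the node naming scheme (`image0000`, `image0001`, ...) and the output name `master.dag` are hypothetical. Working-directory handling for each sub-DAG is deliberately left out and would need checking against the DAGMan documentation.

```python
from pathlib import Path


def render_master_dag(image_dirs, dag_name="DVF.DAG"):
    """Return master DAG text with one SUBDAG EXTERNAL node per image.

    image_dirs: iterable of directory paths, one per image.
    dag_name:   per-image DAG file assumed to exist in each directory.
    """
    lines = []
    for index, image_dir in enumerate(image_dirs):
        # Hypothetical node naming scheme: image0000, image0001, ...
        lines.append(f"SUBDAG EXTERNAL image{index:04d} {image_dir}/{dag_name}")
    return "\n".join(lines) + "\n"


def write_master_dag(root, master_path="master.dag"):
    """Scan root for image directories and write the master DAG file."""
    image_dirs = sorted(str(p) for p in Path(root).iterdir() if p.is_dir())
    Path(master_path).write_text(render_master_dag(image_dirs))
    return len(image_dirs)


if __name__ == "__main__":
    print(render_master_dag(["images/a", "images/b"]), end="")
```

No PARENT/CHILD lines are emitted because the images are independent of each other; a single submission of the master DAG would then cover every image.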

I guess I’m ultimately asking for pointers: what approaches have others used in situations like this?

-Sean Milligan
