On Thu, 24 Jun 2010, Ian Stokes-Rees wrote:
Can we help ourselves by having our 100k-node DAG divided into, say, 100 sub-DAGs of 1000 nodes each, and have the dependencies set up so that 10-20 of these are active at any time? TIA for advice on DAG construction and condor_dagman management.
Yes, if you divide your overall workflow that way, you will reduce your peak memory usage. (Each sub-DAG is a separate instance of condor_dagman, and that condor_dagman instance isn't created until the workflow reaches the corresponding node in the top-level DAG.)
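As a rough sketch of such a layout (the file and node names below are illustrative, not from the original post), the top-level DAG just lists the sub-DAGs, and each sub-DAG file is an ordinary DAG of ~1000 JOB nodes:

  # top.dag -- each SUBDAG EXTERNAL node starts its own
  # condor_dagman instance when that node becomes ready
  subdag external part001 part001.dag
  subdag external part002 part002.dag
  # ... one line per sub-DAG, 100 in all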
You don't actually need to add dependencies to your DAG to limit concurrency -- you can use maxjobs or, probably better, category throttles.
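For the plain maxjobs route, one minimal sketch (top.dag is just a placeholder name here) is to pass the limit on the command line when submitting the top-level DAG:

  condor_submit_dag -maxjobs 10 top.dag

That caps how many node jobs -- in this case, SUBDAG nodes -- that DAGMan will have in the queue at once.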
For example, with category throttles:

  subdag external d1 foo.dag
  category d1 nested
  subdag external d2 bar.dag
  category d2 nested
  ...
  maxjobs nested 10

Let us know if you have further questions.

Kent Wenger
Condor Team