Hi Stefano,
The issue is the difference in how much state is restored in rescue versus recovery. In the recovery case the full state of the DAG is restored, which is what allowed your POST script to check for previous executions. For rescue, only the state of successfully
completed nodes, and potentially the number of remaining retries for partially executed nodes, is restored. Otherwise, the run of the DAG is a fresh invocation, which is why the retry count starts at zero. Currently there is no first-class way in
DAGMan to tell whether a node was executed during a previous run of the DAG.
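One possible workaround, sketched here as an assumption and not as a DAGMan feature, is for the POST script to keep its own per-node record on disk, since files in the DAG directory survive a rescue run even though the retry count does not. The helper name `record_attempt`, the `<node>.attempts.json` file name, and the argument order are all hypothetical; the sketch assumes the POST script is passed the node name and `$RETRY` as its two arguments.

```python
import json
import os
import sys
import tempfile
from pathlib import Path


def record_attempt(node_name, retry, state_dir="."):
    """Record one POST-script invocation and report earlier ones.

    Returns (previous_attempts, seen_before). seen_before is True when an
    earlier invocation for this node left a record behind, including one
    made by a DAG run that was later resumed from a rescue DAG.
    """
    # Hypothetical file name; one small JSON file per node.
    state_file = Path(state_dir) / f"{node_name}.attempts.json"

    history = []
    if state_file.exists():
        history = json.loads(state_file.read_text())
    previous_attempts = len(history)

    # Store the retry value DAGMan reported for this invocation.
    history.append({"retry": int(retry)})

    # Write to a temporary file, then rename, so that an interrupted
    # POST script cannot leave a half-written record.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "w") as handle:
        json.dump(history, handle)
    os.replace(tmp_path, state_file)

    return previous_attempts, previous_attempts > 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumed invocation: <script> <node name> <retry value>
    previous, seen = record_attempt(sys.argv[1], sys.argv[2])
    print(f"previous attempts: {previous}, seen before: {seen}")
```

With something like this, the POST script can rely on its own count instead of `$RETRY` when it needs to know whether the node has run before. The record files would need to be removed by hand before a deliberately fresh submission of the same DAG.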
-Cole Bollig
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Stefano Belforte via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Monday, November 3, 2025 10:04 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Stefano Belforte <stefano.belforte@xxxxxxx>
Subject: [HTCondor-users] DAGMAN $RETRY macro appears stuck

Hi experts (Cole),
with ref. to Special DAGMan Macros in
https://htcondor.readthedocs.io/en/latest/automated-workflows/dagman-advance-functionality.html#referencing-macros-within-a-definition
We use the node retry value $RETRY in our POST script so that it can tell if this
Here's the relevant part of a typical DagMan file
JOB Job1 Job.1.submit
Stefano