
Re: [HTCondor-users] problems with Dagman status files



Thanks Cole, clear as usual!

Indeed, I had recently removed ALWAYS-UPDATE to minimize I/O on our APs.
I clearly misunderstood what "had not changed" could mean, and I will add it back.

Stefano

On 29/05/2025 16:28, Cole Bollig wrote:

  1. When DAGMan goes to write the final node status on the way out the door, it checks whether its state has changed before actually writing the node status file. In this case it appears that DAGMan determined that its state had not changed since the prior write of the DAG node status file. If you want DAGMan to rewrite the node status file at exit regardless of state change, you can add the ALWAYS-UPDATE keyword to the NODE_STATUS_FILE declaration in your DAG file.
  2. Points 2 & 3 are due to the fact that DagStatus means different things in the node status file and the metrics file. This discrepancy is something that has sadly existed for a long time, and I don't know the best way to address it.
    1. For the node status file, the DagStatus is equivalent to the value list you provided from our documentation.
    2. For the metrics file, the DagStatus is set to the value of the DAGMan job's ClassAd attribute DAG_Status.
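
For reference, a minimal sketch of what the ALWAYS-UPDATE declaration from point 1 might look like in a DAG file. The file name, node names, submit files and the 30-second minimum update interval are made up for illustration; only the NODE_STATUS_FILE command and the ALWAYS-UPDATE keyword come from the discussion above.

```
# my.dag (hypothetical file and node names)
JOB A a.sub
JOB B b.sub
PARENT A CHILD B

# Write the node status file to my.dag.status, no more often than every
# 30 seconds, and overwrite it even when no node has changed status
# since the last write.
NODE_STATUS_FILE my.dag.status 30 ALWAYS-UPDATE
```

Note that the keyword makes DAGMan rewrite the file on every update cycle, not only at exit, so a larger minimum update interval is one way to keep the extra I/O on the AP small.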