
Re: [HTCondor-users] Queries regarding reset retries in rescue dag



Hi Stefano,

Running in recovery mode would do that. Also, depending on DAGMan's debug level, set via DAGMAN_VERBOSITY, the configuration options may not be printed to the debug log. To see them, you need DAGMAN_VERBOSITY >= 2. Separately, I have been wanting to make DAGMan's debug levels similar to HTCondor's own (i.e. have D_CONFIG, D_PLACEMENT, D_SCRIPT, etc.). Perhaps this would be a good excuse to implement that.

-Cole Bollig

From: Stefano Belforte <stefano.belforte@xxxxxxx>
Sent: Wednesday, October 22, 2025 11:09 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: stefano.belforte@xxxxxxx <stefano.belforte@xxxxxxx>; Cole Bollig <cabollig@xxxxxxxx>
Subject: Re: [HTCondor-users] Queries regarding reset retries in rescue dag
 
Thanks a lot, Cole.

Yeah. I work with Vijay on this, as you may have suspected.

We still haven't been able to get firm evidence that the DAGMan config
file was read, but after we removed `-DoRecover` from the
`condor_dagman` arguments, the retry count appears to be reset and
DAGMan does what we expect it to do.

It looks like at some point in the distant past the CRAB developers
decided to switch DAGMan from Rescue to Recovery mode:
https://github.com/dmwm/CRABServer/commit/c812d1c1a7c5fc1e5d7a5ef9f27c247fde2c7a4f#diff-cc7fafd6621a3816cc74145abaa7220e550bf8933933ab306af23467af7119c4

We are now trying to switch to Rescue mode instead since, as discussed,
we want to remove the code which hacks DAGMan logs and status files.

I think we need to go a bit further down this road before we understand
how to use it. Then we can maybe have a discussion about whether our
strategy makes sense for our goal. IIUC DAGMan will still use recovery
mode in case of incidents like schedd restarts, machine reboots, etc.
That's fine.

Stefano