
[HTCondor-users] DAG issues with 7.8 (or earlier) -> 8.0 upgrade



We have just found out that there is a problem with doing an HTCondor
7.8 to 8.0.0 upgrade *if you have running DAGs*.  Unfortunately, because
of the default node log file feature added in 7.9.0, an 8.0.0 DAGMan
does not properly recover state from a 7.8 DAG.  This causes the DAGs
to re-start from the beginning after the upgrade, rather than continuing
from where they were.

If you have running DAGs and are going to upgrade from 7.8 (or earlier)
to 8.0.0, we recommend that you condor_rm all running DAGs before the
upgrade, and confirm that the appropriate rescue DAGs have been written
before actually shutting down or upgrading HTCondor.  Once the upgrade is
finished, re-submit the DAGs, which should then read the rescue DAG files
and resume from where they were.
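
As a rough aid for the "be sure rescue DAGs have been written" step, the
sketch below looks for a rescue file next to a given DAG file.  It assumes
the usual DAGMan naming of <dagfile>.rescueNNN (for example
diamond.dag.rescue001); please check that against your own installation.
The function name latest_rescue_dag is made up for this example and is not
part of HTCondor.

```python
import re
from pathlib import Path


def latest_rescue_dag(dag_file):
    """Return the highest-numbered rescue file for dag_file, or None.

    Assumes rescue files sit in the same directory as the DAG file and
    are named <dagfile>.rescueNNN (an assumption to verify locally).
    """
    dag_path = Path(dag_file)
    prefix = dag_path.name + ".rescue"
    best, best_num = None, -1
    for candidate in dag_path.parent.iterdir():
        name = candidate.name
        if not name.startswith(prefix):
            continue
        suffix = name[len(prefix):]
        # Only accept a purely numeric suffix such as "001".
        if re.fullmatch(r"\d+", suffix) and int(suffix) > best_num:
            best, best_num = candidate, int(suffix)
    return best


if __name__ == "__main__":
    import sys

    # Usage: python check_rescue.py first.dag second.dag ...
    for dag in sys.argv[1:]:
        found = latest_rescue_dag(dag)
        print(dag, "->", found if found else "NO RESCUE DAG FOUND")
```

A DAG that reports no rescue file after the condor_rm is one to look at
before shutting HTCondor down.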

We anticipate having a fix for this in 8.0.1, so if the "condor_rm all
DAGs" scenario is a big problem, you may want to wait for 8.0.1 before
upgrading.

If you are upgrading from 7.9.1 or later to 8.0.0, you shouldn't have a
problem.
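
For sites that script their upgrades, the version ranges in this message
can be written down as a small check.  This is only a sketch of the rule
as stated here for an upgrade to 8.0.0: 7.8 or earlier is affected, 7.9.1
or later is not, and 7.9.0 is not covered by this message, so it is
reported as "unknown".  The function names are invented for the example.

```python
def parse_version(text):
    """Turn a version string such as '7.8' or '7.9.1' into a 3-tuple."""
    parts = [int(p) for p in text.strip().split(".")]
    # Pad short versions so that '7.8' compares as (7, 8, 0).
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def upgrade_risk(current_version):
    """Classify an upgrade from current_version to 8.0.0 with running DAGs."""
    v = parse_version(current_version)
    if v < (7, 9, 0):
        return "affected"   # 7.8 or earlier: remove DAGs, rely on rescue DAGs
    if v >= (7, 9, 1):
        return "ok"         # 7.9.1 or later: no problem expected
    return "unknown"        # 7.9.0 is not addressed in this message


if __name__ == "__main__":
    for version in ("7.6.10", "7.8", "7.9.0", "7.9.1", "7.9.6"):
        print(version, upgrade_risk(version))
```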

Our apologies for not catching this in advance...

Kent Wenger
CHTC Team