HTCondor 7.8.7 on submit machine (Windows)
I have submitted 5 different DAGs on the same submit machine. Some of these DAGs finished with failures, and a rescue file was generated. I then resubmitted the DAGs that had failures, but no jobs ran. In the DAG log I am told that the job ID in the userlog submit event does not match the ID reported earlier by the submit command:
ERROR: node j806: job ID in userlog submit event (917.0.0) doesn't match ID reported earlier by submit command (1099.0.0)! Aborting DAG; set DAGMAN_ABORT_ON_SCARY_SUBMIT to false if you are *sure* this shouldn't cause an abort.
I could combine all these into a single DAG and throttle the maximum number of jobs, but I did not do this. Is this behavior intended or is it possibly a bug?
Thank you for your help, Mike