Hi Todd,

many thanks for the link! Extending the history/logs is probably the most
reasonable way; three days of reaction time may be too short for all parties
involved ;)

Cheers and thanks,
Thomas

PS: Normally we run grid jobs, i.e., pilots, so re-runs should be no problem.
However, in this case there was a problem upstream causing a bit of confusion.

On 2016-04-05 22:14, Todd Tannenbaum wrote:
> Hi Thomas,
>
> From reading the above, is your desire that your job never gets re-run
> by HTCondor, even in the event of failures? If so, see
> https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToAvoidJobRestarts
> This wiki page also lists all the typical reasons why HTCondor will
> automatically restart a job. Be aware that by default HTCondor alone
> will not restart a job that exits successfully, even if it exits with a
> non-zero exit code.
>
> As for whether the rerun job will have the same job id: yes, it will,
> unless you are using DAGMan -- failed nodes in DAGMan are resubmitted
> and thus will have a new job id.
>
> As for where you can look since your history file rotated: did the job
> specify a job event log via "log = /some/file" in the submit file? If
> so, you could look there. You could also grep the schedd log for the
> job id, but I am guessing the SchedLog has already rotated. Finally, if
> you define "EVENT_LOG = /some/file" in the condor_config on your submit
> node, you could look there.
>
> But you likely want to increase the size specified via the config knob
> MAX_HISTORY_LOG. :)
>
> Hope the above helps,
> Todd
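The recipe on that HowToAvoidJobRestarts page boils down to a couple of
submit-file expressions. A minimal sketch from memory, not copied from the
wiki, so check the page itself for the authoritative version:

    # never match the job to a machine a second time
    requirements    = (NumJobStarts == 0)
    # if the job has already started once and falls back to idle (for
    # example after a shadow/starter failure), remove it from the queue
    # instead of letting HTCondor run it again (JobStatus == 1 is Idle)
    periodic_remove = (JobStatus == 1) && (NumJobStarts > 0)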
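And for completeness, the logging knobs Todd mentions, with placeholder
paths; MAX_HISTORY_LOG is a size in bytes, and the 100 MB value below is
only an illustrative choice, not a recommendation:

    # per-job event log, set in the submit file:
    log = /some/file

    # global event log and a larger history file, set in condor_config
    # on the submit node:
    EVENT_LOG       = /var/log/condor/EventLog
    MAX_HISTORY_LOG = 104857600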