Re: [HTCondor-users] Email Error notication
- Date: Sun, 20 Nov 2016 12:36:23 -0600
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Email Error notication
On 11/15/2016 3:04 PM, Uchenna Ojiaku - NOAA Affiliate wrote:
> In the command reference manual it states this concerning error email
> notification:
> "If defined by /Error/, the owner will only be notified if the job
> terminates abnormally, or if the job is placed on hold because of a
> failure, and not by user request".
>
> I've run multiple jobs; in the job below, the log file returned a
> non-zero value, yet the job completed. *How do I get an error email
> notification when there is an "error" with the job, i.e. a non-zero value?*
>
Hi Uche,
As you discovered, when notification=Error, HTCondor sends email when there was an error launching the job (for instance, if the initial working directory or job executable is missing) or if the job exits with a signal.
If you want HTCondor to do something based upon a normal exit status code, you need to explicitly tell HTCondor which exit code(s) are considered "success" and which codes are considered failure.
In the upcoming HTCondor v8.5.8+ release, things are made more intuitive with the introduction of the "success_exit_code=X" macro in the job submit file. See https://is.gd/vsQvJk
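For illustration, a submit file using the new macro might look like the sketch below. Only the name success_exit_code comes from the release notes linked above; the assumption here is that, combined with notification=Error, any exit code other than the one listed is treated as a failure worth an email. Please check the linked documentation for the exact behavior in your version.

```
# Hypothetical sketch for HTCondor v8.5.8 or later; behavior is assumed, not verified
executable = /bin/bash
notification = Error
# Declare 0 as the only exit code that counts as success (assumed semantics)
success_exit_code = 0
# Under that assumption, this job exits with 1 and should trigger an error email
arguments = "-c 'exit 1'"
queue
```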
In earlier versions of HTCondor, you can still achieve what you want via the power of ClassAds by replacing your "notification=error" line with one other line, although it is a bit non-obvious. Appendix A of the HTCondor Manual lists and describes many of the job ClassAd attributes, including the attribute JobNotification (see https://is.gd/PeDlhv ).
When you put "notification=complete" in your job submit file, condor_submit sets "JobNotification=2" in the job ClassAd, and when you put "notification=error" in the submit file, condor_submit sets "JobNotification=3". All ClassAd attributes can be set to literals (like the integers 2 and 3), or they can be set to expressions that use a variety of functions, including conditionals.
So to achieve what you want, whereby email is sent even if a job runs OK but exits with a non-zero exit code, you can explicitly set JobNotification in your job submit file, as in the following example:
executable = /bin/bash
# Make notification=complete if ExitCode is non-zero, else make it error
+JobNotification = IfThenElse(ExitCode=!=UNDEFINED && ExitCode=!=0, 2, 3)
# So this job will not send email
arguments = "-c 'exit 0'"
queue
# And this job will send email
arguments = "-c 'exit 1'"
queue
Hope the above helps. I realize the above is non-obvious, which is why we made things easier starting in HTCondor v8.5.8. But I hope it is instructive for learning about the flexibility and power that ClassAds give end users and administrators. Details about the ClassAd language are in section 4.1 of the Manual.
regards
Todd