HTCondor Project List Archives


Re: [Condor-devel] [condor-staff] questions about local universe job exit semantics

Date: Mon, 8 Jan 2007 10:49:21 -0600
From: Jaime Frey <jfrey@xxxxxxxxxxx>
Subject: Re: [Condor-devel] [condor-staff] questions about local universe job exit semantics

On Dec 26, 2006, at 7:15 PM, Derek Wright wrote:

currently, they happen in the above order. so, if you were unlucky, and things crashed, you could potentially see the job exited event in the userlog, but the job was left marked running, so the job could run again. this is the desired behavior, since we say 2 exited events is better than none. you'd also see the job's output classad written twice (which would probably break something, i don't know if anything/anyone can handle this case). however, if you were unlucky and crashed between (c) and (d), you could have the job exit without any email notification at all.


Sorry for not replying sooner.

Two exit events are better than none, but an exit event followed by re-execution of the job is bad. I'm surprised Peter hasn't jumped all over this. Given his current opinion of the reliability of the user log, he may have just given up in disgust. :-)

Once an exit event appears in the user log, the job can't re-execute. Otherwise, the user log is worthless as anything other than a historical archive. The user can't tell from the user log when the output of the job is available for use.
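To illustrate the ordering this rule implies, here is a minimal Python sketch. It is not HTCondor code: `JobRecord`, `finish_job`, `may_execute`, the JSON state file, and the `crash_after` test hook are all hypothetical names invented for this example. The assumption is that the job's terminal state is committed durably before the exit event is written, so a crash can at worst duplicate the exit event but can never leave a logged exit next to a job still marked running.

```python
import json
import os


def durable_write(path, text, mode="a"):
    """Write text and force it to disk before returning."""
    with open(path, mode) as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())


class JobRecord:
    """Hypothetical stand-in for a persistent job queue entry."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # A job with no saved state yet is treated as still running.
        if not os.path.exists(self.path):
            return {"status": "running", "exit_logged": False}
        with open(self.path) as f:
            return json.load(f)

    def save(self, state):
        # Write to a temp file, then rename, so readers never see a
        # half-written record.
        tmp = self.path + ".tmp"
        durable_write(tmp, json.dumps(state), mode="w")
        os.replace(tmp, self.path)


def finish_job(record, log_path, crash_after=None):
    """Record a job exit; safe to call again after a crash.

    crash_after is a hypothetical hook that simulates a crash by
    returning early after the named step ("commit" or "event").
    """
    state = record.load()

    # Step 1: commit the terminal state first, so the job can no longer
    # be re-executed once anything reaches the user log.
    if state["status"] != "completed":
        state["status"] = "completed"
        record.save(state)
    if crash_after == "commit":
        return

    # Step 2: write the exit event. A crash between the write and the
    # flag update means recovery writes the event a second time.
    if not state["exit_logged"]:
        durable_write(log_path, "job exited\n")
        if crash_after == "event":
            return
        state["exit_logged"] = True
        record.save(state)


def may_execute(record):
    """A job may run only while it is still marked running."""
    return record.load()["status"] == "running"
```

Calling `finish_job` again on restart acts as the recovery path: a crash after step 1 yields one exit event once recovery runs, and a crash inside step 2 yields two. In neither case does `may_execute` return true after an exit event has been written.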


+--------------------------------+-----------------------------------+
|           Jaime Frey           | I used to be a heavy gambler.     |
|       jfrey@xxxxxxxxxxx        | But now I just make mental bets.  |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
+--------------------------------+-----------------------------------+

Follow-Ups:
- Re: [Condor-devel] [condor-staff] questions about local universe job exit semantics
  - From: Peter F. Couvares
