HTCondor Project List Archives




[Condor-devel] starter wisdom: what happens when a job exits?



As per previous threads on this list, I've spent a lot of time in the past few months adding/changing a ton of code in the startd and starter so that there's now a system of hooks in place at various stages of a job's life-cycle on the execution machine. One of the things I had to do was come to grips with what the starter actually does when a job exits (which was much more challenging than it sounds). ;)

As a public service to the Condor development community (and to help myself remain sane while I had to change some of it), I wrote up a fairly detailed but high-level explanation of what's going on in the starter code whenever a job exits. The results live in the "WISDOM" file in the src/condor_starter.V6.1 directory (in HEAD). The write-up explains how things work in 7.1.* and beyond, with the hooks in place -- I didn't bother to write up the old workflow with the older (if you can believe it, more insane) names, etc.

If you're interested in this topic and have any questions, please ask me sooner rather than later so I can edit the document while a) I still have time, and b) it's still mostly swapped into my working set.

Cheers,
-Derek