On Feb 11, 2008, at 11:09 AM, P. A. Cheeseman wrote:
I'm concerned with how to determine whether or not a checkpoint message, whether it be the one prefixed with the 003 code or one which is embedded in eviction information, indicates conclusively whether or not a job remains executing. My first, I believe erroneous, impression was that a job ceased execution upon checkpoint but I later realized that periodic checkpoints or application-initiated checkpoints would leave the job in execution. It's occurred to me that the 003 checkpoint message may always leave a job executing while the checkpoint message embedded in an eviction may indicate that a job has left execution. Is it that simple?
The job evicted log event itself means that the job has stopped executing, but didn't complete. The checkpoint message in the evicted event states whether the job was checkpointed immediately before it was killed. The job checkpointed event means that a periodic checkpoint occurred, and the job continued to run.
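The rule above can be sketched as a small log replay in Python. This is only an illustration under stated assumptions: the thread names just the 003 code, so the 001 (executing), 004 (evicted), and 005 (terminated) codes, the "(1)"/"(0)" checkpoint detail line, the sample log text, and the `job_state` helper are all assumed for the example and should be checked against your own user log.

```python
import re

# Assumed event codes; only 003 is named in the thread itself.
EXECUTE = "001"
CHECKPOINTED = "003"
EVICTED = "004"
TERMINATED = "005"

# An event header is assumed to start with a three-digit code and "(".
EVENT_HEADER = re.compile(r"^(\d{3}) \(")


def job_state(log_text):
    """Replay a user log; return (running, checkpointed_at_eviction).

    checkpointed_at_eviction is None unless the last state change
    was an eviction whose detail line could be read.
    """
    running = False
    ckpt_at_evict = None
    last_code = None
    for line in log_text.splitlines():
        match = EVENT_HEADER.match(line)
        if match:
            last_code = match.group(1)
            if last_code == EXECUTE:
                running = True
                ckpt_at_evict = None
            elif last_code in (EVICTED, TERMINATED):
                running = False
                ckpt_at_evict = None
            # CHECKPOINTED (003): periodic checkpoint, the job keeps
            # running, so the state is left unchanged.
        elif last_code == EVICTED and ckpt_at_evict is None:
            # The eviction body says whether a checkpoint was taken
            # just before the job was killed (assumed wording).
            text = line.strip()
            if "checkpointed" in text:
                if text.startswith("(1)"):
                    ckpt_at_evict = True
                elif text.startswith("(0)"):
                    ckpt_at_evict = False
    return running, ckpt_at_evict


# Illustrative log text only, not output captured from a real job.
SAMPLE = """\
001 (100.000.000) 02/11 10:00:00 Job executing on host: <hypothetical>
...
003 (100.000.000) 02/11 10:30:00 Job was checkpointed.
...
004 (100.000.000) 02/11 11:00:00 Job was evicted.
\t(0) Job was not checkpointed.
...
"""

if __name__ == "__main__":
    print(job_state(SAMPLE))
```

The point of the sketch is that only the evicted (or terminated) event flips the job out of the running state. The checkpoint line inside an eviction is read as extra information about that eviction, not as a state change of its own.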
+--------------------------------+-----------------------------------+
| Jaime Frey                     | I used to be a heavy gambler.     |
| jfrey@xxxxxxxxxxx              | But now I just make mental bets.  |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
+--------------------------------+-----------------------------------+