We are making excellent progress!
Having to monitor just one file is good news.
My current understanding is that if, at time T, you observe that the file was last updated at time U, you would like to check the file's last update time again at time C = U + 1 hour. If T > C, you want to terminate the job.
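
To make sure we mean the same rule, here is a rough sketch of it in plain Python. It is only an illustration of the T / U / C logic above, not anything HTCondor provides: the names next_action, check_file and STALL_LIMIT are made up for this example, and the one-hour limit is just the example value we have been using.

```python
import os
import time

# Assumption: one hour without an update means the job is hung.
STALL_LIMIT = 3600  # seconds


def next_action(now, last_update, limit=STALL_LIMIT):
    """Apply the rule: C = U + limit; terminate if T > C, else recheck at C.

    now         -- T, the time of this observation (seconds since epoch)
    last_update -- U, the file's last modification time
    Returns ("terminate", None) or ("recheck", C).
    """
    check_at = last_update + limit  # C = U + 1 hour
    if now > check_at:
        # More than `limit` seconds have passed since the last update.
        return ("terminate", None)
    # Still within the window: look at the file again at time C.
    return ("recheck", check_at)


def check_file(path, now=None):
    """Hypothetical helper: read U from the file's mtime and apply the rule."""
    if now is None:
        now = time.time()
    return next_action(now, os.path.getmtime(path))
```

With U = 1000, an observation at T = 2000 asks for a recheck at C = 4600, while an observation at T = 5000 says to terminate. Note that the decision depends only on the file's update time, not on how long the job has been running.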
Miron.
From: MIRON LIVNY <miron@xxxxxxxxxxx>
Date: 05/26/2016 01:46 PM
> You do not have an algorithm to decide when a job stopped making progress
> based on its Output behavior after it consumed one hour of CPU time.
>
> What am I missing?
Ah, I see what you're getting at now.
Regardless of how much time the job has spent in the slot, we can decide
that it is hung and needs to be terminated if it has gone at least one
hour (for example) without making any updates to a particular file.
-Michael Pelletier.