HTCondor Project List Archives




Re: [Condor-devel] Issues with output files and checkpointing




Peter Keller wrote on 2/17/06 5:02 pm:

On Tue, Jan 24, 2006 at 01:11:48PM -0600, Daniel Forrest wrote:
Solution: If the checkpoint can not be read, reset the LastCkptServer
attribute in the JobAd so we will not try to read this particular
checkpoint again.  A patch for this follows:
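
As a rough illustration only (this is not the patch itself, and not Condor's actual code), the logic described above might be sketched in Python as follows. The job ad is modeled as a plain dict, and `read_checkpoint`, `CheckpointUnreadable`, and `restore_or_reset` are hypothetical names invented for this sketch; only the `LastCkptServer` attribute comes from the message.

```python
from typing import Callable, Optional


class CheckpointUnreadable(Exception):
    """Hypothetical error: raised when a checkpoint cannot be read."""


def restore_or_reset(
    job_ad: dict,
    read_checkpoint: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try to read the job's last checkpoint.

    job_ad is a stand-in for the JobAd (assumption: a plain dict).
    read_checkpoint is a hypothetical callable that takes a server name
    and returns checkpoint bytes or raises CheckpointUnreadable.
    """
    server = job_ad.get("LastCkptServer")
    if server is None:
        # No checkpoint server recorded, so there is nothing to restore.
        return None
    try:
        return read_checkpoint(server)
    except CheckpointUnreadable:
        # Reset the attribute so this checkpoint is not tried again.
        job_ad.pop("LastCkptServer", None)
        return None
```

In this sketch, a failed read leaves the ad without `LastCkptServer`, so a later call returns None without contacting any server.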

I'm integrating this patch into Condor for the next release of the
developer series. I just have to check it in, which I'll do on Monday.


What happens if the checkpoint server is down for an hour? Will jobs restart from the beginning?

Regards,
Todd