Re: [Condor-devel] Bug in Checkpoint when WantCheckpoint is False
- Date: Fri, 1 Apr 2005 14:10:23 -0600
- From: Peter Keller <psilord@xxxxxxxxxxx>
- Subject: Re: [Condor-devel] Bug in Checkpoint when WantCheckpoint is False
Hello,
I'll take a closer look at this patch's context on Monday and see if
I can put it into the next developer's release.
Thank you.
-Pete
On Fri, Apr 01, 2005 at 01:55:16PM -0600, Daniel Forrest wrote:
> I have found a bug in Checkpoint() that occurs when a checkpoint is
> requested for a job that has 'WantCheckpoint = False'.
>
> When such a job is to vacate, it is sent SIGTSTP (checkpoint and exit).
> The checkpoint will fail (because CKPT_MODE_ABORT is set), but the
> job does not exit. It will then be killed 10 minutes (MaxVacateTime)
> later, but this is a waste of time.
>
> The following patch addresses this. Comments?
>
> --- condor_ckpt/image.C.SAVE Fri Feb 25 14:41:59 2005
> +++ condor_ckpt/image.C Thu Mar 17 15:44:49 2005
> @@ -1668,6 +1668,11 @@
>      if (mode&CKPT_MODE_ABORT) {
>          dprintf(D_ALWAYS,
>              "Checkpoint aborted by shadow request.\n");
> +        if (check_sig == SIGTSTP) {
> +            dprintf( D_ALWAYS, "Ckpt abort\n" );
> +            SetSyscalls( SYS_LOCAL | SYS_UNMAPPED );
> +            Suicide();
> +        }
>
>          // We can't just return here. We need to cleanup
>          // anything we've done above first.
>
> --
> Daniel K. Forrest Laboratory for Molecular and
> forrest@xxxxxxxxxxxxx Computational Genomics
> (608) 262 - 9479 University of Wisconsin, Madison
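
The decision the patch adds can be sketched in isolation. This is a minimal, self-contained C++ illustration and not the actual Condor code: the enum CkptDecision, the function decide_on_checkpoint, and the numeric value given to CKPT_MODE_ABORT are assumptions made for this sketch. Only the names CKPT_MODE_ABORT, check_sig, and SIGTSTP come from the patch itself.

```cpp
#include <signal.h>  // SIGTSTP (POSIX)

// Assumed bit value, for illustration only; the real definition
// lives in the Condor sources and may differ.
constexpr int CKPT_MODE_ABORT = 1 << 0;

// Hypothetical result type describing what the checkpoint code
// should do next.
enum class CkptDecision {
    Proceed,        // no abort requested: take the checkpoint
    AbortAndResume, // abort, clean up, and let the job keep running
    AbortAndExit    // abort and exit at once (the case the patch adds)
};

// Hypothetical helper mirroring the patched branch: when the shadow
// has asked for the checkpoint to be aborted and the triggering
// signal was SIGTSTP (checkpoint and exit), the job should exit now
// rather than keep running until it is killed.
CkptDecision decide_on_checkpoint(int mode, int check_sig)
{
    if (mode & CKPT_MODE_ABORT) {
        if (check_sig == SIGTSTP) {
            return CkptDecision::AbortAndExit;
        }
        return CkptDecision::AbortAndResume;
    }
    return CkptDecision::Proceed;
}
```

In the patch, the AbortAndExit case corresponds to logging and then calling Suicide(), while the other abort case falls through to the existing cleanup path.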