Mailing List Archives
Re: [Condor-users] My problems
- Date: Fri, 18 Feb 2005 00:20:52 -0600
- From: Erik Paulson <epaulson@xxxxxxxxxxx>
- Subject: Re: [Condor-users] My problems
On Tue, Feb 15, 2005 at 02:59:17AM -0800, Yenke Blaise wrote:
> I'm using Condor and I'm facing some difficulties
> that can be summarized as follows:
> 1) After how many seconds is a checkpoint initiated
> while running an application in Condor?
It's configurable, and defined by the PERIODIC_CHECKPOINT expression.
Note that it's set on the EXECUTE machine, not the submit machine,
so some machines can be set to periodically checkpoint the jobs
running on them more or less frequently than others.
> 2) Does Condor let a user checkpoint his running
> program at a chosen moment? In other words, what can
> a user do to checkpoint his running program?
Send yourself a SIGUSR2, or call ckpt(). See
http://www.cs.wisc.edu/condor/manual/v6.6/4_2Condor_s_Checkpoint.html
> 3) I'd like to know what kind of information is kept in
> the .ckpt file.
>
The memory image of the process, its signal state, and a list of open
file descriptors along with the state of each descriptor. See the
Dr. Dobb's article from '97, or the Litzkow/Solomon paper:
http://www.cs.wisc.edu/condor/publications.html#checkpoint