Re: [Condor-users] First go at standard Universe checkpointing
- Date: Mon, 10 Jan 2011 10:48:02 -0600
- From: Daniel Forrest <dan.forrest@xxxxxxxxxxxxx>
- Subject: Re: [Condor-users] First go at standard Universe checkpointing
On Mon, Jan 10, 2011 at 03:58:02PM +0000, Ian Cottam wrote:
> I'm trying to get my first Standard Universe job working, because of
> checkpointing (Condor 7.4.4 RedHat build).
>
> I've replicated the problems I'm having by means of a tiny C program.
> Said problems being...
> - the checkpoint file ends in .tmp and seems far too small; and
> - when I restart the test case it immediately segfaults.
>
> Any ideas?
>
> Test code looks like this----
> #include <stdio.h>
> #include <math.h>
> int main(void)
> {
> int i; double x;
> FILE *f= fopen("r-out.txt", "w");
> fputs("hello Condor Standard Universe - starting\n", f);
> for (i= -500000000; i != 500000000; ++i) {
> x= sqrt(i<0?-i:i); /* kill some time */
> }
> fputs("finished OK\n", f);
> return 0;
> }
>
>
> ---
>
> Compiled with---
> condor_compile gcc cctest.c -o cctest -lm
> ---
> I'm testing by just doing a standalone
> ./cctest
>
> Followed by a control-Z to make it checkpoint and quit.
This is due to address space layout randomization and the virtual
memory layout (placement of the VDSO). Run it with randomization
disabled (-R) and the legacy address layout (-L):
setarch i386 -R -L ./cctest
And restart it like this:
setarch i386 -R -L ./cctest -_condor_restart cctest.ckpt
This assumes you are running 32-bit, but I would bet you are.
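
The two commands above differ only in the restart arguments, so they can be sketched as one small wrapper. This is a minimal sketch, not part of Condor or util-linux: the helper names `aslr_mode` and `ckpt_cmd` are hypothetical, the i386 architecture is the same assumption made above, and `ckpt_cmd` only prints the command line so it can be checked before being run.

```shell
#!/bin/sh
# Hypothetical helpers; neither name comes from Condor or util-linux.

# Describe a value read from /proc/sys/kernel/randomize_va_space.
aslr_mode() {
    case "$1" in
        0) echo "off" ;;
        1) echo "partial" ;;
        2) echo "full" ;;
        *) echo "unknown" ;;
    esac
}

# Print (do not run) the setarch command line for a first run or a restart.
#   $1  checkpoint-enabled binary, e.g. ./cctest
#   $2  optional checkpoint file to restart from
ckpt_cmd() {
    prog=$1
    ckpt=${2:-}
    # -R turns off address randomization, -L selects the legacy layout.
    # i386 is assumed, as in the commands above.
    cmd="setarch i386 -R -L $prog"
    if [ -n "$ckpt" ]; then
        cmd="$cmd -_condor_restart $ckpt"
    fi
    printf '%s\n' "$cmd"
}

# Report the system-wide setting when the file is readable (Linux only).
f=/proc/sys/kernel/randomize_va_space
if [ -r "$f" ]; then
    echo "ASLR: $(aslr_mode "$(cat "$f")")"
fi

ckpt_cmd ./cctest               # first run
ckpt_cmd ./cctest cctest.ckpt   # restart from the checkpoint
```

Printing the command instead of running it keeps the sketch harmless on machines where setarch is missing or not permitted to change the personality; piping the output to sh would execute it.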
--
Dan