Mailing List Archives
[Condor-users] How can I check whether my VMware job is really checkpointed?
- Date: Thu, 13 May 2010 07:13:32 -0700 (PDT)
- From: Rob <spamrefuse@xxxxxxxxx>
- Subject: [Condor-users] How can I check whether my VMware job is really checkpointed?
Hi,
I have successfully submitted VMware jobs without checkpointing.
Now I want to test the checkpoint feature as described in the
manual (no checkpoint server is needed).
The master runs Linux (Fedora) with Condor 7.4.2.
All pool PCs run Windows XP with Condor 7.2 and VMware 1.0.
I have changed the submit file so that it also enables
checkpointing, like this:
Universe = vm
Executable = any_name_you_like
Log = vm.log
vm_type = vmware
vm_networking = false
vm_checkpoint = true
vm_memory = 64
vmware_dir = /home/condor/VM
vm_cdrom_files = input.dat
vm_should_transfer_cdrom_files = YES
vmware_should_transfer_files = YES
Requirements = (target.Arch == "INTEL")
Queue
When I run the job, the vm.log file contains lines like this:
001 (007.000.000) 05/13 17:57:05 Job executing on host: <115.145.228.96:1034>
...
003 (007.000.000) 05/13 20:26:51 Job was checkpointed.
Usr 0 00:00:01, Sys 0 02:27:38 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
68599016 - Run Bytes Sent By Job For Checkpoint
...
004 (007.000.000) 05/13 20:26:59 Job was evicted.
(0) Job was not checkpointed.
Usr 0 00:00:01, Sys 0 02:27:38 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
68599016 - Run Bytes Sent By Job
79956464 - Run Bytes Received By Job
Notice that it says
"Job was checkpointed."
*and*
"Job was not checkpointed."
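To see the two messages side by side, the log excerpt above can be scanned with a small script. This is only a minimal sketch in Python, assuming the log format is exactly as shown in the excerpt (a three-digit event code, a job id in parentheses, date, time, message, followed by detail lines). The names parse_events and checkpoint_report are hypothetical helpers, not part of Condor.

```python
import re

# Matches the first line of an event as it appears in the excerpt, e.g.
# "003 (007.000.000) 05/13 20:26:51 Job was checkpointed."
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) (.*)$"
)

def parse_events(log_text):
    """Split user-log text into events; detail lines attach to the previous event."""
    events = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = EVENT_RE.match(line)
        if m:
            code, cluster, proc, _sub, date, time, message = m.groups()
            events.append({
                "code": code,
                "job": f"{int(cluster)}.{int(proc)}",
                "when": f"{date} {time}",
                "message": message,
                "details": [],
            })
        elif events and line and line != "...":
            # "..." is the event separator in the log, so it is skipped.
            events[-1]["details"].append(line)
    return events

def checkpoint_report(events):
    """Return one line per checkpoint (003) or eviction (004) event."""
    report = []
    for ev in events:
        if ev["code"] in ("003", "004"):
            notes = [d for d in ev["details"] if "checkpoint" in d.lower()]
            report.append(f'{ev["when"]} job {ev["job"]}: {ev["message"]} {notes}')
    return report

if __name__ == "__main__":
    # "vm.log" is the Log file named in the submit file above.
    with open("vm.log") as fh:
        for line in checkpoint_report(parse_events(fh.read())):
            print(line)
```

The report only restates what the log already says, so it helps line up the 003 and 004 events by time but does not by itself settle which message is authoritative.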
Meanwhile I do find the checkpoint files in the spool:
15MB-000001.vmdk
isohrDAAH.iso
nvram
vmBvHAAB_condor-Snapshot1.vmsn
vmbvhaab_condor.vmem
vmBvHAAB_condor.vmsd
vmbvhaab_condor.vmss
vmBvHAAB_condor.vmx
vmware-0.log
vmware-1.log
vmware.log
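As a rough aid for reading the listing above, the files can be grouped by VMware file type. The sketch below is in Python and is an assumption-laden heuristic: the extension-to-role table reflects the usual VMware conventions, and has_saved_state (a hypothetical helper) only checks that a suspend or snapshot state file is present. It cannot tell whether Condor will actually resume the job from those files.

```python
import os

# Usual meaning of VMware file extensions (heuristic table, not from Condor).
ROLES = {
    ".vmss": "suspended-state file",
    ".vmsn": "snapshot state file",
    ".vmem": "guest memory image",
    ".vmsd": "snapshot metadata",
    ".vmx": "VM configuration",
    ".vmdk": "virtual disk",
}

def classify(names):
    """Group file names by their assumed VMware role; unknown files are ignored."""
    grouped = {}
    for name in names:
        ext = os.path.splitext(name)[1].lower()
        role = ROLES.get(ext)
        if role:
            grouped.setdefault(role, []).append(name)
    return grouped

def has_saved_state(names):
    """True if a suspend (.vmss) or snapshot (.vmsn) state file is present."""
    exts = {os.path.splitext(n)[1].lower() for n in names}
    return bool(exts & {".vmss", ".vmsn"})

if __name__ == "__main__":
    # File names copied from the spool listing above.
    spool_files = [
        "15MB-000001.vmdk", "isohrDAAH.iso", "nvram",
        "vmBvHAAB_condor-Snapshot1.vmsn", "vmbvhaab_condor.vmem",
        "vmBvHAAB_condor.vmsd", "vmbvhaab_condor.vmss",
        "vmBvHAAB_condor.vmx", "vmware-0.log", "vmware-1.log", "vmware.log",
    ]
    for role, files in classify(spool_files).items():
        print(f"{role}: {', '.join(files)}")
    print("saved VM state present:", has_saved_state(spool_files))
```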
I'm quite confused by all this.
Is the VMware condor job checkpointed or not?
Also, I don't know where and how I can verify this.
And if it's not checkpointed, why is it not?
If it is checkpointed, why can't I see more evidence of it?
Thanks for your help!
Rob.