Re: [Condor-users] disable check pointing

Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

On Mon, May 7, 2012 at 6:36 PM, John (TJ) Knoeller <johnkn@xxxxxxxxxxx> wrote:

What is RequestMemory for your job? How big does ImageSize become after the first update?

On 5/7/2012 2:50 PM, Tiago Macarios wrote:
Hi,

I have been struggling with a problem the whole day. It is probably something stupid, but I would really appreciate some light.

I have a 32-core computer that serves as a dedicated pool; we use it to run simulations. Today someone submitted a simulation that reads and writes loads of tiny files, and the disk bottleneck left the computer almost idle. The machine has 64 GB of RAM, so I figured I would set aside 20 GB as a ramdisk and things would work as they should. The problem is that after the jobs update their ImageSize for the first time, they go to the IDLE state and I get:

013.029:  Run analysis summary.  Of 64 machines,
     64 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 match but are currently offline
      0 are available to run your job
        Last successful match: Mon May 7 19:31:42 2012

WARNING:  Be advised:
   No resources matched request's constraints

The Requirements expression for your job is:

( ( target.OpSys == "LINUX" ) && ( TARGET.Disk >= 0 ) ) &&
( TARGET.Arch == "X86_64" ) && ( ( TARGET.Memory * 1024 ) >= ImageSize ) &&
( ( RequestMemory * 1024 ) >= ImageSize ) && ( TARGET.HasFileTransfer )

Job ClassAd Requirements expression evaluates to false

I figure it has something to do with ( TARGET.Memory * 1024 ) >= ImageSize. How can I change it? I don't really care about checkpointing; I just need the end result, and if something fails I will restart it from the beginning. Can I disable checkpointing somehow in the vanilla universe? FYI: the jobs do not use much memory.

Thanks,

Mac.
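
For readers following the thread, the two memory clauses in the Requirements expression quoted above can be checked by hand. The sketch below is only an illustration, not Condor code: it assumes Memory and RequestMemory are in megabytes and ImageSize is in kilobytes (which is what the factor of 1024 suggests), and every number in it is hypothetical, since the thread gives no actual values for the job.

```python
def memory_clauses(target_memory_mb, request_memory_mb, image_size_kb):
    """Evaluate the two memory clauses of the Requirements expression.

    Assumed units: Memory and RequestMemory in MB, ImageSize in KB.
    """
    return {
        "TARGET.Memory * 1024 >= ImageSize": target_memory_mb * 1024 >= image_size_kb,
        "RequestMemory * 1024 >= ImageSize": request_memory_mb * 1024 >= image_size_kb,
    }


# Hypothetical slot and job values, chosen only to show the effect.
before_update = memory_clauses(target_memory_mb=1024, request_memory_mb=512,
                               image_size_kb=200_000)
after_update = memory_clauses(target_memory_mb=1024, request_memory_mb=512,
                              image_size_kb=3_000_000)

for label, result in (("before update", before_update),
                      ("after update", after_update)):
    for clause, value in result.items():
        print(f"{label}: {clause} -> {value}")
```

With these made-up numbers both clauses hold before the ImageSize update and both fail after it, which would be enough to make the whole expression evaluate to false on every machine. Note that the second clause does not depend on the machine at all, only on the job's own RequestMemory and ImageSize.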
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/
