Re: [Condor-users] enforcing job memory limits?
- Date: Fri, 16 Nov 2007 16:57:53 -0600
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [Condor-users] enforcing job memory limits?
Jonathan D. Proulx wrote:
Hi,
It appears as though a Condor job in my flock exhausted the memory on a
user workstation overnight ($CondorVersion: 6.8.2 Oct 12 2006 $
$CondorPlatform: X86_64-LINUX_RHEL3 $). This triggered Linux's OOM
killer, which killed several desktop apps and sshd on the system.
This seems a pretty serious violation of the do-no-harm principle, and
I'm a bit surprised.
Is there a config setting I need to tweak on the workstations?
You could add a clause to your PREEMPT expression, e.g. something like
PREEMPT = ( whatever was there before ) || (ImageSize > (Memory-20)*1024)
ImageSize is reported in KB and Memory in MB, hence the factor of 1024;
the clause becomes true once the job grows to within 20 MB of the
machine's Memory value.
This should work at least in the v6.9 series (and maybe in v6.8 as well;
I cannot recall offhand), since the condor_startd's value for "ImageSize"
is updated several times a minute to the total memory usage of the job.
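As a rough sketch of how such a clause might be laid out in a
workstation's local config file: the macro names MEMORY_HEADROOM_MB and
JOB_OVER_MEMORY below are made up for illustration and are not standard
Condor settings. The sketch assumes ImageSize is in KB and Memory is in
MB, that PREEMPT is already defined earlier in the configuration, and
that your Condor version lets a macro refer to its own earlier value.
If it does not, paste the old expression in place of $(PREEMPT).

    # Hypothetical helper macros (names are illustrative only).
    # Headroom, in MB, to leave free below the machine's Memory value.
    MEMORY_HEADROOM_MB = 20
    # True once the job's image size (KB) exceeds Memory (MB) minus the
    # headroom, with both sides converted to KB.
    JOB_OVER_MEMORY = (ImageSize > (Memory - $(MEMORY_HEADROOM_MB)) * 1024)
    # Keep the existing policy and add the memory clause to it.
    PREEMPT = ($(PREEMPT)) || ($(JOB_OVER_MEMORY))

Keeping the headroom in its own macro makes it easy to tune per
machine, and the startd would need a condor_reconfig to pick up the
change.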
-Todd
--
Todd Tannenbaum University of Wisconsin-Madison
Condor Project Research Department of Computer Sciences
tannenba@xxxxxxxxxxx 1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132 Madison, WI 53706-1685