Dear all,
I'm running numerical experiments, solving optimization problems and
collecting the log files to compare different algorithms.
The program requires about 12 GB of memory to solve a problem.
The machine I am using is a cluster of 27 nodes, each with 12 slots,
and each slot has 2 GB of memory.
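In other words, each node has 12 × 2 GB = 24 GB of memory in total.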
The following is my current condor submit file.
universe = vanilla
notification = never
should_transfer_files = yes
when_to_transfer_output = always
copy_to_spool = false
requirements = regexp("slot([1-9]|1[0-2])@pedigree-([1-9]|1[0-9]|2[0-7]).*",Name)
request_memory = 12000
executable = limit.sh
output = out
error  = err
log    = log
transfer_input_files = program, input_file
arguments = 22600 12000000 ./program -f input_file --algorithm search
queue
I am submitting 100-200 jobs at once, hoping that condor will schedule them for me.
Everything was fine as long as each job used less than 4 GB of memory.
What I am seeing is:
condor assigns many jobs to a single node, so more than 2 jobs end up on 1 node.
As the program solves the input problem, it takes more and more memory.
At some point, some of the jobs become suspended and eventually go idle.
I guess this is because HTCondor tries to allocate the resources within a single machine rather than using unclaimed slots on other machines.
I confirmed this by submitting a small number of jobs; HTCondor did not use the ~300 available slots.
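If my arithmetic is right, two of these jobs already need 24 GB, which is all the memory one node has, so as soon as more than two jobs land on the same node the node runs out of memory.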
I changed the submit file above as follows:
requirements = regexp("slot([1-9]|1[0-2])@pedigree-([1-9]|1[0-9]|2[0-7]).*",Name) && (Memory >= 12000)
request_memory = 12000
Unfortunately, it did not resolve the issue.
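If I understand the ClassAds correctly, the Memory attribute is per slot (2 GB here), so a condition like Memory >= 12000 may never match a statically partitioned slot. One thing I considered, as a rough sketch only, is to constrain on the machine-wide TotalMemory attribute instead and keep request_memory:

requirements = regexp("slot([1-9]|1[0-2])@pedigree-([1-9]|1[0-9]|2[0-7]).*",Name) && (TotalMemory >= 12000)
request_memory = 12000

But I am not sure this is the right approach, and I suspect it still would not prevent several 12 GB jobs from being matched to slots on the same node.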
Could someone suggest a way to modify the condor submit file?
Thanks in advance.
Best,
Junkyu Lee