The job is running out of memory because it requests only 2 GB of RAM and then uses more than that.
SLOT_TYPE_1_PARTITIONABLE=TRUE
means that when the AP decides to run a job, a slot is created with the number of CPUs and the amount of memory that job requested, up to a maximum of 8 CPUs and 4 GB (4096 MB), because
SLOT_TYPE_1=cpus=8, memory=4096
To fix this, change request_memory in the job's submit file so that the job requests more memory.
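
As a rough sketch only, a submit file with a larger request might look like the one below. The executable name and the 3072 MB figure are placeholders I made up, not values from your setup; pick a number that matches what the jobs really use. Keep it at or below the 4096 MB configured for the slot type, since a larger request would not fit in these slots.

```text
# Hypothetical submit description; the executable name is illustrative only
executable     = run_lhcb_job.sh

request_cpus   = 1
# request_memory is in megabytes by default; 3072 is an assumed value
request_memory = 3072

queue
```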
-tj
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Mihai Ciubancan <ciubancan@xxxxxxxx>
Sent: Friday, May 30, 2025 2:28 AM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] problems with jobs requiring more then 2GB memory

Hello,
I am encountering issues with LHCb jobs that require more than 2 GB per job. The jobs are failing with the following error:

LastHoldReason = "Error from reserved-LHCb2_5@xxxxxxxxxxxxxx: Job has gone over cgroup memory limit of 2048 megabytes. Last measured usage: 2033 megabytes. Consider resubmitting with a higher request_memory."

I have configured partitionable slots:

CLAIM_WORKLIFE=3600
CONTINUE=TRUE
JOB_RENICE_INCREMENT=10
KILL=FALSE
NUM_SLOTS=4
NUM_SLOTS_TYPE_1=4
SLOT_TYPE_1_PARTITIONABLE=TRUE
SLOT_TYPE_1=cpus=8, memory=4096
SLOT_TYPE_1_START=Owner=="pillhcb01"
SLOT_TYPE_1_NAME_PREFIX=reserved-LHCb
PREEMPT=FALSE
RANK=0
SUSPEND=FALSE
SLOT_TYPE_1_CONSUMPTION_POLICY=False
CONSUMPTION_POLICY=False
CLAIM_PARTITIONABLE_LEFTOVERS=False

A cgroup policy is also enabled:

BASE_CGROUP = /system.slice/condor.service
CGROUP_MEMORY_LIMIT_POLICY = soft
MAXJOBRETIREMENTTIME = $(HOUR) * 24 * 7
SYSTEM_PERIODIC_REMOVE = ResidentSetSize > 3000*RequestMemory

Any suggestions would be highly appreciated!

Best,
Mihai

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
with a subject: Unsubscribe

Join us in June at Throughput Computing 25: https://urldefense.com/v3/__https://osg-htc.org/htc25__;!!Mak6IKo!K29kgDu3KqY-v0JvPE9cVXxO9hKbX4vVgC2pMuc85_5TCTwv4huZH_KU-ElZEvUc6BvAtLM_1S1Sk8MicXaY$

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/