Hi Gianmauro,
Thanks for your answer. From what I understand, though, you launch this
script manually, right?
What I would like is a way for condor to increase the memory by itself,
since my jobs are retried automatically.
Best,
Romain
On Wed, 2 March 2022 at 20:12, <gmauro@xxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Romain,
I use this script for exactly the purpose you described.
It releases each job held for memory with 3 times the memory it was
provisioned, until the request reaches a cap.
Every relaunch is recorded in a log file.
$ cat /usr/bin/htcondor-release-held-jobs
#!/bin/bash
# Release jobs held with HoldReasonCode 34 (memory limit exceeded),
# asking for more memory each time.
CAP=524288 # 512 GB, expressed in MB
MULTIPLIER=3
LOG=/data/dnb01/maintenance/condor_rerun_held_jobs.log
if [ ! -f "$LOG" ]; then
    touch "$LOG"
    echo "Created $LOG"
fi
# Select the held jobs whose hold reason code is 34.
for j in $(condor_q -hold -autoformat ClusterId HoldReasonCode | awk '$2 == 34 {print $1}')
do
    JOB_DESCRIPTION=$(condor_q "$j" -autoformat JobDescription)
    MEMORY_PROVISIONED=$(condor_q "$j" -autoformat MemoryProvisioned)
    # Multiply the provisioned memory, but never request more than the cap.
    if [ $((MEMORY_PROVISIONED * MULTIPLIER)) -gt "$CAP" ]; then
        REQUEST_MEMORY=$CAP
    else
        REQUEST_MEMORY=$((MEMORY_PROVISIONED * MULTIPLIER))
    fi
    REMOTE_HOST=$(condor_q "$j" -autoformat LastRemoteHost | cut -f2 -d@ | cut -f1 -d.)
    DATE_WITH_TIME=$(date "+%d/%m/%Y-%H:%M:%S")
    # Record the relaunch in the log file.
    /bin/cat <<EOM >>"$LOG"
$DATE_WITH_TIME, rerunning held job, id $j, description $JOB_DESCRIPTION, memory_provisioned $MEMORY_PROVISIONED, request_memory $REQUEST_MEMORY, $REMOTE_HOST
EOM
    condor_qedit "$j" RequestMemory=$REQUEST_MEMORY
    condor_release "$j"
done
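
The two decisions the script makes (which held jobs to pick, and how much
memory to ask for next) can be tried out without a pool. The sketch below
isolates them in two functions; the function names are hypothetical and not
part of HTCondor, and the sample input only stands in for the output of
condor_q -hold -autoformat ClusterId HoldReasonCode.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not HTCondor commands.

# Read "ClusterId HoldReasonCode" lines on stdin and print the ids
# whose hold reason code is 34 (the code the script filters on).
select_memory_held() {
    awk '$2 == 34 { print $1 }'
}

# Usage: next_request_memory PROVISIONED_MB [MULTIPLIER] [CAP_MB]
# Print the provisioned memory times the multiplier, limited to the cap.
# Defaults mirror the script: multiplier 3, cap 524288 MB.
next_request_memory() {
    local provisioned=$1
    local multiplier=${2:-3}
    local cap=${3:-524288}
    local wanted=$(( provisioned * multiplier ))
    if [ "$wanted" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$wanted"
    fi
}

# Made-up sample standing in for condor_q output: jobs 101 and 103 are selected.
printf '101 34\n102 13\n103 34\n' | select_memory_held

# 2048 MB becomes 6144 MB; 200000 MB would become 600000 MB, so the cap applies.
next_request_memory 2048
next_request_memory 200000
```

With the defaults, a job provisioned 2048 MB is released with 6144 MB, and
any job whose tripled memory would exceed 524288 MB is released with the cap.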
Hope it helps,
Gianmauro
On 3/2/22 19:48, romain.bouquet04@xxxxxxxxx wrote:
> Dear all,
>
> I have jobs that I set to be retried automatically by condor in case of
> failure.
> I was wondering if there is a way for condor to automatically increase
> the requested RAM for a job in case it failed and it is retried.
>
> I was looking at the NumJobStarts which counts the number of times a job
> is started
> https://htcondor.readthedocs.io/en/latest/classad-attributes/job-classad-attributes.html
>
> And I was trying to add something as below in the submit file (but it
> does not work):
> (based on
> https://htcondor.readthedocs.io/en/latest/users-manual/submitting-a-job.html#using-conditionals-in-the-submit-description-file)
>
> if NumJobStarts == 0
>     request_memory = 2GB
> else
>     request_memory = 8GB
> endif
>
> I could use requirement with a syntax like
> requirement = (NumJobStarts == 0 && TARGET.Memory >= 2GB) ||
> (NumJobStarts >= 1 && TARGET.Memory >= 8GB)
> But apparently it is not recommended to request memory that way
>
> Would anyone have a better solution?
>
> Many thanks in advance
> Best,
> Romain Bouquet
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
--
Gianmauro Cuccuru
UseGalaxy.eu
Bioinformatics Group
Department of Computer Science
Albert-Ludwigs-University Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany