
Re: [HTCondor-users] Max Runtime Limit



Thank you, Nickolai and John. I asked because I was trying to understand why some of the jobs in our HTCondor cluster have a high JobRunCount. I spent some time studying HTCondor's fair-share system and the related user priority handling, and realized that the jobs might be getting vacated to ensure a fair share of the resources. I am still working on confirming this with actual numbers, though.

Thanks again.

Regards,
Sandeep

On 08/10/24 19:11, John M Knoeller via HTCondor-users wrote:
This is correct. There is no default in the AP for all jobs, but you can use a JOB_TRANSFORM in the AP to set one of these attributes for jobs as they are submitted.
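
As a rough illustration of the JOB_TRANSFORM approach, an AP configuration along the following lines could give every job a runtime limit when the submitter did not set one. This is only a sketch under assumptions: the transform name DefaultMaxRuntime and the 24-hour value are hypothetical, and it assumes that AllowedJobDuration is the job attribute behind the allowed_job_duration submit command.

```
# Hypothetical AP (schedd) configuration sketch.
# The transform name "DefaultMaxRuntime" and the 86400-second limit are
# illustrative assumptions, not recommended values.
# REQUIREMENTS restricts the transform to jobs that did not set a limit.
# SET writes the limit (in seconds) into the job ClassAd.
JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) DefaultMaxRuntime

JOB_TRANSFORM_DefaultMaxRuntime @=end
   REQUIREMENTS AllowedJobDuration =?= undefined
   SET AllowedJobDuration 86400
@end
```

The transform should only affect jobs submitted after the AP picks up the new configuration (for example, after running condor_reconfig), so jobs already in the queue would keep whatever limit they had.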

-tj
------------------------------------------------------------------------
*From:* HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Belakovski, Nickolai via HTCondor-users <htcondor-users@xxxxxxxxxxx>
*Sent:* Monday, October 7, 2024 3:32 PM
*To:* HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
*Cc:* Belakovski, Nickolai <nickolai.belakovski@xxxxxxxx>
*Subject:* Re: [HTCondor-users] Max Runtime Limit
I think you're looking for allowed_execute_duration or allowed_job_duration, see the reference here: https://htcondor.readthedocs.io/en/latest/man-pages/condor_submit.html

I think the main difference between them is that if a job exceeds its allowed_execute_duration, it goes on hold and can't be put back into a running state, whereas the allowed_job_duration timer resets when the job leaves the running state. So I guess allowed_job_duration would be better, because if you have a job that is legitimately taking a long time and not just spinning, you could kick it and let it keep going. But I'm not sure, as I'm only just learning these commands and don't have practical experience with them yet.
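
For reference, a minimal submit description using one of these commands might look like the sketch below. The executable, arguments, and the one-hour value are hypothetical, and it assumes both commands take a whole number of seconds; the exact semantics of each command should be checked against the condor_submit manual linked above.

```
# Hypothetical submit description; executable, arguments, and limits
# are placeholders chosen only for illustration.
executable = /bin/sleep
arguments  = 600

# Limit on how long the job may stay continuously in the running state
# (assumed to be in seconds).
allowed_job_duration = 3600

# Alternative limit discussed above; uncomment to use it instead.
# allowed_execute_duration = 3600

queue
```

Either command is set by the user at submit time, which is why a system-wide value needs a separate mechanism on the AP side.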

Nickolai
------------------------------------------------------------------------
*From:* HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Sandeep K. Joshi via HTCondor-users <htcondor-users@xxxxxxxxxxx>
*Sent:* Monday, October 7, 2024 10:47 AM
*To:* htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
*Cc:* Sandeep K. Joshi <skjoshi@xxxxxxxx>
*Subject:* [HTCondor-users] Max Runtime Limit

Hello All,

Is there an HTCondor config default value (system side, not specified by users in their job script) for the maximum time a job is allowed to run?
--
Regards,
Sandeep

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

--
Regards,
Sandeep K. Joshi