They are very similar, but AllowedExecuteDuration only looks at time spent executing, while AllowedJobDuration looks at time spent in the running state, which includes file transfer.
If a job self-checkpoints, each checkpoint will reset the execute duration timer, but not the job duration timer.
AllowedExecuteDuration
The longest time for which a job may be executing. Jobs which exceed
this duration will go on hold. This time does not include file-transfer
time. Jobs which self-checkpoint have this long to write out each
checkpoint.
AllowedJobDuration
The longest time for which a job may continuously be in the running state.
Jobs which exceed this duration will go on hold. Exiting the running
state resets the job duration measured by this attribute.
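
The distinction above can be sketched as a toy model in Python. This is not HTCondor code: the names peak_timers and would_hold and the timeline format are hypothetical, made up for illustration, and the model only covers one continuous stay in the running state.

```python
from dataclasses import dataclass


@dataclass
class Timers:
    peak_execute: float = 0.0  # largest value reached by the execute-duration timer
    peak_job: float = 0.0      # largest value reached by the job-duration timer


def peak_timers(timeline):
    """Walk one continuous stay in the running state (toy model, an assumption).

    timeline is a list of (kind, seconds) tuples, where kind is "transfer",
    "execute", or "checkpoint" (seconds is ignored for "checkpoint").
    """
    execute = job = 0.0
    peaks = Timers()
    for kind, seconds in timeline:
        if kind == "transfer":
            job += seconds        # file transfer counts only toward the job duration
        elif kind == "execute":
            execute += seconds    # executing counts toward both timers
            job += seconds
        elif kind == "checkpoint":
            execute = 0.0         # a self-checkpoint resets only the execute timer
        else:
            raise ValueError(f"unknown phase: {kind}")
        peaks.peak_execute = max(peaks.peak_execute, execute)
        peaks.peak_job = max(peaks.peak_job, job)
    return peaks


def would_hold(timeline, allowed_execute=None, allowed_job=None):
    """Return which limits (if any) the timeline would exceed."""
    peaks = peak_timers(timeline)
    reasons = []
    if allowed_execute is not None and peaks.peak_execute > allowed_execute:
        reasons.append("execute")
    if allowed_job is not None and peaks.peak_job > allowed_job:
        reasons.append("job")
    return reasons


# A plain 20-second sleep: no file transfer, no checkpoints.
sleep_job = [("execute", 20)]
print(would_hold(sleep_job, allowed_execute=10))  # ['execute']
print(would_hold(sleep_job, allowed_job=10))      # ['job']

# Long file transfer, short execution: only the job duration is exceeded.
transfer_heavy = [("transfer", 30), ("execute", 5)]
print(would_hold(transfer_heavy, allowed_execute=10))  # []
print(would_hold(transfer_heavy, allowed_job=10))      # ['job']

# Self-checkpointing every 8 seconds: the execute timer never passes 8.
checkpointing = [("execute", 8), ("checkpoint", 0), ("execute", 8)]
print(would_hold(checkpointing, allowed_execute=10))  # []
print(would_hold(checkpointing, allowed_job=10))      # ['job']
```

In this model, a job with no file transfer and no checkpoints (such as a plain sleep) advances both timers together, so the two limits would be expected to trigger at about the same moment.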
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Belakovski, Nickolai via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Tuesday, October 8, 2024 12:23 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Belakovski, Nickolai <nickolai.belakovski@xxxxxxxx>
Subject: Re: [HTCondor-users] Max Runtime Limit

You know, now that I play around with it, I can't tell the difference between these two variables. I set up a job to sleep for 20 seconds and tried it with an allowed_execute_duration of 10s and similarly for allowed_job_duration (I submitted two separate jobs, one with execute and one with job).

In both cases the job would hold after 10-15 seconds (so it's not super exact, but OK). In both cases I could condor_release the jobs, but the runtime would reset to 0, and the job would again run for just 10-15 seconds and then stop. I would have thought that releasing a job held by allowed_job_duration would let it continue until it ended, as opposed to restarting it.

condor_q -hold will show different text based on execute/job, but otherwise I can't find any difference between the two.

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Sandeep K. Joshi via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Tuesday, October 8, 2024 10:34 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Sandeep K. Joshi <skjoshi@xxxxxxxx>
Subject: Re: [HTCondor-users] Max Runtime Limit

Thank you Nickolai and John.

I had this question because I was trying to understand why some of the jobs in our HTCondor cluster were having a high JobRunCount. I spent some time understanding the fair share system of HTCondor and the related user priority handling and realized that the jobs might be getting vacated for ensuring fair share of the resources. I am attempting to get the numbers correct though.

Thanks again.

Regards,
Sandeep

On 08/10/24 19:11, John M Knoeller via HTCondor-users wrote:
> This is correct. There is no default in the AP for all jobs, but you
> can use a JOB_TRANSFORM in the AP to set one of these attributes for
> jobs as they are submitted.
>
> -tj
> ------------------------------------------------------------------------
> *From:* HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of
> Belakovski, Nickolai via HTCondor-users <htcondor-users@xxxxxxxxxxx>
> *Sent:* Monday, October 7, 2024 3:32 PM
> *To:* HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> *Cc:* Belakovski, Nickolai <nickolai.belakovski@xxxxxxxx>
> *Subject:* Re: [HTCondor-users] Max Runtime Limit
>
> I think you're looking for allowed_execute_duration or
> allowed_job_duration, see the reference here:
> https://htcondor.readthedocs.io/en/latest/man-pages/condor_submit.html
>
> I think the main difference between them is that if a job exceeds its
> allowed_execute_duration then it goes into hold and it can't be put back
> into a run state, whereas allowed_job_duration resets when the job
> leaves the running state. So I guess job_duration would be better
> because if you have a job that is legitimately taking a long time and
> not just spinning, you could kick it and let it keep going. But I'm not
> sure as I'm only just learning these commands so I don't have practical
> experience with them yet.
>
> Nickolai
> ------------------------------------------------------------------------
> *From:* HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of
> Sandeep K. Joshi via HTCondor-users <htcondor-users@xxxxxxxxxxx>
> *Sent:* Monday, October 7, 2024 10:47 AM
> *To:* htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
> *Cc:* Sandeep K. Joshi <skjoshi@xxxxxxxx>
> *Subject:* [HTCondor-users] Max Runtime Limit
>
> Hello All,
>
> Is there an HTCondor config default value (system side, not specified by
> users in their jobscript) for the maximum time a job is allowed to run?
>
> --
> Regards,
> Sandeep

--
Regards,
Sandeep K. Joshi
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/