
Re: [HTCondor-users] MAX_JOBS_SUBMITTED exceeded, submit failed. Current total is 499999. Limit is 50000



Hello Experts,

We have hit this issue multiple times. It disappears if we restart the condor service or change the MAX_JOBS_SUBMITTED limit.

02/06/23 20:59:20 (pid:32073) NewCluster(): MAX_JOBS_SUBMITTED exceeded, submit failed. Current total is 600000. Limit is 600000
02/06/23 20:59:25 (pid:32073) NewCluster(): MAX_JOBS_SUBMITTED exceeded, submit failed. Current total is 600000. Limit is 600000
02/06/23 20:59:40 (pid:32073) NewCluster(): MAX_JOBS_SUBMITTED exceeded, submit failed. Current total is 600000. Limit is 600000

We don't have any jobs in the queue. While submitting a batch of 100 vanilla universe jobs, the above messages appear in the schedd log file.
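
Since the issue cannot be reproduced on demand, it may help to scan the schedd log for these rejections and compare the reported total with the real queue size. Below is a minimal Python sketch of that idea. The line format is assumed from the three messages quoted above, not from HTCondor documentation, and the names `Rejection`, `parse_rejection` and `counter_looks_stale` are hypothetical helpers, not part of any HTCondor tool.

```python
import re
from typing import NamedTuple, Optional

# Assumed format, copied from the log lines quoted in this message.
REJECT_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:(?P<pid>\d+)\) "
    r"NewCluster\(\): MAX_JOBS_SUBMITTED exceeded, submit failed\. "
    r"Current total is (?P<total>\d+)\. Limit is (?P<limit>\d+)"
)


class Rejection(NamedTuple):
    stamp: str   # timestamp exactly as written in the log
    pid: int     # schedd process id
    total: int   # job count the schedd believes it holds
    limit: int   # MAX_JOBS_SUBMITTED value the schedd applied


def parse_rejection(line: str) -> Optional[Rejection]:
    """Return the parsed rejection, or None if the line is something else."""
    m = REJECT_RE.match(line.strip())
    if m is None:
        return None
    return Rejection(m["stamp"], int(m["pid"]), int(m["total"]), int(m["limit"]))


def counter_looks_stale(rej: Rejection, jobs_in_queue: int) -> bool:
    """True when the schedd's reported total exceeds the real queue size.

    jobs_in_queue has to be supplied by the caller, for example from a
    condor_q count taken at about the same time.
    """
    return rej.total > jobs_in_queue
```

Feeding each line of the log through `parse_rejection` and passing the result to `counter_looks_stale` with the queue size observed at that time (0 in our case) would flag every rejection where the counter and the queue disagree.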

# condor_version
$CondorVersion: 8.8.5 Sep 04 2019 BuildID: 480168 PackageID: 8.8.5-1 $
$CondorPlatform: x86_64_RedHat7 $

# condor_config_val MAX_JOBS_SUBMITTED
600000


The issue is not reproducible on demand. Has anyone else encountered it?

Thanks & Regards,
Vikrant Aggarwal


On Wed, Jan 12, 2022 at 9:46 AM Vikrant Aggarwal <ervikrant06@xxxxxxxxx> wrote:
Hello Thomas,

Thanks for your response. These were 10 plain jobs (not a clustered array). Each cluster had only one job.

Thanks for sharing the late materialization option. Usually, user batches are very small. I couldn't understand where this limit was being hit.

Thanks & Regards,
Vikrant Aggarwal


On Tue, Jan 11, 2022 at 8:23 PM <thomas.hartmann@xxxxxxx> wrote:
Hi Vikrant Aggarwal,

are the 10 jobs on the scheduler 'plain' jobs or job clusters?

If your users have very large arrays of jobs, maybe you can try to use
the late materialization feature
 SCHEDD_ALLOW_LATE_MATERIALIZE = true
so that the scheduler creates the jobs when they are about to get
matched to a slot (and not already during submission).

Cheers,
 Thomas

On 11/01/2022 08.08, ervikrant06@xxxxxxxxx wrote:
> Hello Experts,
>
> We are only having a handful of jobs (10) in queue while submitting a
> new batch following error reported:
>
> 01/10/22 23:40:01 (pid:9386) NewCluster(): MAX_JOBS_SUBMITTED exceeded, submit failed. Current total is 499999. Limit is 50000
>
>
> We have to increase MAX_JOBS_SUBMITTED to make submission successful. We
> have seen this issue twice on submit box. Has anyone seen this issue before?
>
>
> Thanks & Regards,
> Vikrant Aggarwal
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>