Re: [HTCondor-users] condor_schedd stuck when submitting a large amount of jobs
- Date: Fri, 10 Mar 2023 20:53:03 +0000
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] condor_schedd stuck when submitting a large amount of jobs
If the large number of jobs is submitted in a single submit action (i.e. a single condor_submit or Python bindings submit() call), then the answer is "no". A submit request is a synchronous operation for the condor_schedd: it will do nothing else until the submission is completed.
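For illustration, here is a minimal sketch of such a single synchronous submit via the Python bindings (the executable, arguments, and job count are made-up placeholders):

    import htcondor

    schedd = htcondor.Schedd()
    sub = htcondor.Submit({
        "executable": "/bin/sleep",  # hypothetical payload
        "arguments":  "60",
    })
    # The schedd handles no other requests (including condor_q queries)
    # until this call returns.
    result = schedd.submit(sub, count=100000)
    print(result.cluster())

While that one call is in flight, queries such as condor_q against the same schedd will appear to hang.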
As you note, "max_materialize" is a good way to reduce the time the schedd spends on the job submission. I don't believe there's any way for the admin to trigger its use for large job submissions.
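For reference, a sketch of a submit description using max_materialize (the executable name and counts are made-up):

    # late materialization: the schedd records the queue statement once
    # and keeps at most 500 jobs materialized at a time
    executable      = my_job.sh
    arguments       = $(ProcId)
    max_materialize = 500
    queue 100000

Because the schedd no longer creates all 100000 job ads up front, the synchronous portion of the submission is much shorter.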
We discourage the use of frequent "condor_q -global -all" commands. We can discuss alternative solutions if you're relying on this command.
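One lighter-weight option, for illustration only (it may or may not fit your monitoring needs), is to ask for summary totals instead of every job ad:

    condor_q -totals    # summary counts only; far less work for the schedd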
- Jaime
> On Mar 10, 2023, at 2:24 PM, JM <jm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> HTCondor Community,
>
> In an environment with multiple condor_schedds on different servers, we experienced an issue where "condor_q -global -all" got stuck on a submit host for a short period of time while a large number of jobs were being submitted via that submit host.
>
> I understand that users may use max_materialize to relieve the pressure on the condor_schedd. But is there a knob for admins to give the condor_schedd higher priority to respond to queries while it is working on a new submission?
>