
Re: [HTCondor-users] condor_schedd stuck when submitting a large amount of jobs



If the submission of the large number of jobs is in a single submit action (i.e. a single condor_submit or Python bindings submit() call), then the answer is "no". A submit request is a synchronous operation for the condor_schedd: it will do nothing else until the submission is complete.

As you note, "max_materialize" is a good way to reduce the time the schedd spends on a job submission. I don't believe there's any way for the admin to trigger its use for large job submissions.
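For example (the filenames and counts here are just illustrative), a submit description like this caps how many job ads exist in the schedd at any one time:

    executable      = analyze.sh
    arguments       = $(ProcId)
    output          = out.$(ProcId)
    error           = err.$(ProcId)
    log             = jobs.log

    # Keep at most 50 job ads materialized in the schedd at a time;
    # the remaining jobs of the 100000 are materialized lazily as
    # earlier ones leave the queue.
    max_materialize = 50

    queue 100000

Depending on your HTCondor version, late materialization may need to be enabled on the schedd with SCHEDD_ALLOW_LATE_MATERIALIZE = True.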

We discourage the use of frequent "condor_q -global -all" commands. We can discuss alternative solutions if you're relying on this command.
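As a couple of illustrative examples (the schedd name below is made up), queries that target one schedd and ask for less data are usually much cheaper than a global query of every schedd:

    # Ask a single schedd for a summary rather than every schedd for every ad:
    condor_q -name schedd01.example.com -totals

    # Or restrict the query to just the jobs you care about (here, idle jobs):
    condor_q -name schedd01.example.com -constraint 'JobStatus == 1' -totals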

 - Jaime

> On Mar 10, 2023, at 2:24 PM, JM <jm@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
> HTCondor Community,
> 
> In an environment with multiple condor_schedds on different servers, we experienced an issue where "condor_q -global -all" hung on a submit host for a short period of time while a large number of jobs were being submitted via that submit host.
> 
> I understand that users may use max_materialize to relieve the pressure on the condor_schedd. But is there a knob for admins to give the condor_schedd higher priority for responding to queries while it is working on the new submission?
>