Hi Peet
Maybe look at:
Q_Query_Timeout = 20
Not_Responding_Timeout = 3600
Sam
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Peet Whittaker <Peet.Whittaker@xxxxxxxxxxxxxxxxx>
Sent: Tuesday, September 19, 2023 9:03:00 AM
To: HTCondor-Users Mail List
Subject: [EXTERNAL] [HTCondor-users] Submit transaction timeout

Hi,
When using the Python API to submit a large number of jobs (tens of thousands), we encounter the following error:
RuntimeError: Failed to commit and disconnect from queue.
We use the following code to submit jobs in chunks:
    max_jobs_per_sub = htcondor.param['MAX_JOBS_PER_SUBMISSION']
    for itemdata_chunk in common.utils.iter_data_chunks(itemdata, max_jobs_per_sub):
        with schedd.transaction() as txn:
            submit.queue_with_itemdata(txn, 1, iter(itemdata_chunk))
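The helper common.utils.iter_data_chunks is not shown in the message. As a rough sketch only, a chunking helper of this kind might look like the following, assuming it simply yields successive lists of at most max_jobs_per_sub items; the name, signature, and behaviour here are guesses for illustration, not the poster's actual code:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_data_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most chunk_size items (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls up to chunk_size items without materialising the whole input
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Each yielded chunk would then be queued in its own transaction, as in the loop above.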
If we use a smaller chunk size (say 5,000 rather than the default 20,000), we still encounter the error once a certain number of jobs have been submitted (usually around 30-50k).
Looking at the logs, and based on this message thread [www-auth.cs.wisc.edu], it would seem that we’re hitting the schedd’s 20-second transaction timeout. Is there any way of increasing or avoiding this timeout?
The pool and central manager all run on Windows.
Kind regards,
Peet Whittaker
Discipline Lead for DevOps | Principal Software Developer
JBA Consulting, 1 Broughton Park, Old Lane North, Broughton, Skipton, North Yorkshire, BD23 3FD. Telephone: +441756699500