
Re: [HTCondor-users] MAX_CONCURRENT_DOWNLOADS not working?



Hello Angel,

I have not had any time to try to reproduce this, but we did discuss it at our team meeting. It was pointed out that MAX_CONCURRENT_DOWNLOADS only limits downloads handled by HTCondor itself; anything downloaded by the osdf or curl plugins would not be limited by this knob. It was also mentioned that a different knob limits those downloads.

Are you using plugins for downloads?
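For example, something like this should show whether any file-transfer
plugins are configured on your machines (and a submit file that uses
output_destination with a URL is one common way output ends up going
through a plugin rather than HTCondor's built-in transfer):

,----
| $ condor_config_val FILETRANSFER_PLUGINS
`----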

...Tim

On 8/18/25 07:44, Angel de Vicente via HTCondor-users wrote:
Hello,

in an HTCondor pool running version 23.0.21

,----
| $ condor_q --version
| $CondorVersion: 23.0.21 2025-03-19 $
| $CondorPlatform: X86_64-Ubuntu_22.04 $
`----

I was trying to limit the maximum number of concurrent output file
downloads by using this configuration macro:

,----
| MAX_CONCURRENT_DOWNLOADS (https://htcondor.readthedocs.io/en/23.0/admin-manual/configuration-macros.html#index-139)
|
|     This specifies the maximum number of simultaneous transfers of
|     output files from execute machines to the access point. The limit
|     applies to all jobs submitted from the same condor_schedd. The
|     default is 100. A setting of 0 means unlimited transfers. This limit
|     currently does not apply to grid universe jobs, and it also does not
|     apply to streaming output files. When the limit is reached,
|     additional transfers will queue up and wait before proceeding.
`----
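For reference, I set it in a local configuration file on the access
point and ran condor_reconfig afterwards (the file path here is just
illustrative):

,----
| # /etc/condor/config.d/99-transfer.config (illustrative path)
| MAX_CONCURRENT_DOWNLOADS = 3
|
| $ condor_reconfig
`----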

Despite setting it to 3, I can see many jobs transferring output files
(in the example below, 10 jobs):

,----
| $ condor_config_val -dump | grep -i download
| MAX_CONCURRENT_DOWNLOADS = 3
|
| $ condor_q -nobatch
|
|
| -- Schedd: xxxxx.es : <161.72.216.45:9618?... @ 08/18/25 13:04:42
|  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
| 1427.0   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14270.dat
| 1427.1   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14271.dat
| 1427.2   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14272.dat
| 1427.3   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14273.dat
| 1427.4   xxx          8/18 12:58   0+00:06:12 q> 0    0.0 fallocate -l 10G test14274.dat
| 1427.6   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14276.dat
| 1427.7   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14277.dat
| 1427.8   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14278.dat
| 1427.9   xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test14279.dat
| 1427.10  xxx          8/18 12:58   0+00:06:12 q> 0    0.0 fallocate -l 10G test142710.dat
| 1427.11  xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test142711.dat
| 1427.12  xxx          8/18 12:58   0+00:06:12 q> 0    0.0 fallocate -l 10G test142712.dat
| 1427.13  xxx          8/18 12:58   0+00:06:12 q> 0    0.0 fallocate -l 10G test142713.dat
| 1427.14  xxx          8/18 12:58   0+00:06:12  > 0    0.0 fallocate -l 10G test142714.dat
| 1427.15  xxx          8/18 12:58   0+00:06:12 q> 0    0.0 fallocate -l 10G test142715.dat
| 1427.16  xxx          8/18 12:58   0+00:06:12 q> 0    0.0 fallocate -l 10G test142716.dat
`----
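For what it's worth, the schedd's own view of the transfer queue
(assuming it publishes the TransferQueue* statistics) can be inspected
with:

,----
| $ condor_status -schedd -long | grep -i TransferQueue
`----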

Any idea why this is happening?


(Also, once a transfer has started, I would expect that removing the
job via "condor_rm" would stop the transfer and terminate the job, but
this doesn't happen: the output file transfer continues regardless
until it is finished, and the file is only deleted afterwards.)
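For example (job id just for illustration):

,----
| $ condor_rm 1427.0
| $ condor_q -nobatch 1427.0    # job remains in the queue, still transferring
`----

(condor_rm -forcex might force the removal, but I have not checked
whether it actually aborts the in-flight transfer.)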

Cheers,

--
Tim Theisen (he, him, his)
Release Manager
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736