Re: [HTCondor-users] Slow schedd history queries in python


On Thu, May 22, 2025 at 8:45 AM Jason Patton <jpatton@xxxxxxxxxxx> wrote:

Hi JT,

When using a constraint expression with history, the expression is applied to every job ad in the history. As you've probably found out with the CLI tool, using "-completedsince" speeds things up by stopping the search as soon as the history tool finds a job that completed before that date. The Schedd.history() function has a similar argument called "since": you pass it an expression, and the search ends early as soon as that expression becomes true:

$ python3 -m timeit "import time, htcondor2; s = htcondor2.Schedd(); list(s.history(constraint=f'CompletionDate > {time.time() - 3600}'))"
1 loop, best of 5: 3.7 sec per loop

$ python3 -m timeit "import time, htcondor2, classad2; s = htcondor2.Schedd(); list(s.history(since=classad2.ExprTree(f'CompletionDate > {time.time() - 3600}')))"
20 loops, best of 5: 8.26 msec per loop

(Note that calling Schedd.history() returns a generator, and the history call doesn't actually happen until it is consumed, e.g. using list() on it or using it in a loop.)
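
A minimal sketch of the "since" approach combined with a projection, for anyone who wants something to paste into a script. The helper names cutoff_expr and recent_history are hypothetical and not part of the bindings. One assumption to flag: because the search ends once the expression becomes true, and assuming the history is read newest-first, the sketch writes the expression so that it becomes true at the first job that completed at or before the cutoff (CompletionDate <= cutoff). It is worth comparing the result against condor_history -completedsince on your own schedd before relying on it.

```python
import time


def cutoff_expr(seconds_ago, now=None):
    """Return ClassAd expression text that is true for jobs that
    completed at or before (now - seconds_ago)."""
    now = time.time() if now is None else now
    return f"CompletionDate <= {int(now - seconds_ago)}"


def recent_history(seconds_ago, projection):
    """Return a list of history ads for recently completed jobs.

    Hypothetical helper. Assumes the htcondor2 and classad2 bindings
    are installed and a schedd is reachable from this machine.
    """
    # Imported here so cutoff_expr() works without HTCondor installed.
    import classad2
    import htcondor2

    schedd = htcondor2.Schedd()
    since = classad2.ExprTree(cutoff_expr(seconds_ago))
    # history() is lazy; list() is what actually runs the query.
    return list(schedd.history(projection=projection, since=since))
```

Calling recent_history(600, ["ClusterId", "ProcId", "Owner", "CompletionDate"]) would then cover roughly the same ten-minute window as the query in the original question.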

Jason

On Thu, May 22, 2025 at 8:40 AM Jeff Templon <templon@xxxxxxxxx> wrote:
Addendum:

Is there a way to make this
similarly fast as the command line version?

should have made explicit that I meant, can the schedd.history() usage be made as fast as the command line version.

Of course, running the unix command under the hood is always a possibility.
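
For completeness, a rough sketch of that fallback: shelling out to condor_history and parsing its output. The function names history_command and completed_since are hypothetical. It assumes condor_history is on the PATH and that its -json option prints the matching job ads as a single JSON array.

```python
import json
import subprocess
import time


def history_command(cutoff):
    """Build the argv list for condor_history (no shell involved)."""
    return ["condor_history", "-completedsince", str(int(cutoff)), "-json"]


def completed_since(seconds_ago):
    """Run condor_history and return its job ads as a list of dicts."""
    cutoff = time.time() - seconds_ago
    result = subprocess.run(
        history_command(cutoff), capture_output=True, text=True, check=True
    )
    # Assumption: with -json the tool prints one JSON array, or nothing
    # at all when no job matches.
    output = result.stdout.strip()
    return json.loads(output) if output else []
```

Passing the arguments as a list avoids the shell quoting needed in the $(date ...) form of the command.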

JT

On 22 May 2025, at 15:30, Jeff Templon <templon@xxxxxxxxx> wrote:

Hi,

This query:

condor_history -completedsince $(date -d "10 minutes ago" +"%s")

Takes on the order of a tenth of a second from the unix command line.

This bit of code:

shist_iter = schedd.history(
    constraint='CompletionDate > %d' % (now - 600),
    projection=[
        "ProcId",
        "ClusterId",
        "JobStatus",
        "AccountingGroup",
        "AcctGroup",
        "Owner",
        "CompletionDate",
        "CpusProvisioned",
        "JobCurrentStartDate",
        "JobCategory", "JobUniverse"
    ]
)

print("schedd hist query done")

# unfortunately, schedd.history returns a HistoryIterator, unlike Schedd.query which returns a list. The lines below
# make lists corresponding to the contents of the HistoryIterators, so we can use them more than once.

print("start iter to list")

shist = list()
for ad in shist_iter:
    shist.append(ad)

print("done iter to list")

Is similarly quick until the "for ad in ..." bit, which takes order 7 seconds, even though there are only 160 records in that query. Is there a way to make this
similarly fast as the command line version?

JT

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

Join us in June at Throughput Computing 25: https://osg-htc.org/htc25

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/
