
Re: [HTCondor-users] Slow schedd history queries in python



To get similar performance, you need to use the since argument of python history rather than (or in addition to) the constraint argument. 

constraint selects which records to return
since controls when to stop scanning the history file. 
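
A minimal sketch of how the two arguments might be combined for the "last 10 minutes" case. The helper names recent_history_kwargs and recent_history are hypothetical, and the since expression assumes records are written to the history file roughly in completion order; a record with an unset or zero CompletionDate could affect where the scan stops, so check this against your own history file.

```python
import time


def recent_history_kwargs(window_seconds, projection, now=None):
    """Build keyword arguments for Schedd.history() covering the last window_seconds.

    Hypothetical helper, not part of the HTCondor bindings.
    """
    if now is None:
        now = int(time.time())
    cutoff = int(now) - int(window_seconds)
    return {
        # constraint: selects which records to return
        "constraint": "CompletionDate > %d" % cutoff,
        "projection": list(projection),
        # since: scanning stops at the first record for which this is true
        # (assumption: older records lie beyond that point in the file)
        "since": "CompletionDate <= %d" % cutoff,
    }


def recent_history(schedd, window_seconds, projection, now=None):
    """Run the history query and return the ads as a list rather than an iterator."""
    kwargs = recent_history_kwargs(window_seconds, projection, now=now)
    return list(schedd.history(**kwargs))


# Usage (assumes the HTCondor Python bindings are installed):
#   import htcondor
#   ads = recent_history(htcondor.Schedd(), 600, ["ClusterId", "ProcId", "Owner"])
```

Keeping the constraint alongside since is deliberate: since only decides where the scan ends, so the constraint still filters whatever is read before that point.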

-tj



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jeff Templon <templon@xxxxxxxxx>
Sent: Thursday, May 22, 2025 8:39 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Slow schedd history queries in python

Addendum:

Is there a way to make this
similarly fast as the command line version?

should have made explicit that I meant, can the schedd.history() usage be made as fast as the command line version.

Of course, running the unix command under the hood is always a possibility.
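
For reference, a rough sketch of that fallback. It assumes condor_history is on the PATH and that its -json output option is available in the installed version; the function names are hypothetical.

```python
import json
import subprocess
import time


def completedsince_command(window_seconds, now=None):
    """Build the condor_history argv for jobs completed in the last window_seconds."""
    if now is None:
        now = int(time.time())
    cutoff = int(now) - int(window_seconds)
    # -completedsince is the option used on the command line in this thread;
    # -json is assumed to be supported by the installed condor_history.
    return ["condor_history", "-completedsince", str(cutoff), "-json"]


def parse_history_json(text):
    """Parse condor_history JSON output; empty output is treated as no records."""
    text = text.strip()
    return json.loads(text) if text else []


def history_via_cli(window_seconds):
    """Run condor_history and return its records as a list of dicts."""
    result = subprocess.run(
        completedsince_command(window_seconds),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if condor_history fails
    )
    return parse_history_json(result.stdout)
```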

JT



On 22 May 2025, at 15:30, Jeff Templon <templon@xxxxxxxxx> wrote:

Hi,

This query:

condor_history -completedsince $(date -d "10 minutes ago" +"%s")

Takes on the order of a tenth of a second from the unix command line.

This bit of code:


shist_iter = schedd.history(
    constraint='CompletionDate > %d' % (now - 600),
    projection=[
        "ProcId",
        "ClusterId",
        "JobStatus",
        "AccountingGroup",
        "AcctGroup",
        "Owner",
        "CompletionDate",
        "CpusProvisioned",
        "JobCurrentStartDate",
        "JobCategory",
        "JobUniverse",
    ]
)

print("schedd hist query done")

# unfortunately, schedd.history returns a HistoryIterator, unlike Schedd.query which returns a list. The lines below
# make lists corresponding to the contents of the HistoryIterators, so we can use them more than once.

print("start iter to list")

shist = list()
for ad in shist_iter:
    shist.append(ad)

print("done iter to list")


This is similarly quick until the “for ad in …” loop, which takes on the order of 7 seconds, even though the query returns only 160 records. Is there a way to make this as fast as the command line version?

JT



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

Join us in June at Throughput Computing 25: https://osg-htc.org/htc25

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/