
Re: [HTCondor-users] Slow schedd history queries in python



Hi,

Thanks to both of you guys (Jason). I had looked at the bindings tutorial but not the API reference. I also see that in the htcondor2 bindings, history now returns a list instead of an iterator, which makes my code even simpler, so I will switch.

JT



On 22 May 2025, at 23:41, John M Knoeller via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

To get similar performance, you need to use the since argument of python history rather than (or in addition to) the constraint argument.

constraint selects which records to return
since controls when to stop scanning the history file. 
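
As a rough sketch of that distinction (not a tested recipe), the two arguments can be combined as below. The helper names recent_history_args and fetch_recent, the shortened projection list, and the exact since expression are illustrative assumptions. In particular, the "CompletionDate > 0" guard assumes that some records may carry a zero CompletionDate and should not end the scan early, and the approach assumes the history file is scanned newest first.

```python
import time

# Shortened, illustrative projection; use whatever attributes you need.
PROJECTION = ["ClusterId", "ProcId", "Owner", "JobStatus", "CompletionDate"]


def recent_history_args(window_seconds, now=None):
    """Build keyword arguments for Schedd.history() covering the last window."""
    if now is None:
        now = int(time.time())
    cutoff = int(now) - int(window_seconds)
    return {
        # constraint selects which records are returned.
        "constraint": "CompletionDate > %d" % cutoff,
        # since tells the scan when to stop: at the first record matching
        # this expression. Assumption: records are read newest first, so the
        # first job completed at or before the cutoff ends the scan.
        "since": "CompletionDate > 0 && CompletionDate <= %d" % cutoff,
        "projection": PROJECTION,
    }


def fetch_recent(schedd, window_seconds=600, now=None):
    """Return matching history ads as a list (works for iterator or list results)."""
    return list(schedd.history(**recent_history_args(window_seconds, now)))
```

With the bindings installed, this would be called as fetch_recent(htcondor.Schedd()). Wrapping the result in list() is harmless whether history returns an iterator or a list.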

-tj



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jeff Templon <templon@xxxxxxxxx>
Sent: Thursday, May 22, 2025 8:39 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Slow schedd history queries in python 

Addendum:

Is there a way to make this
similarly fast as the command line version?

I should have made explicit that I meant: can the schedd.history() usage be made as fast as the command line version?

Of course, running the unix command under the hood is always a possibility.
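
For reference, the "unix command under the hood" fallback might look roughly like the sketch below. The helper names are hypothetical. It assumes condor_history is on the PATH and that its -json output is a single JSON array, or empty when nothing matches.

```python
import json
import subprocess
import time


def completed_since_command(window_seconds, now=None):
    """Build the condor_history argument list for jobs completed in the window."""
    if now is None:
        now = int(time.time())
    cutoff = int(now) - int(window_seconds)
    return ["condor_history", "-completedsince", str(cutoff), "-json"]


def parse_history_json(text):
    """Parse condor_history -json output; treat empty output as no records."""
    text = text.strip()
    return json.loads(text) if text else []


def recent_history_cli(window_seconds=600):
    """Run condor_history and return the job ads as a list of dicts."""
    result = subprocess.run(
        completed_since_command(window_seconds),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_history_json(result.stdout)
```

This trades the bindings for a subprocess call and JSON parsing, so attribute values arrive as plain Python types rather than ClassAd objects.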

JT



On 22 May 2025, at 15:30, Jeff Templon <templon@xxxxxxxxx> wrote:

Hi,

This query:

condor_history -completedsince $(date -d "10 minutes ago" +"%s")

Takes on the order of a tenth of a second from the unix command line.

This bit of code:


shist_iter = schedd.history(
    constraint='CompletionDate > %d' % (now - 600),
    projection=[
        "ProcId",
        "ClusterId",
        "JobStatus",
        "AccountingGroup",
        "AcctGroup",
        "Owner",
        "CompletionDate",
        "CpusProvisioned",
        "JobCurrentStartDate",
        "JobCategory", "JobUniverse"
    ]
)

print("schedd hist query done")

# unfortunately, schedd.history returns a HistoryIterator, unlike Schedd.query which returns a list. The lines below
# make lists corresponding to the contents of the HistoryIterators, so we can use them more than once.

print("start iter to list")

shist = list()
for ad in shist_iter:
    shist.append(ad)

print("done iter to list")


Is similarly quick until the "for ad in ..." bit, which takes on the order of 7 seconds, even though there are only 160 records in that query. Is there a way to make this
similarly fast as the command line version?

JT



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

Join us in June at Throughput Computing 25: https://osg-htc.org/htc25

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/
