
Re: [HTCondor-users] [External] Re: Concatenating history files?



Greg,

 

Thanks for the container suggestion! That could work! However, I'm dealing with a classified air-gapped network, so getting Docker or one of its cousins up and running would invoke a panoply of paperwork, not to mention approvals for the newer version of HTCondor.

 

Stefano,

 

Thanks very much for that Python suggestion! I'll take a look and see if it might be able to do what I need. I've got about 60 gigabytes of history data when all is said and done, so speed is a significant consideration. I'd probably want to pickle the dict for multiple runs.
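
A minimal sketch of the pickling idea, assuming the parsed history fits in memory as a plain dict. The helper name load_or_build, the cache file name, and the sample job data are hypothetical and not part of HTCondor:

```python
import os
import pickle
import tempfile

def load_or_build(cache_path, build):
    """Return the cached object if the pickle exists, else build and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    data = build()
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return data

# Demonstration with a stand-in for the expensive parse step (hypothetical data).
calls = []

def build():
    calls.append(1)
    return {"101.0": {"Cmd": '"/bin/sleep"'}}

with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "jobs.pickle")
    first = load_or_build(cache, build)   # parses and writes the pickle
    second = load_or_build(cache, build)  # reads the pickle, skips the parse
```

The second call never invokes build(), which is the saving that matters when the parse step covers tens of gigabytes.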

 

One of the tasks for this history is to categorize the jobs based on the JobDescription and/or Cmd/Arguments, and I was thinking of using an IfThenElse() expression to apply categories based on a collection of regexps, but I have a feeling that might take many, many hours to run. I'll do some testing and see how it goes.
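
If the ads end up in Python dicts anyway, the categorization could be done there instead of in a ClassAd IfThenElse() expression. This is a swapped-in technique, not the ClassAd approach described above. The sketch below assumes each ad is a dict of raw attribute strings; the category names and patterns are made up for illustration:

```python
import re

# Hypothetical categories and patterns. The first match wins, much like a
# chain of nested IfThenElse() calls. Patterns are compiled once, up front.
CATEGORY_PATTERNS = [
    ("simulation", re.compile(r"sim|montecarlo", re.IGNORECASE)),
    ("postprocess", re.compile(r"merge|reduce", re.IGNORECASE)),
]

def categorize(ad, default="other"):
    """Return the first category whose pattern matches the job's text fields."""
    text = " ".join(ad.get(key, "") for key in ("JobDescription", "Cmd", "Arguments"))
    for name, pattern in CATEGORY_PATTERNS:
        if pattern.search(text):
            return name
    return default

example = categorize({"Cmd": '"/opt/bin/run_sim.sh"', "Arguments": '"--events 100"'})
```

Whether this is actually faster than evaluating the expression inside condor_history would need to be measured on the real data.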

 

It looks like the one-liner would only grab a single job out of the file, the last one it finds. I'll tinker with it to see if I can build out a dict of arrays or something like that, making sure that the index within each attribute key's array lines up with the other keys. Or maybe read each ad one at a time, looking for a blank line or *** line, and stash the whole attribute dict for the job in question in an outer dict under a "ClusterId.ProcId" key.
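
A rough sketch of that second idea. The one-liner keeps only the last job because dict() overwrites a key each time it reappears, so the version below starts a fresh dict at every blank line or *** line. The function names and sample text are hypothetical, values are kept as raw strings, and the assumption that every ad carries ClusterId and ProcId attributes has not been checked against real version 8 history files:

```python
import io

def iter_job_ads(lines):
    """Yield one {attribute: raw value string} dict per job ad."""
    ad = {}
    for line in lines:
        line = line.strip()
        # A blank line or a "***" banner line closes the current ad.
        if not line or line.startswith("***"):
            if ad:
                yield ad
                ad = {}
            continue
        # Split on the first "=" only, so expressions containing "==" survive.
        key, sep, value = line.partition("=")
        if sep:
            ad[key.strip()] = value.strip()
    if ad:
        yield ad

def load_history(lines):
    """Build an outer dict keyed by 'ClusterId.ProcId'."""
    jobs = {}
    for ad in iter_job_ads(lines):
        job_id = "{}.{}".format(ad.get("ClusterId", "?"), ad.get("ProcId", "?"))
        jobs[job_id] = ad
    return jobs

# Made-up sample in the "Attr = value" layout, two ads separated by *** lines.
sample = io.StringIO(
    'ClusterId = 101\n'
    'ProcId = 0\n'
    'Cmd = "/bin/sleep"\n'
    '*** Offset = 0 ClusterId = 101 ProcId = 0\n'
    'ClusterId = 101\n'
    'ProcId = 1\n'
    'Cmd = "/bin/true"\n'
    '*** Offset = 64 ClusterId = 101 ProcId = 1\n'
)
jobs = load_history(sample)
```

Because iter_job_ads is a generator, it can be fed an open file object directly, so a multi-gigabyte file is read line by line rather than with readlines().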

 

Thanks again! I'll let the group know how it goes.

 

Michael Pelletier

Principal Technologist

High Performance Computing

Classified Infrastructure Services

 

C: +1 339.293.9149
michael.v.pelletier@xxxxxxx

 

From: Stefano Dal Pra <stefano.dalpra@xxxxxxxxxxxx>
Sent: Thursday, September 12, 2024 5:24 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Pelletier, Michael V RTX <Michael.V.Pelletier@xxxxxxx>
Subject: [External] Re: [HTCondor-users] Concatenating history files?

 

I had to do something similar years ago and tried two ways:
1) condor_q -jobads history.<ClusterId>.<ProcId> -af:j '<classad_functions to extract what i need>'

2) load the hist.file into a python dict and process it; this can be done with a one liner:
dict([map(str.strip, x.split('=',1)) for x in f.readlines()])

then extract what you need.

Solution 1 was more appealing to me, but turned out to be much slower (probably due to overhead in loading parsers, which is done once per file).
Solution 2 was pretty fast in my use case: extracting fields of interest and loading them into a Postgres table using a COPY statement.

Stefano

On 11/09/24 20:16, Pelletier, Michael V RTX via HTCondor-users wrote:

Thanks very much for the tip, Cole!

 

My trouble is that we're still on version 8, and since we're drawing down the cluster in question there's no funding to address an upgrade to version 10 or later. Sorry, I should have specified a version in my original message. Any alternatives available in v8? I'm thinking maybe not since the -search option may not have been introduced as a new feature.

 

A for loop with multiple invocations of condor_history -file should do the trick if that's the only avenue available in the outdated release.
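
One way such a loop might look, sketched in Python 3 rather than shell. The helper names, the glob pattern, and the output path are hypothetical; the -long flag is an assumption about what output is wanted, and sorting the files by name is an assumption about ordering, not something condor_history does itself:

```python
import glob
import subprocess

def history_commands(paths, extra_args=("-long",)):
    """Build one condor_history command line per history file, sorted by name."""
    return [["condor_history", "-file", path, *extra_args]
            for path in sorted(paths)]

def run_all(pattern, out_path):
    """Run condor_history once per matching file, collecting all output in one file."""
    with open(out_path, "w") as out:
        for cmd in history_commands(glob.glob(pattern)):
            subprocess.run(cmd, stdout=out, check=True)

# Example of the command lines that would be run (nothing is executed here).
cmds = history_commands(["/data/history.20240911092145", "/data/history"])
```

A call such as run_all("/data/history*", "/data/all_jobs.txt") would then invoke condor_history once per file, so the per-invocation startup cost is still paid for each rotated file.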

 

Michael Pelletier

Principal Technologist

High Performance Computing

Classified Infrastructure Services

 

C: +1 339.293.9149
michael.v.pelletier@xxxxxxx

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Cole Bollig via HTCondor-users
Sent: Wednesday, September 11, 2024 10:24 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Cole Bollig <cabollig@xxxxxxxx>
Subject: [External] Re: [HTCondor-users] Concatenating history files?

 

Hi Michael,

 

Since version 10.3.0, you can do condor_history -search /path/to/filename. This will find and read (in the correct order) all matching timestamp-rotated history files, so in this example the following files would be parsed by condor_history:

  1. /path/to/filename
  2. /path/to/filename.20240911092145
  3. /path/to/filename.20240825155501

 

Cheers,

Cole Bollig


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Pelletier, Michael V RTX via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Wednesday, September 11, 2024 9:10 AM
To: HTCondor-Users Mail List (htcondor-users@xxxxxxxxxxx) <htcondor-users@xxxxxxxxxxx>
Cc: Pelletier, Michael V RTX <Michael.V.Pelletier@xxxxxxx>
Subject: [HTCondor-users] Concatenating history files?

 

Hi folks,

 

I've got a huge amount of job history I'm trying to go through and summarize/categorize, to the tune of many gigabytes, and as you might expect it's divided into a collection of rotated files with the usual timestamps.

 

I'm trying to use the -file option, so that it doesn't bother the server or suffer the constraints of a network connection, and can work directly from a local filesystem where I've stashed the files.

 

Is there a way to enable condor_history to scan all the files in one fell swoop, rather than going through them one at a time with separate condor_history -file commands? I tried concatenating the files but it looks like the last line in each file has some metadata that condor_history pays attention to.

 

Thanks for any suggestions!

 

Michael V Pelletier

Principal Technologist

High Performance Computing

Classified Infrastructure Services



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
 
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/