Re: [HTCondor-users] condor_adstash behaviour when dealing with big amounts of jobs

Hi Jason,

Thanks a lot for the very detailed explanation.

The only backfilling method that definitely works with adstash right now is to use the --ad_file option pointed to the history files on disk. I think there are opportunities to try to do better both with remote history calls (e.g. by allowing custom constraint and since expressions rather than only looking at the checkpoint file) and reading from history files (e.g. also by allowing custom constraints), so I've opened a ticket and will be working on a design there: https://opensciencegrid.atlassian.net/browse/HTCONDOR-3793

OK, I can consider the --ad_file option as a workaround in this case. In any case, it would be good to have a --since mechanism. Thanks!

For some background, the Schedd.history method ( https://htcondor.readthedocs.io/en/latest/apis/python-bindings/api/version2/htcondor2/schedd.html#htcondor2.Schedd.history ) opens a connection to the condor_schedd, the schedd forks a condor_history child process, and the history process then reads the flat history file(s) on disk in *reverse chronological order*, returning ads one by one until:
1. an optionally set "match" number of ads are *returned*,
2. an optionally set "since" expression becomes true, or
3. HISTORY_HELPER_MAX_HISTORY (configured on the access point) number of ads are *read*.
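
The three stop conditions above can be modelled with a short sketch. This is a toy model in plain Python, not the condor_history implementation: scan_history and its parameters are hypothetical names, ads are plain dicts, and it assumes that the ad which makes the "since" expression true is itself not returned.

```python
def scan_history(ads_newest_first, constraint=lambda ad: True,
                 match=None, since=None, max_read=None):
    """Toy model of a reverse-chronological history scan.

    Returns (matched_ads, number_of_ads_read).
    """
    matched = []
    read = 0
    for ad in ads_newest_first:
        # Condition 3: cap on the number of ads *read*.
        if max_read is not None and read >= max_read:
            break
        read += 1
        # Condition 2: the "since" expression becomes true (assumed exclusive).
        if since is not None and since(ad):
            break
        if constraint(ad):
            matched.append(ad)
            # Condition 1: cap on the number of ads *returned*.
            if match is not None and len(matched) >= match:
                break
    return matched, read


# Ten ads, newest first (ClusterId 10 down to 1).
ads = [{"ClusterId": i} for i in range(10, 0, -1)]

# A constraint alone never stops the scan: all 10 ads are read to find one.
only_oldest, read_all = scan_history(ads, constraint=lambda ad: ad["ClusterId"] == 1)

# A "since" expression lets the scan exit early.
recent, read_since = scan_history(ads, since=lambda ad: ad["ClusterId"] <= 7)
```

In this model the constraint-only call reads every ad, while the "since" call stops after four, which is the difference that matters when backfilling.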

In my config:

ADSTASH_SCHEDD_HISTORY_MAX_ADS = 2000000000

And in the schedds:

HISTORY_HELPER_MAX_HISTORY = 2000000000

Note how there's no "4. HISTORY_TIMEOUT is reached"... because there is no condor_history/Schedd.history timeout setting. To clarify, the timeout options in adstash control how long the parent adstash process waits for history to be processed for a schedd; they do not control the timeout for the individual Schedd.history call because there is (unfortunately) no setting for that. Does the timeout behavior you were seeing match this understanding, i.e. were you always hitting the timeout you configured no matter how long you set it for?

I was hitting the timeout defined in --schedd_history_timeout.

I have installed the condor_adstash nodes from scratch, which means there's no checkpoint file. For the big schedds it times out. I guess I will have to set a limit or it will never manage to query them. For smaller schedds, however, I see it also hits some limit on the number of records read:

[root@adstash-prod-ce-01 ~]# more /var/log/condor/condor_adstash/condor_adstash.log | grep count
WARNING:root:Schedd ce-test05.cern.ch history: response count: 2176; upload time: 0.09 min
WARNING:root:Schedd ce503.cern.ch history: response count: 50000; upload time: 0.76 min
WARNING:root:Schedd ce504.cern.ch history: response count: 50000; upload time: 0.75 min
WARNING:root:Schedd ce507.cern.ch history: response count: 50000; upload time: 0.78 min
WARNING:root:Schedd ce509.cern.ch history: response count: 50000; upload time: 0.76 min
WARNING:root:Schedd ce510.cern.ch history: response count: 50000; upload time: 0.82 min
WARNING:root:Schedd ce513.cern.ch history: response count: 50000; upload time: 0.76 min
WARNING:root:Schedd ce514.cern.ch history: response count: 50000; upload time: 0.77 min
WARNING:root:Schedd ce515.cern.ch history: response count: 50000; upload time: 0.74 min

Do you know where this 50K limit could come from? I haven't defined this myself.

The "scan every ad in reverse order", I/O-intensive behavior of condor_history is what makes backfilling more than a few hours on busy schedds extremely slow. Even if you provide a perfect constraint expression to capture missing ads, condor_history still has to read through the entire history on disk to figure out which ads match that constraint. (This is why adstash writes out checkpoints to generate "since" expressions per schedd, so condor_history knows when it can exit early.) Nothing is (explicitly) cached in memory between condor_history calls. The HTCSS team has started to address this with the new Archive Librarian feature that puts some per-ad metadata for each ad in the history file in a SQLite database, but this currently only works for the current history file, not rotated history files: https://htcondor.readthedocs.io/en/latest/admin-manual/ap-policy-configuration.html#archive-librarian

I know this is not a very satisfying answer, but the current design of the history files and reading methods makes tackling the backfill problem difficult.

OK, thanks for the details.

To add one more thing to think about... do you rotate your OpenSearch indexes (e.g. with ILM policies)? If so, if we were to add a backfilling feature, I'm still not sure how to prevent duplicate ads after an index rotation occurs.

Yes, I have monthly indexes. OpenSearch colleagues have told me that if you specify a document ID (_id) that already exists, OpenSearch will overwrite the existing document with the new one. If you don't specify an ID, OpenSearch will automatically generate a new one, which means it will create a new document even if the content is the same as an existing document. My understanding is that condor_adstash always generates an ID based on {GlobalJobId}#{RecordTime}, so it will overwrite. It's not clear to me whether this applies only at the index level. I will ask.
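
To make the open question concrete, here is a minimal sketch of what would happen if document IDs are only unique within a single index (an assumption still to be confirmed, as noted above). It does not talk to OpenSearch: each index is modelled as a plain dict keyed by _id, and the index names, the helper names and the example GlobalJobId value are all hypothetical.

```python
def doc_id(ad):
    # ID scheme as described above: {GlobalJobId}#{RecordTime}
    return f"{ad['GlobalJobId']}#{ad['RecordTime']}"


def index_ad(indexes, index_name, ad):
    # Writing an existing _id into the same index overwrites the document.
    indexes.setdefault(index_name, {})[doc_id(ad)] = ad


indexes = {}
ad = {"GlobalJobId": "ce503.cern.ch#1234.0#1700000000", "RecordTime": 1700003600}

index_ad(indexes, "htcondor-2024-01", ad)  # original upload
index_ad(indexes, "htcondor-2024-01", ad)  # backfill, same index: overwritten
index_ad(indexes, "htcondor-2024-02", ad)  # backfill after rotation: second copy

copies = sum(doc_id(ad) in docs for docs in indexes.values())
```

Under that assumption, a backfill into the same monthly index is harmless, but a backfill that lands in a newer index leaves two copies of the same ad.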

Regards,

Maria