
Re: [HTCondor-users] Migrating to htcondor2 -> pypi LTS supported version?



Hi Todd,

Thank you for the explanation.
We will change things accordingly to have consistent dictionary keys when using itemdata.
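Concretely, we plan to do something along these lines (a rough sketch;
the dictionary contents here are made up):

# Hypothetical per-job dictionaries with inconsistent key sets.
job1 = {'My.Foo': '1'}
job2 = {'My.Foo': '2', 'My.DESIRED_ExtraMatchRequirements': 'SomeExpr'}

itemdata_dicts = [job1, job2]

# Normalize every dictionary to the union of all keys, so the first
# dictionary's key set covers everything submit() will look at.
all_keys = set().union(*(d.keys() for d in itemdata_dicts))
itemdata_dicts = [{k: d.get(k, '') for k in all_keys} for d in itemdata_dicts]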

As for why this behavior: CMS sometimes submits jobs from different workflows in the same submit() call, but with similar requirements, in an effort to reduce the number of autoclusters per schedd (although adding 'My.DESIRED_ExtraMatchRequirements' kind of breaks that, so I will make a note for the team). We also end up with a single submit call per submission cycle, but I think that is less relevant. Could you confirm whether this approach makes sense, or let us know if it doesn't actually help with autoclusters?

I am adding Marco, Antonio, and Alan to the thread in case they want to expand on (or correct) the above.

Best regards,
Kenyi


On Mon, Oct 27, 2025 at 4:28 PM Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:
> The bug seems to be related to the fact we have 2 jobs, one with
> DESIRED_ExtraMatchRequirements, and another without it.

    The documentation* states that when itemdata is supplied as an
iterator over dictionaries, only the first dictionary's keys will be used.
That's because itemdata is intended to be a Pythonic way of duplicating
the QUEUE FROM (et alia) submit-language commands, and this was felt to be
less user-hostile than refusing the submit because you were handing us
dictionaries with extra data in them.
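    For illustration, a minimal sketch of that model (the submit
description and dictionary contents are made up, and this assumes a
reachable local schedd):

import htcondor2 as htcondor   # aliased to match the snippets below

sub = htcondor.Submit("""
    executable = test.sh
    """)

# One proc per dictionary, as with QUEUE FROM; only the first
# dictionary's keys ('My.Foo' here) define the variable set, and the
# documented behavior is to silently ignore job2's extra 'My.Bar'.
job1 = {'My.Foo': '1'}
job2 = {'My.Foo': '2', 'My.Bar': '3'}
htcondor.Schedd().submit(sub, itemdata=iter([job1, job2]))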

    Thanks for the test case, though: it revealed a bug, in that the
extra attribute in `job2` should have been silently ignored per the
`htcondor2.Schedd.submit()` documentation, but was not. I have opened a
ticket, HTCONDOR-3351, to fix this.

    Unless there's a good reason for cramming both jobs into a single
cluster, I'd recommend just passing the dictionaries involved to the
Submit object directly:

sub1 = htcondor.Submit("""
    universe = vanilla
    executable = test.sh
    output = out.$(Cluster)-$(Process)
    log = log.$(Cluster).log
    """,
    **job1
)

sub2 = htcondor.Submit("""
    universe = vanilla
    executable = test.sh
    output = out.$(Cluster)-$(Process)
    log = log.$(Cluster).log
    """,
    # If job2 defines every key in job1, you don't need this line.
    **job1,
    **job2,
)
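    Each one then becomes its own cluster (a sketch, again assuming a
local schedd):

schedd = htcondor.Schedd()
result1 = schedd.submit(sub1)   # first cluster
result2 = schedd.submit(sub2)   # second cluster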

    If there is a good reason to keep them in one cluster, you just need
to have all the keys be in the first dictionary, which you can ensure
programmatically:

# Pad job1 with any keys that appear only in job2, so job1's key set
# covers everything.
missing_keys = {x: None for x in job2.keys() - job1.keys()}
job1.update(missing_keys)

or in this specific case by adding

'My.DESIRED_ExtraMatchRequirements': '',

to the `job1` dictionary.
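    Either way, both procs then go through a single submit call (a
sketch; `sub` here stands in for the shared Submit object):

schedd = htcondor.Schedd()
schedd.submit(sub, itemdata=iter([job1, job2]))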

    If you could share more about what you're trying to accomplish by
having the two procs be in the same cluster, we might be able to help you
better.

-- ToddM

*: https://htcondor.readthedocs.io/en/latest/apis/python-bindings/api/version2/htcondor2/schedd.html#htcondor2.Schedd.submit