Martin, I greatly appreciate the details of your configuration. Along with reviewing the docs you provided, I'm exploring MPI-related links found at https://htcondor-wiki.cs.wisc.edu/index.cgi/wikitoc
Greg, This is our setup - I think this means we need a parallel universe, but please let me know if vanilla would work, too.
We will have a cluster of 8 compute nodes, each with dual 16-core HT CPUs (for a total of 256 slots) and an NVidia RTX A4000 GPU. We have two primary (Windows) applications to run on these systems: one uses the GPUs, the other uses MPI. Thanks, Sam

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Beaumont, Martin <Martin.Beaumont@xxxxxxxxxxxxxxx>
Sent: Monday, September 11, 2023 8:57:58 AM
To: HTCondor-Users Mail List
Subject: [EXTERNAL] Re: [HTCondor-users] MPI on Windows

Hello Sam,
I’ve got some experience with MPI jobs using HTCondor, but only with Linux. We haven’t had a requirement for Windows so far (thankfully).
If you haven’t already, you should probably read the documentation for MPI jobs:
Here’s how I personally configure my clusters. (most basic settings only)
Central manager server (all-in-one): /etc/condor/config.d/01-cm.config
# Common configuration (MASTER)
CONDOR_HOST = $(hostname --short)
ALLOW_DAEMON = $NET_INT_PREFIX.*
# Configure host for central management (COLLECTOR, NEGOTIATOR)
use ROLE: get_htcondor_central_manager
# Configure host for submission of jobs (SCHEDD)
use ROLE: get_htcondor_submit
# Enable partitionable slot preemption
ALLOW_PSLOT_PREEMPTION = True
# Speed up reclaiming of unused slots
UNUSED_CLAIM_TIMEOUT = 20
And for the execute nodes (compute servers): /etc/condor/config.d/02-role-execute.config
# Common configuration (MASTER)
CONDOR_HOST = $(hostname --short)
ALLOW_DAEMON = $NET_INT_PREFIX.*
# Configure host for jobs execution (STARTD)
use ROLE: get_htcondor_execute
# Link node to central manager
UID_DOMAIN = $(hostname --short)
TRUST_UID_DOMAIN = TRUE
# Prioritize parallel jobs over serial
DedicatedScheduler = "DedicatedScheduler@$(hostname --short)"
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler
START = True
SUSPEND = False
CONTINUE = True
PREEMPT = False
KILL = False
WANT_SUSPEND = False
WANT_VACATE = False
RANK = Scheduler =?= $(DedicatedScheduler)
# Activate dynamic slot configuration and slot partitioning
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = auto
SLOT_TYPE_1_PARTITIONABLE = True
Replace $(hostname --short) with the network name of your central manager (CM). In my setup, $NET_INT_PREFIX is the first two octets of the IP range of the dedicated local network between the central manager and the execute nodes. I use IDTOKEN security. https://htcondor.readthedocs.io/en/latest/admin-manual/security.html#highlights-of-new-features-in-version-9-0-0
This way, both MPI and serial jobs can be submitted and executed across all nodes, with MPI jobs prioritized (as in, they can’t be bumped during preemption), and with the CM releasing claimed dynamic partitioned slots that stay idle for more than 20 seconds. There might be better ways to configure this, but it gets the job done. :)
As for submit files and wrappers, they are unique to each R&D application we use, and I’ve only used Open MPI so far. My wrappers are modified versions of the openmpiscript example. I haven’t tried the MPICH examples (mp1script, mp2script). I don’t think there’s an example file for MPI for Windows. If you don’t run jobs across multiple execute nodes, then as Greg mentioned, the vanilla universe might be simpler with MPI for Windows. (The vanilla universe does not accept machine_count in submit files.)
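For what it’s worth, here is a minimal sketch of what a parallel-universe submit file for an Open MPI job can look like, modeled on the openmpiscript example from the HTCondor docs. The application name, file names, and counts are placeholders, not anything from our actual setup:

```
# Sketch: parallel-universe submit file for an Open MPI job (Linux).
# "my_mpi_app" and its arguments are placeholders.
universe                = parallel
executable              = openmpiscript
arguments               = my_mpi_app arg1 arg2
machine_count           = 4
request_cpus            = 8
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = my_mpi_app
output                  = out.$(NODE)
error                   = err.$(NODE)
log                     = mpi.log
queue
```

With machine_count = 4 and request_cpus = 8, the dedicated scheduler claims four slots of eight cores each and openmpiscript bootstraps the MPI ranks across them.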
Martin
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Greg Thain via HTCondor-users
On 9/8/23 17:51, Sam.Dana@xxxxxxxxxxx wrote:
It does, but that's a small optimization. For running parallel/dedicated jobs, though, I'd leave UNUSED_CLAIM_TIMEOUT at the default value of 600 unless you have a good reason to change it.
Generally speaking, the most "High Throughput" way to run MPI jobs is to run a lot of independent MPI jobs that each run on one node in your pool, perhaps on many cores on one node. This can be done in the vanilla universe. If you absolutely must run MPI jobs across multiple nodes, then you will need to run the parallel universe.
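To illustrate the single-node approach, here is a hedged sketch of a vanilla-universe submit file for Windows; run_mpi.bat and my_mpi_app.exe are hypothetical names, and the wrapper is assumed to invoke the local MPI launcher (e.g. mpiexec from MS-MPI) itself:

```
# Sketch: vanilla-universe submit file for a single-node MPI job on Windows.
# run_mpi.bat is a placeholder wrapper that would run something like
#   mpiexec -n 32 my_mpi_app.exe
# with -n kept in sync with request_cpus below.
universe             = vanilla
executable           = run_mpi.bat
transfer_input_files = my_mpi_app.exe
request_cpus         = 32
output               = mpi.out
error                = mpi.err
log                  = mpi.log
queue
```

Because all ranks stay on one machine, no machine_count, dedicated scheduler, or cross-node MPI bootstrap is needed; each such job is just an ordinary vanilla job that happens to use many cores.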
To run MPI jobs in the parallel universe, you'll need scripts to bootstrap the MPI world. To be honest, I don't know of anyone who has done this on Windows in quite some time, and I don't know how up to date the examples we provide are with any modern version of MPI for Windows.
-greg