Re: [HTCondor-users] job submission on geographically distributed system
- Date: Mon, 11 May 2020 09:37:23 -0500 (CDT)
- From: Todd L Miller <tlmiller@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] job submission on geographically distributed system
> I would like to implement hpc job to be submitted on any of the
> available hpc clusters (geographically distributed) without grid
> implementation.
I don't know what this restriction is supposed to mean, so I'll
ignore it for the rest of this reply; sorry.
> Would HT condor help in this context? Most of the hpc clusters have
> already slurm running and scheduling jobs locally of that particular
> cluster.
HTCondor can certainly submit jobs to multiple different Slurm
clusters.
> Can a user submit the job on one of the hpc systems (best available in
> terms of resources) from central location?
That I don't actually know.
> How it can be done? How the user identity, data, application and
> environment would be taken care of?
When you write an HTCondor job, you specify the application, data,
and environment that constitute the job, and HTCondor takes care of moving
the files around and constructing the environment when the job runs.
(Or, in this case, lands at a Slurm scheduler.)
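As a rough illustration of what "specifying the application, data, and environment" can look like, here is a minimal Python sketch that only renders the text of a submit description; it does not talk to HTCondor. The helper name render_submit, the file names (simulate.sh, input.dat, params.cfg), and the environment variable are made up for the example. It assumes the grid universe with a "batch slurm" grid resource is how the job reaches Slurm at your site, and that environment values contain no whitespace or quotes; check both against the manual for your HTCondor version.

```python
def render_submit(executable, arguments, input_files, environment,
                  grid_resource="batch slurm"):
    """Render an HTCondor submit description as plain text.

    Hypothetical helper for illustration only. Assumes environment
    values contain no whitespace or quote characters.
    """
    # Space-separated NAME=value pairs, wrapped in double quotes below.
    env = " ".join(f"{name}={value}" for name, value in environment.items())
    lines = [
        "universe = grid",                      # hand the job to another scheduler
        f"grid_resource = {grid_resource}",     # assumed: Slurm via the batch grid type
        f"executable = {executable}",           # the application
        f"arguments = {arguments}",
        "should_transfer_files = YES",          # let HTCondor move the files
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {', '.join(input_files)}",  # the data
        f'environment = "{env}"',               # the environment
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Example with made-up names; write the text to a file and pass it to condor_submit.
submit_text = render_submit(
    executable="simulate.sh",
    arguments="--steps 1000",
    input_files=["input.dat", "params.cfg"],
    environment={"OMP_NUM_THREADS": "4"},
)
print(submit_text)
```

Pointing the same job at a different cluster would then mostly be a matter of changing the grid_resource line, which is why the rest of the description is kept independent of it here.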
User identity is a much harder problem, but HTCondor supports
(through various mechanisms) mapping between the user who submitted the
job and the identity used to run the job at various different sites.
Ideally, you would be able to specify the job in such a way that the user
identity at the time the job was running didn't matter. This doesn't always
happen, and sometimes can't, but the user identity problem is too
complex to discuss in any detail here. Luckily, there is already a
solution you could use or learn from: the "CE".
https://htcondor-ce.readthedocs.io/en/latest/overview/
In your case, you probably wouldn't be using pilots.
- ToddM