Re: [HTCondor-users] HTC and HPC
- Date: Fri, 05 Dec 2014 11:28:21 -0500
- From: Gary Jackson <garyj@xxxxxxxxxx>
- Subject: Re: [HTCondor-users] HTC and HPC
"HPC" is not magic pixie dust. If you don't actually have
tightly-coupled parallel applications that require high performance
computing resources, then installing scheduling software that supports
"HPC" isn't going to do anything useful for you. Unless you have those
specific needs, HTCondor is going to do a lot more for you than those
other schedulers.
As I understand it, the reason you'd use a purpose-built HPC batch
scheduler is that HTCondor's scheduling algorithm isn't as flexible
for parallel jobs. SLURM and Torque give the administrator a lot of
tools for tuning parallel scheduling performance to maximize utilization
or minimize turnaround time. For instance, they support plugging in a
backfill scheduler for running jobs out of priority order when a lower
priority job won't interfere with a higher priority job. Backfill in
HTCondor, though still useful, doesn't work the same way and doesn't
help with tightly-coupled parallel jobs.
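To make the backfill idea concrete, here is a rough Python sketch of one
"EASY"-style backfill pass. It is only an illustration of the general
technique, not how SLURM or Torque actually implement it. The names
Job and easy_backfill are made up for this example, and it assumes
every job comes with a trustworthy runtime estimate.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # nodes requested
    runtime: int  # estimated runtime in minutes (assumed accurate)


def easy_backfill(free_nodes, running, queue):
    """Return names of queued jobs to start now.

    running: list of (minutes_until_end, nodes) for jobs already running.
    queue:   waiting jobs in priority order, highest priority first.
    """
    started = []
    queue = list(queue)
    running = list(running)

    # Start jobs in strict priority order for as long as they fit.
    while queue and queue[0].nodes <= free_nodes:
        job = queue.pop(0)
        free_nodes -= job.nodes
        running.append((job.runtime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    # The head job is blocked. Find the earliest time enough nodes
    # will be free for it; that is its reservation.
    head = queue[0]
    avail = free_nodes
    reserve_at = None
    for end_time, nodes in sorted(running):
        avail += nodes
        if avail >= head.nodes:
            reserve_at = end_time
            break
    if reserve_at is None:
        return started  # head job never fits; this sketch gives up here

    # Nodes left over at the reservation time once the head job starts.
    spare = avail - head.nodes

    # Let lower priority jobs jump ahead only if they cannot delay
    # the head job's reservation.
    for job in queue[1:]:
        if job.nodes > free_nodes:
            continue
        if job.runtime <= reserve_at:
            free_nodes -= job.nodes   # done before the reservation
            started.append(job.name)
        elif job.nodes <= spare:
            free_nodes -= job.nodes   # runs long, but on spare nodes
            spare -= job.nodes
            started.append(job.name)
    return started


# 8-node cluster: 6 nodes busy for 60 more minutes, 2 nodes idle.
waiting = [Job("A", 8, 120), Job("B", 2, 30), Job("C", 2, 90)]
print(easy_backfill(2, [(60, 6)], waiting))  # prints ['B']
```

Job B runs out of order because it finishes before A could start
anyway. Job C stays queued because it would still hold nodes that A
needs at the 60 minute mark.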
Obviously, HTCondor is capable of scheduling and running parallel jobs,
and you can use that if your parallel scheduling needs do not exceed
what HTCondor can provide. HTCondor can start an OpenMPI job just as
easily as SLURM can.
On the other hand, you probably wouldn't use SLURM or Torque for the
same sort of high throughput computing you do with HTCondor. HTCondor is
a very sophisticated program that covers a lot more use cases than those
two batch schedulers. For example, HTCondor has support for:
* transparent checkpointing
* running jobs on desktop machines with low impact on end users
* using cloud resources
There's no reason you can't use both HTCondor and a purpose-built
parallel batch scheduler at the same time. Locally, we've used both
Torque and HTCondor on our HPC clusters for many years. When Torque jobs
run, they preempt any HTCondor jobs and the nodes leave the pool for the
duration of the parallel job. It's worked out well.
On 12/4/14, 6:10 AM, marrodriguez wrote:
Hi,
I'm interested in deploying Condor at my site, but I have a question:
why is Condor considered an HTC batch system and not an HPC one? Does it
have disadvantages in the HPC field compared with Slurm, PBS, or SGE?
Thanks in advance
--
Gary