
Re: [HTCondor-users] bug in schedd_negotiate.cpp - 144-core startd



"HTCondor-users" <htcondor-users-bounces@xxxxxxxxxxx> wrote on 11/20/2015 04:08:17 AM:

> From: Thomas Hartmann <thomas.hartmann@xxxxxxxx>

> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Date: 11/20/2015 04:10 AM
> Subject: Re: [HTCondor-users] bug in schedd_negotiate.cpp - 144-core startd
> Sent by: "HTCondor-users" <htcondor-users-bounces@xxxxxxxxxxx>
>
> hi,
> i just looked it up. lscpu says we have 8 xeon e7-8895 v3 @ 2.6ghz. i do
> not count the hyperthreading cpus. it's some machine by oracle, if i
> remember correctly. i could ask the IT department if you need specifics.
> just let me know.
>
> the machine was not bought or configured by me. our IT department did and
> according to them the information engineering department needs a machine
> that provides a massive amount of cores and RAM (2T in this case) in one
> machine. apparently, they are not using it at the moment so the IT
> department asked me if i wanted to use it as a HTCondor execute machine. i
> gladly accepted.
>
> however, you can see that the machine was not designed for the purpose of
> being an execute machine in an HTCondor pool. if i would configure a setup
> for it, i would also go for cheaper CPUs and more machines.

Thanks for the info! That's a really good HTCondor story, too - you should put it on your annual accomplishments list.

Looks like it must be an Oracle Server X5-8, and those suckers cost so much that Oracle won't even give you a price online. The processors alone are probably in the neighborhood of $70,000-$80,000, and 2TB of memory in 32GB LRDIMMs is another $64,000 list price, so you're already pushing $150,000 before you even start installing all that in the chassis.

Suppose that one machine cost $200,000. Over a two-year period that comes to about $274 (roughly 255 euro) per day, meaning that every day that server sat unused, and thus not contributing business value, your company was just bleeding that money out. Your ability to add it to the HTCondor pool with a few keystrokes means that you recovered that machine's value for the company, or at least 69.4% of it (100 of the 144 cores) until this 1%-minimum-disk thing gets straightened out. :D ("Nobody will ever need more than 100 processors in a single system" - reminds me of that mythical Bill Gates quote...)
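For anyone who wants to check the arithmetic, here is a rough sketch of the per-day cost and the recovered fraction. The $200,000 price and two-year window are just the guesses from this message, and the 100-slot cap is my reading of the 1%-minimum-disk limit, not a measured value; the function names are made up for illustration.

```python
# Back-of-the-envelope numbers only; every constant here is an assumption.
PRICE_USD = 200_000      # supposed purchase price of the machine
DAYS = 2 * 365           # two-year period
TOTAL_CORES = 144        # physical cores on the startd
USABLE_SLOTS = 100       # assumed cap implied by the 1%-minimum-disk issue


def idle_cost_per_day(price_usd, days):
    """Cost of the machine spread evenly over the given number of days."""
    return price_usd / days


def recovered_fraction(usable_slots, total_cores):
    """Share of the cores that can actually run jobs."""
    return usable_slots / total_cores


print(f"idle cost per day: ${idle_cost_per_day(PRICE_USD, DAYS):.2f}")
print(f"recovered share:   {recovered_fraction(USABLE_SLOTS, TOTAL_CORES):.1%}")
```

Swap in your own price and depreciation period to see how the per-day figure moves.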

So, well done!

And just imagine, if you turned on Hyperthreading you'd have 288 slots with 7GB of memory each. Wow! And to think I started learning computers in grade school on a Commodore PET with a whopping 8 kilobytes of memory and a 0.001GHz 6502 processor. Amazing...

[I've found that for our usual HTCondor workloads, modern Hyperthreading imposes at most a 20-30% penalty in per-job speed, so doubling the number of slots substantially increases overall throughput (jobs per day).]
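A quick sketch of that estimate, assuming a "20-30% penalty" means each job runs at 70-80% of its non-Hyperthreaded speed, and that 2TB is counted as 2048GB. The function names are hypothetical and the result is a rough model, not a benchmark.

```python
def relative_throughput(slot_multiplier, per_job_penalty):
    """Jobs per day relative to baseline when each job slows by the penalty."""
    return slot_multiplier * (1.0 - per_job_penalty)


def memory_per_slot_gb(total_memory_gb, slots):
    """Memory available to each slot if it is divided evenly."""
    return total_memory_gb / slots


slots = 144 * 2  # Hyperthreading doubles the slot count to 288
print(f"memory per slot: {memory_per_slot_gb(2048, slots):.1f} GB")
for penalty in (0.20, 0.30):
    gain = relative_throughput(2, penalty)
    print(f"penalty {penalty:.0%}: about {gain:.1f}x baseline throughput")
```

Under those assumptions, doubling the slots lands somewhere around 1.4x to 1.6x the jobs per day, provided the jobs fit in the smaller per-slot memory.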

        -Michael Pelletier.