Hi Matt,

Here is a short description of a possible setup similar to what we use. We have a service machine that runs the HTCondor negotiator, collector and scheduler. These services can also run on a VM. We use the same machine to run COBalD. COBalD needs privileges to perform condor_status and condor_drain commands. The worker VMs themselves run only an HTCondor Master and StartD, which connects to the pool of the service machine.
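The daemon layout described above could be written down as two HTCondor configuration fragments, one per role. The Python snippet below is only a sketch that renders them. The host name service.example.org and the helper name render_role_config are hypothetical, and a real pool also needs security and network settings that are not shown here.

```python
# Sketch only: render minimal HTCondor config fragments for the two roles
# described above. Host name and function name are illustrative assumptions.

ROLE_DAEMONS = {
    # Service machine: collector and negotiator plus the scheduler (schedd).
    "service": ["MASTER", "COLLECTOR", "NEGOTIATOR", "SCHEDD"],
    # Worker VM: only the master and the startd.
    "worker": ["MASTER", "STARTD"],
}


def render_role_config(role: str, central_manager: str) -> str:
    """Return config text for one role, pointing at the central manager."""
    daemons = ROLE_DAEMONS[role]  # raises KeyError for an unknown role
    lines = [
        f"CONDOR_HOST = {central_manager}",
        f"DAEMON_LIST = {', '.join(daemons)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for role in ("service", "worker"):
        print(f"# --- {role} ---")
        print(render_role_config(role, "service.example.org"))
```

COBalD itself does not appear in either fragment, since it is a separate program that runs next to the HTCondor daemons on the service machine.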
I answer your questions inline. In case you have additional questions, we can also have a short chat.
Regards,
Matthias

On 10/3/21 00:11, M.T.West@xxxxxxxxxxxx wrote:
Hi Matthias,

As I am new to OpenStack, how a bunch of the services and daemons work together is a bit confusing.

- Does the COBalD daemon have to be running on every potential worker node?
    No, you need to run COBalD only on the service machine.
    On the workers, an HTCondor StartD is running inside the VM. No special HTCondor configuration is necessary for COBalD/TARDIS. However, I would recommend configuring an auto-shutdown for idle VMs. See https://htcondor.readthedocs.io/en/latest/cloud-computing/annex-customization-guide.html?highlight=DEFAULT_MASTER_SHUTDOWN_SCRIPT#image-requirements and use the DEFAULT_MASTER_SHUTDOWN_SCRIPT and STARTD_NOCLAIM_SHUTDOWN macros.

- Do the HTCondor daemons run on bare-metal or in a special VM configured for running HTCondor workloads?
    I'm not sure what exactly you mean. We use the docker universe from HTCondor to run jobs in containers (https://htcondor.readthedocs.io/en/latest/users-manual/docker-universe-applications.html). The users can define which docker image should be used. Since HTCondor runs inside a VM, docker works as on a bare-metal worker node.

- How are jobs wishing to run in containers handled?
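Two points from the answers above lend themselves to a small sketch: the auto-shutdown for idle worker VMs and a job that runs in the docker universe. The Python snippet below only renders the corresponding configuration and submit description text. The timeout, the script path, the image name and the file names are assumptions chosen for illustration, not the values used at KIT.

```python
# Sketch only: all concrete values passed to these helpers (timeout, script
# path, image, executable) are illustrative assumptions.


def render_idle_shutdown_config(idle_seconds: int, script_path: str) -> str:
    """Worker-VM config: shut HTCondor down after idle_seconds without a
    claim, and name a script for the master to run when it exits (for
    example one that powers the VM off)."""
    if idle_seconds <= 0:
        raise ValueError("idle_seconds must be positive")
    return (
        f"STARTD_NOCLAIM_SHUTDOWN = {idle_seconds}\n"
        f"DEFAULT_MASTER_SHUTDOWN_SCRIPT = {script_path}\n"
    )


def render_docker_submit(image: str, executable: str, arguments: str = "") -> str:
    """Submit description for one docker universe job; the user picks the image."""
    lines = [
        "universe = docker",
        f"docker_image = {image}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_idle_shutdown_config(1200, "/etc/condor/master_shutdown_script.sh"))
    print(render_docker_submit("debian", "/bin/cat", "/etc/hosts"))
```

The first fragment would go into the HTCondor configuration inside the VM image. The second would be written to a file and passed to condor_submit on the machine that runs the scheduler.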
While I don't yet understand the setup, that this software is running so well in production speaks well of it.

Cheers,
Matt

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of matthias.schnepf@xxxxxxx
Sent: 07 September 2021 04:28 PM
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] Backfill on an OpenStack system

CAUTION: This email originated from outside of the organisation. Do not click links or open attachments unless you recognise the sender and know the content is safe.

Hi all,

dropping this here since it's likely to be viable for this use-case. At KIT (WLCG Tier1 and university Tier3) we developed COBalD/TARDIS [0] to integrate resources into an HTCondor pool from various providers [1]. There's a medium-sized list of backends we support, but most importantly we have been using OpenStack in production for a while now. If you have any questions, just let me know - many of us also watch this list, but apparently we're not so fast in responding here...
Cheers,
Matthias

[0] https://cobald-tardis.readthedocs.io/en/latest/
[1] https://www.epj-conferences.org/articles/epjconf/abs/2020/21/epjconf_chep2020_07038/epjconf_chep2020_07038.html

On 05.09.21 11:38, jcaballero.hep@xxxxxxxxx wrote:

Hi Matt,

The cloud team at RAL does what you are looking for. Asking in the TB Support list may be helpful as well.

Cheers,
Jose

On Sat, 4 Sept 2021 at 22:58, West, Matthew (<M.T.West@xxxxxxxxxxxx>) wrote:

Hi Tim,

I will chat with the GridPP folks this week if I can grab someone's attention, as they just had their yearly project meeting. One could just run the HTCondor startd on the bare machines and not fuss with trying to pack things into a VM, but I also wanted to standardize, as much as possible, a setup for workstation pools as well. There are a bunch of systems all over campus that could be wrangled into use, and I feel it might be an easier sell than asking for brand-new hardware for HTC.
Cheers,
Matt

________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Steven C Timm <timm@xxxxxxxx>
Sent: Saturday, September 4, 2021 8:18 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Backfill on an OpenStack system

There are two ways it can be done. One is to install the optional EC2 OpenStack emulator and use the AWS features of HTCondor to launch virtual machines. The other way is the so-called "VAC" system, developed by GridPP in the UK, in which a daemon running on each cloud node self-launches a VM. Basically, the idea is that the VMs launch out of the "vacuum" and join an HTCondor pool. The latter can run on any pool and doesn't necessarily need OpenStack. I am fairly new to running OpenStack myself, so I am not sure if it has the equivalent of VMs that can be pre-empted. But if you have a startd, you could use HTCondor to condor_off the startd if the VM is needed back and have the VM then programmed to exit. HTCondor at one point was going to add a feature to talk directly to the OpenStack "Nova" API, but I don't think that it is functional yet.

Steve Timm

________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of West, Matthew <M.T.West@xxxxxxxxxxxx>
Sent: Saturday, September 4, 2021 1:20 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Backfill on an OpenStack system

Hi All,

Here at Exeter, IT is setting up an OpenStack system to support researchers who want DRAM-heavy, bespoke workstation-like environments. Because I don't expect the system to be full up with active users 24/7, I am wondering what the optimal way is to set up an HTCondor pool on it to run jobs as backfill.
Would this be similar to how you would do it for any other spare resources: have a VM start up on a node and announce itself to the collector daemon as an available worker if idle conditions of the machine are met? It reminds me of the method to expand one's resources into corporate cloud servers, but I am not sure what tools are useful in this case.

Cheers,
Matt

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/