Mailing List Archives
Re: [HTCondor-users] GPU memory request
- Date: Mon, 25 Mar 2024 18:32:35 +0000
- From: Zach McGrew <mcgrewz@xxxxxxx>
- Subject: Re: [HTCondor-users] GPU memory request
You'll want something like this to request a GPU with at least 2 GB of GPU memory:
require_gpus = GlobalMemoryMb >= 2048
The gpus_minimum_memory command is only in the 23.x feature branch, I believe, not the 23.0 LTS or 10.9.
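For context, here is a minimal sketch of how that expression might sit in a complete submit file. It reuses the test job quoted further down in this thread; the output, error, and log file names are placeholders, and the trailing queue statement is an assumption since the quoted job does not show one.

```
# Sketch only: file names are hypothetical placeholders.
universe     = vanilla
executable   = /usr/bin/echo
arguments    = hello compute

# Ask for one GPU, and only match GPUs advertising at least 2048 MB of memory.
request_gpus = 1
require_gpus = GlobalMemoryMb >= 2048

output = gpumem.txt
error  = gpumem.err
log    = gpumem.log

# Assumed: submit a single job.
queue
```

Because require_gpus is an expression over GPU properties rather than a dedicated submit command, it should avoid the "unused by condor_submit" warning on versions that do not recognize gpus_minimum_memory, provided the version in use supports require_gpus.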
-Zach
________________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Weatherby,Gerard <gweatherby@xxxxxxxx>
Sent: Monday, March 25, 2024 10:40 AM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] GPU memory request
This seems to work on the 23 nodes:
Universe = vanilla
gpus_minimum_memory = 1MB
request_gpus = 1
Executable = /usr/bin/echo
Arguments = hello compute
output = h100.txt
error = h100.err
Log = h100.log
However, there's a warning:
WARNING: the line 'gpus_minimum_memory = 1MB' was unused by condor_submit. Is it a typo?
Is that just a condor_submit bug?
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Todd L Miller via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Date: Monday, March 25, 2024 at 12:08 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Todd L Miller <tlmiller@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] GPU memory request
> We're running a 10.9 / 23 cluster and using
>
> use feature: GPUs
>
> How does a user request a certain amount of GPU memory?
For recent releases:
https://htcondor.readthedocs.io/en/latest/man-pages/condor_submit.html#gpus_minimum_memory
For older releases, you'll have to write an expression:
https://htcondor.readthedocs.io/en/v10_0/man-pages/condor_submit.html#index-60
-- ToddM