Hi,

We have managed to get some promising results with this by manually setting `AssignedGPUs`, `GPUs`, `TotalGPUs` and `TotalSlotGPUs`.
If we set these to 0, the slot stops accepting any more GPU jobs. The problem is that `condor_update_machine_ad` affects all the slots, which has a side effect: it also affects the "children" slots, and if we change `GPUs` on the children slots, HTCondor then seems to think that each of them is using all the cards.
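To make the behaviour we are after concrete, here is a minimal Python sketch. It is only a model of the slot ads as plain dictionaries, not HTCondor code and not something `condor_update_machine_ad` is known to do. The function name `zero_gpus_on_parent`, the slot names and the GPU values are hypothetical. It assumes the parent slot can be told apart by the `PartitionableSlot` attribute, and clearing `AssignedGPUs` to an empty string rather than 0 is also an assumption.

```python
# Illustrative model only: slot ads are plain dicts, not real ClassAds.
GPU_COUNT_ATTRS = ("GPUs", "TotalGPUs", "TotalSlotGPUs")


def zero_gpus_on_parent(slot_ads):
    """Return copies of the ads with GPU attributes zeroed on the parent slot only.

    Assumption: the parent (partitionable) slot has PartitionableSlot == True,
    while the "children" (dynamic) slots do not.
    """
    updated = []
    for ad in slot_ads:
        new_ad = dict(ad)  # copy, so the input ads are left untouched
        if new_ad.get("PartitionableSlot", False):
            for attr in GPU_COUNT_ATTRS:
                new_ad[attr] = 0
            # Assumption: an empty string stands for "no assigned devices".
            new_ad["AssignedGPUs"] = ""
        updated.append(new_ad)
    return updated


# Hypothetical example: one parent slot and one child that holds a GPU.
example_ads = [
    {"Name": "slot1@node", "PartitionableSlot": True,
     "AssignedGPUs": "CUDA1", "GPUs": 1, "TotalGPUs": 2, "TotalSlotGPUs": 2},
    {"Name": "slot1_1@node", "DynamicSlot": True,
     "AssignedGPUs": "CUDA0", "GPUs": 1, "TotalGPUs": 2, "TotalSlotGPUs": 1},
]

for ad in zero_gpus_on_parent(example_ads):
    print(ad["Name"], ad["GPUs"], repr(ad["AssignedGPUs"]))
```

In this model the child keeps its own `GPUs` and `AssignedGPUs`, which is the part that goes wrong for us when the update reaches every slot.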
Is there any way to limit the update to the parent slot?

Best,

Joan

On 14/5/20 20:28, Todd L Miller wrote:
Are you saying that changing OFFLINE_MACHINE_RESOURCE_<name> in the config and then running condor_reconfig does not take the GPU offline? If it does not, I would consider that a bug. If this doesn't work, condor_update_machine_ad might be a work-around.

- ToddM
-- Dr. Joan Josep Piles-Contreras ZWE Scientific Computing Max Planck Institute for Intelligent Systems (p) +49 7071 601 1750