Hi all,
we are trying to set up HTCondor to schedule our deep-learning
research. First, we would like to have three user groups of ascending priority,
which should receive different quotas while accepting surplus. After reading the documentation, I figure this can be implemented fairly easily using hierarchical groups with quotas that accept surplus (is there anything non-obvious to consider here?).
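
To make this concrete, here is a rough, untested sketch of the negotiator configuration we have in mind. The group names and quota fractions are placeholders only, not our actual setup, and the three groups are kept flat here for brevity:

```text
# Placeholder group names (hypothetical), in ascending priority
GROUP_NAMES = group_low, group_mid, group_high

# Dynamic quotas are fractions of the pool; the values are illustrative only
GROUP_QUOTA_DYNAMIC_group_low  = 0.2
GROUP_QUOTA_DYNAMIC_group_mid  = 0.3
GROUP_QUOTA_DYNAMIC_group_high = 0.5

# Let every group use capacity that the other groups leave idle
GROUP_ACCEPT_SURPLUS = True
```

Jobs would then select their group via accounting_group in the submit file.
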
However, we would also like to factor the GPU as a resource in
the priority calculation scheme (see section 3.6.4).
Unfortunately, we did not find any way to access the formula and factor a resource into the calculation directly. Is this possible at all?

As an alternative workaround, we found that while SLOT_WEIGHT may not be set to a custom resource (see p. 257 of the docs for release 8.8), we should be able to fix it at 1 by setting NEGOTIATOR_USE_SLOT_WEIGHTS to FALSE (see p. 298). Further, we are able to add the availability of GPUs as a custom resource to the consumption policy (section 3.7.1, p. 391).

However, we are unsure what effect this would have on the resources required by the job. Would 1 requested GPU now count as 1 resource (by applying quantize(target.RequestGpus, {1}), for example)? E.g., would a job that costs 2 resources for CPU and 1 resource for memory now, with the addition of a consumption policy for GPUs, also cost 1 resource for GPU?

Does this workaround seem feasible to you? Any other ideas on
how to get the priority calculation focused on GPU usage as a
resource?

Best, and thanks for taking the time to read this!
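
P.S. In case it helps, this is roughly what the workaround would look like in our configuration. It is an untested sketch based on our reading of the 8.8 manual; the single partitionable slot layout and the guard for jobs without RequestGpus are our own assumptions:

```text
# Negotiator: count every slot as weight 1 instead of evaluating SLOT_WEIGHT
NEGOTIATOR_USE_SLOT_WEIGHTS = False

# Execute nodes: detect GPUs and advertise them as a custom resource
use feature : GPUs

# One partitionable slot holding all resources of the machine (assumed layout)
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = True

# Consumption policy: what each matched job takes from the slot
CONSUMPTION_POLICY = True
CONSUMPTION_CPUS   = quantize(target.RequestCpus, {1})
CONSUMPTION_MEMORY = quantize(target.RequestMemory, {128})
# Assumption: jobs that do not request a GPU should consume none
CONSUMPTION_GPUS   = ifThenElse(target.RequestGpus =?= undefined, 0, quantize(target.RequestGpus, {1}))
```

Our open question is whether, with this in place, the GPU consumption actually enters the usage that the priority calculation charges to a user.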