On 5/16/2017 6:57 AM, çå wrote:
Hi, I installed two GPUs on one machine and want to bind each GPU to a slot. For example, there are GPU0 and GPU1, and I modified the condor config so that slot1@yffs and slot2@yffs are both available on the same machine. I want every job run by condor (condor_exec) to get the number of the slot it is running on, so that I can manually bind the GPU to the slot. Is there any way to bind each GPU to a slot? Or just to get the slot number when condor_exec is running? Thanks very much.
Append the following line to the condor_config on your execute machines:

    use feature : GPUs

By default, this distributes the GPU devices evenly across your static slots and also sets the environment variables for CUDA (and, I think, OpenCL) so that programs running in a slot are bound to that slot's device. In other words, I think adding this one line will do exactly what you want.
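If you also want the job itself to see which slot and GPU it landed on, a small wrapper can read the job's environment. This is only a sketch: it assumes the starter exports _CONDOR_SLOT and _CONDOR_AssignedGPUs alongside CUDA_VISIBLE_DEVICES, which you should confirm against your HTCondor version, and describe_slot is a hypothetical helper name, not part of HTCondor.

```python
import os


def describe_slot(env=None):
    """Return (slot_name, gpu_ids) as seen from inside a running job.

    Assumption: HTCondor exports _CONDOR_SLOT (e.g. "slot1") and
    _CONDOR_AssignedGPUs; CUDA_VISIBLE_DEVICES is used as a fallback.
    """
    env = os.environ if env is None else env
    slot = env.get("_CONDOR_SLOT")  # None when not running under HTCondor
    raw = env.get("_CONDOR_AssignedGPUs") or env.get("CUDA_VISIBLE_DEVICES") or ""
    # The value is a comma-separated list; drop blanks and stray spaces.
    gpus = [g.strip() for g in raw.split(",") if g.strip()]
    return slot, gpus


if __name__ == "__main__":
    slot, gpus = describe_slot()
    print(f"slot={slot} gpus={gpus}")
```

Run as the job executable (or called at the start of your own program), this prints the slot name and the device list for that slot, so no manual binding should be needed.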
More details and other options can be found in the HTCondor HOWTO recipes; specifically see
https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToManageGpus

regards,
Todd