Greetings CHTC users,
This message is for users of CHTC’s HTC System, especially those using GPUs.
- Earlier today, a change in the configuration of CHTC’s GPU nodes resulted in a mismatch between the underlying GPU (CUDA) drivers and libraries. This likely caused jobs to fail. If you had GPU jobs in the queue today, please check your jobs for failures or holds. The GPU nodes will not execute any GPU-dependent jobs until we have resolved this configuration issue.
- The HTC system upgrades that began last Wednesday (5/4) are complete. While they were in progress, they may have caused jobs to experience lower throughput. Excluding GPU nodes, the HTC cluster has returned to full capacity.
Contact us with questions at chtc@xxxxxxxxxxx, especially if you are seeing unexplained errors from recent jobs that could be related to the above.
Best,
Your CHTC Team