Date: | Wed, 2 Dec 2020 13:34:24 -0600 |
---|---|
From: | chtc-users@xxxxxxxxxxx |
Subject: | HPC Cluster queue and execute nodes are down due to network issues |
Greetings,
This message is for users of CHTC's HPC Cluster. Users of only the HTC System can ignore it.

We are currently working to understand and fix a networking issue affecting many of the execute nodes in the HPC Cluster, as well as the server that operates the queue. As a result of this outage, the cluster's queue and all Slurm commands are failing, though users are still able to log into the main head node (hpclogin1.chtc.wisc.edu). The full extent of the impact on queued jobs is not yet clear.

While we are still investigating on-site, we are unsure how long it will take to diagnose and fix the issue and to restore the cluster to functionality. We appreciate your patience and will provide updates on any changes to functionality or timeline.

Thank you,
Your CHTC Team