HPC cluster, certain HTC servers down (Oct 23)


Date: Wed, 23 Oct 2019 16:28:35 -0500
From: chtc-users@xxxxxxxxxxx
Subject: HPC cluster, certain HTC servers down (Oct 23)

Hi everyone,


We had a brief cooling outage in one of our server rooms this afternoon (Wed., Oct 23), resulting in many servers being automatically shut off. Impacts include:


HPC Cluster:

HTC System:

On both systems:

  • All jobs should still be present in the relevant queue; interrupted jobs should simply re-run once we start our execute servers.

We have begun rebooting all affected servers and will send another email to this list once the reboot is complete.


Please email our usual help address (chtc@xxxxxxxxxxx) with any questions or concerns!


Thanks,

Your CHTC team
