Slurm capability restored on the HPC cluster


Date: Thu, 16 Sep 2021 17:12:03 -0500
From: chtc-users@xxxxxxxxxxx
Subject: Slurm capability restored on the HPC cluster

Greetings CHTC users,


This message is for users of our high performance computing (HPC) cluster.


Slurm functionality is back to normal on the HPC cluster.


Jobs that were started before the outage (~6am this morning) may have continued running. Regardless of whether a job is still running, we strongly recommend checking the output of any HPC cluster jobs that ran or completed during the outage (between 6am this morning and 4pm this afternoon).
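As a quick check, sacct can list your jobs from that window. This is a sketch assuming the cluster's Slurm accounting database is enabled (sacct relies on it); adjust the times to match your jobs:

    sacct -u $USER --starttime=2021-09-16T06:00:00 --endtime=2021-09-16T16:00:00 \
        --format=JobID,JobName,State,ExitCode

Jobs reported as FAILED or NODE_FAIL, or whose output files end abruptly, are good candidates for resubmission.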


Contact us at chtc@xxxxxxxxxxx with any questions!


Best,

Your CHTC Team



---------- Forwarded message ---------
From: chtc-users--- via CHTC-users <chtc-users@xxxxxxxxxxx>
Date: Thu, Sep 16, 2021 at 10:53 AM
Subject: Slurm commands, scheduling down on the HPC cluster
To: chtc-users <chtc-users@xxxxxxxxxxx>
Cc: <chtc-users@xxxxxxxxxxx>


Greetings CHTC users,


This message is for users of our high performance computing (HPC) cluster.


Earlier this morning we identified a hardware failure affecting the HPC cluster's head node, which hosts the Slurm job scheduler. As a result, most Slurm commands (such as squeue, sinfo, sbatch, and sacct) will not function for now. Job scheduling is also currently offline. You can still log in and view files through the cluster login nodes.


Our staff are working on the problem and we will send an update to this email list once the issue is resolved and Slurm functionality is back to normal. Please email us at chtc@xxxxxxxxxxx with any questions or issues.


Best,

Your CHTC Team

_______________________________________________
CHTC-users mailing list
CHTC-users@xxxxxxxxxxx
To unsubscribe send an email to:
chtc@xxxxxxxxxxx