REMINDER: ALL HPC CLUSTER USER DATA WILL BE DELETED during March 13-16 downtime.


Date: Thu, 22 Feb 2018 17:11:26 -0600
From: chtc-users@xxxxxxxxxxx
Subject: REMINDER: ALL HPC CLUSTER USER DATA WILL BE DELETED during March 13-16 downtime.
Greetings,

This is a weekly reminder that all user data on the HPC Cluster will be deleted during a March 13 downtime in order to rebuild the filesystem. Take action now to ensure that you have a copy of all essential data and software in a non-CHTC location.

To ensure that you continue to receive our emails (which Office 365 may sometimes mark as SPAM), make sure to add chtc-users@xxxxxxxxxxx to your address book in WiscMail (Office 365). Similar announcements appear on our User News page (http://chtc.cs.wisc.edu/user-news.shtml), in our online guide for the HPC Cluster, and in the HPC Cluster's login message.

Thank you, as always,
Your CHTC Team

Originally sent on Wed, Feb 14, 2018 at 1:07 PM:
Greetings HPC Cluster Users,
(those who only use the HTC System can ignore the below)

The HPC Cluster will be taken down March 13 for a major upgrade of the filesystem.
We will be rebuilding the /home location with a new version of the filesystem, which will come with user-level quota limits on total data and filecounts. These changes are intended to improve performance and address bugs in the filesystem software. See our prior email, below, when we first announced plans for the downtime.

THE FILESYSTEM CONTENTS MUST BE COMPLETELY DELETED FOR THE REBUILD AND CANNOT BE SAVED FOR YOU BY CHTC.
  1. TAKE ACTION NOW to transfer ALL of your data and software within /home to a non-CHTC location. Users who wait until the last minute risk losing data when we clear and rebuild the filesystem (if they haven't been properly backing up to a non-CHTC location all along). We cannot postpone the downtime for such circumstances. See also #2-3.
  2. CHTC is not able to back up or otherwise reinstate ANY user data from the current filesystem and is not responsible for loss of user data when we have to delete it for the rebuild process.
  3. As a reminder, you should have NO DATA on the HPC Cluster that you have not already backed up elsewhere (including software) and that you are not ACTIVELY running jobs with. This expected practice has been in CHTC's stated policies (http://chtc.cs.wisc.edu/HPCuseguide.shtml) since the HPC Cluster was first introduced in 2013. Following it lets you continue working right up until the downtime, with only the data from your most recently completed jobs left to remove in the days beforehand.
  4. Run no more than one data transfer process through the head nodes (aci-service-1 or aci-service-2) at a time. Too many users with too many data transfers or deletion ('rm') processes will create network and filesystem performance issues for all users. See also #1.
USERS WILL BE ABLE TO REINSTATE THEIR DATA (WITH NEW QUOTAS) AFTER THE DOWNTIME.
  • The new filesystem build will include a new initial per-user quota of 100 GB of space and a count of 1000 files/directories. Researchers needing more than that for concurrently running jobs and/or software files will need to consult with a Research Computing Facilitator via chtc@xxxxxxxxxxx after the downtime, to ensure proper data practices.
  • All CHTC-supported software modules (compilers, MPI versions, and licensed software) will be preserved and reinstated for identical use after the downtime.
  • Software in your home directory that you've copied off of the cluster before the downtime can be reinstated within the same directory location (/home/username/etc).
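
As a rough way to see how a copied-off directory compares with the new initial quota before reinstating it, here is a minimal Python sketch. It is not a CHTC tool. The function names are hypothetical, and it assumes that "100 GB" means decimal gigabytes and that each file and each subdirectory counts once toward the 1000 limit; the new filesystem may count usage differently.

```python
import os

# Figures from the announcement. How the filesystem actually measures
# usage is an assumption of this sketch.
QUOTA_BYTES = 100 * 10**9   # 100 GB, assuming decimal gigabytes
QUOTA_ENTRIES = 1000        # files plus directories


def summarize_usage(root):
    """Return (total_bytes, entry_count) for everything under root.

    The root directory itself is not counted. Symbolic links are
    counted as entries but are not followed.
    """
    total_bytes = 0
    entry_count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        entry_count += len(dirnames) + len(filenames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.lstat(path).st_size
            except OSError:
                # Skip entries that vanish or cannot be read.
                pass
    return total_bytes, entry_count


def within_quota(total_bytes, entry_count):
    """True if both figures fit the assumed initial per-user quota."""
    return total_bytes <= QUOTA_BYTES and entry_count <= QUOTA_ENTRIES
```

For example, calling summarize_usage() on the local copy of your home directory and passing the two results to within_quota() gives a first indication of whether the data will fit, or whether to contact a Research Computing Facilitator after the downtime.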

PLAN ACCORDINGLY FOR A DELAY IN YOUR COMPUTATIONAL WORK
We intend to complete the necessary work by March 16, but it's possible we'll need 1-2 days more or less than that, and we'll announce any timeline changes as soon as we know them. Please plan for a delay of your computational work, accordingly, including time after the downtime to re-establish your data structures.

We appreciate your action and planning in support of our work to minimize interruptions/downtime for all users of the HPC Cluster, though we do need to take necessary actions like this planned downtime to make improvements to cluster components.


As always, please send any questions to chtc@xxxxxxxxxxx (rather than replying to this email list for all CHTC users).

Thank you!
Your CHTC Team, care of Lauren Michael