file transfer error in high throughput computing (HTC) system


Date: Wed, 13 Mar 2019 14:54:07 -0500
From: chtc-users@xxxxxxxxxxx
Subject: file transfer error in high throughput computing (HTC) system
Greetings CHTC users,

This message applies to users of our high throughput computing (HTC) system who also use our SQUID web server to transfer files to their jobs.

Starting Monday night, certain HTC jobs that download files from SQUID were mistakenly put on hold, without a hold message. You can see if your jobs were placed on hold without a message by running:
 condor_q -hold
The output for affected jobs will look like this:
 30000.0 username 3/12 12:07 Error from slot1_1@xxxxxxxxxxxxxxxxxx:

This issue, caused by a certain version of HTCondor on a subset of our HTC servers, has been fixed. If your jobs were affected, you can now release them and they should run successfully. To release held jobs, run the "condor_release" command followed by either your username or the ClusterID of the affected jobs.
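
If you have many held jobs, the following is a rough sketch of one way to pick out only those held without a message. It assumes the "condor_q -hold" output looks like the sample line above, with the job ID in the first column and nothing after the final colon. The function name "empty_hold_ids" is made up for this illustration and is not part of HTCondor.

```shell
# Hypothetical helper (not part of HTCondor): reads `condor_q -hold` output
# on stdin and prints the ID of each job whose hold reason is empty, i.e.
# whose line ends right after "Error from <slot>:" as in the sample above.
empty_hold_ids() {
  awk '/Error from [^ ]+: *$/ { print $1 }'
}

# Example use on a submit server (left commented out because it needs HTCondor):
#   condor_q -hold | empty_hold_ids
#   condor_release 30000.0    # release one of the job IDs printed above
```

Held jobs that do show a hold message are skipped, so they can be looked at separately before being released.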

If you have additional questions, please email us at chtc@xxxxxxxxxxx.

Best,
Your CHTC team