I recently ran a batch of jobs, just shy of 4000 in total. When it was done I got this:
OWNER    BATCH_NAME     SUBMITTED    DONE   RUN   IDLE   HOLD  TOTAL  JOB_IDS
jfisher  CMD: ngspice   6/7 22:30    1787     _      _      9   1800  261.0 ... 262.4
9 jobs; 0 completed, 0 removed, 0 idle, 0 running, 9 held, 0 suspended
Running condor_release restarts the jobs, but then something crashes and they go back to being held.
I then ran:
condor_q -hold
ID     OWNER    HELD_SINCE  HOLD_REASON
261.0  jfisher  6/14 14:03  Error from slot1_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
261.1  jfisher  6/14 14:03  Error from slot2_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
261.2  jfisher  6/14 14:03  Error from slot3_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
261.3  jfisher  6/14 14:03  Error from slot4_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
262.0  jfisher  6/14 14:03  Error from slot5_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
262.1  jfisher  6/14 14:03  Error from slot6_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
262.2  jfisher  6/14 14:03  Error from slot1_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
262.3  jfisher  6/14 14:03  Error from slot2_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
262.4  jfisher  6/14 14:03  Error from slot3_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
Alas, the output is truncated right where I suspect the information I need would be.
Any ideas as to how to find out what those jobs are?
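In case it helps frame the question, here is a rough sketch of the kind of query I am after. It assumes the untruncated text is stored in each job's HoldReason ClassAd attribute and that condor_q's autoformat option (-af) is available in this version; held_reasons is just a name I made up for the wrapper, and I have not confirmed that this gives the full message here.

```shell
# Hypothetical helper: print "cluster.proc" followed by the HoldReason
# attribute for every held job, without the column truncation of -hold.
# Assumes JobStatus == 5 means "held" and that -af:j prefixes the job id.
held_reasons() {
    condor_q -constraint 'JobStatus == 5' -af:j HoldReason
}
```

Calling held_reasons should then print one line per held job. Failing that, I assume condor_q -long 261.0 would dump the whole ClassAd for a single job, hold reason included.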
--
Kind regards,
Justin Fisher.