[HTCondor-users] Held jobs: unable to establish standard (output|error) stream
- Date: Thu, 07 Jan 2021 09:09:40 +0100
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
- Subject: [HTCondor-users] Held jobs: unable to establish standard (output|error) stream
Good morning,
In my pool, a couple of jobs are going into the Hold state, with the HoldReason(s)
shown in the subject.
The last log lines look like this:
...
012 (347486.000.000) 01/07 09:00:58 Job was held.
Error from slot1_5@xxxxxxxxxxxxxxxxxxx: unable to establish standard output stream
Code 9 Subcode 0
...
012 (347487.000.000) 01/07 09:00:58 Job was held.
Error from slot1_3@xxxxxxxxxxxxxxxxxxx: unable to establish standard output stream
Code 9 Subcode 0
...
007 (347485.000.000) 01/07 09:00:58 Shadow exception!
Error from slot1_4@xxxxxxxxxxxxxxxxxxx: unable to establish standard output stream
0 - Run Bytes Sent By Job
2469644 - Run Bytes Received By Job
...
012 (347485.000.000) 01/07 09:00:58 Job was held.
Error from slot1_4@xxxxxxxxxxxxxxxxxxx: unable to establish standard output stream
Code 9 Subcode 0
...
I ran "condor_q -l $jobid | egrep '^(UserLog|Out|Err)'" and checked the existence of the
files on all pool nodes (including the head nodes) - nothing suspicious.
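The same check can be scripted, roughly as follows. This is a sketch only: it assumes the job ad prints these attributes as quoted strings (Name = "value") and that relative Out/Err paths are resolved against Iwd. The helper names parse_job_ad and check_paths are hypothetical.

```python
import os
import re

# Matches ad lines such as: Out = "job.out"
# Assumption: the attributes of interest are plain quoted strings.
ATTR_RE = re.compile(r'^(UserLog|Out|Err|Iwd)\s*=\s*"(.*)"\s*$')


def parse_job_ad(text):
    """Return the path-like attributes found in `condor_q -l` output."""
    attrs = {}
    for line in text.splitlines():
        m = ATTR_RE.match(line)
        if m:
            attrs[m.group(1)] = m.group(2)
    return attrs


def check_paths(attrs):
    """Yield (attribute, resolved path, file exists, parent directory writable)."""
    iwd = attrs.get("Iwd", "")
    for name in ("UserLog", "Out", "Err"):
        if name not in attrs:
            continue
        # os.path.join leaves absolute paths unchanged and prefixes relative ones
        path = os.path.join(iwd, attrs[name])
        parent = os.path.dirname(path) or "."
        yield name, path, os.path.exists(path), os.access(parent, os.W_OK)
```

Feeding the text of "condor_q -l $jobid" to parse_job_ad and printing the rows from check_paths shows, per attribute, whether the file exists and whether its directory is writable. Since os.access tests the permissions of the calling user, it makes sense to run this as the job owner on the submit host.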
How can I debug this further? Do I have a gaping black hole in the pool (that only affects
this particular user), or is there something in the submit file (which I haven't found yet)
that's different from everything else? condor_release doesn't reset the error state...
Any suggestion is appreciated.
condor_version is 8.8.3 on the HN, and 8.8.x (x >= 3) everywhere else (due to scattered
reinstalls, a full update currently isn't possible).
Thanks,
S
--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Phone: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~