Hi Todd,
I think the jobs are being held because the output files fail to
transfer from the execute node and cannot be written on the central
manager, under
"C:\condor\spool\[ClusterID]\[JobID]\cluster[ClusterID].proc[JobID].subproc0.tmp\",
but I don't know what could be causing this.
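In case it helps with debugging, here is a minimal Python sketch of how one might check, on the submit machine, whether those temporary spool directories exist at a given moment. It assumes the path pattern shown in the hold reasons and a spool root of C:\condor\spool; the function names spool_tmp_dir and missing_spool_dirs are made up for this illustration and are not part of Condor.

```python
import os
from pathlib import PureWindowsPath


def spool_tmp_dir(spool_root, cluster_id, proc_id):
    """Build the per-job temporary spool directory named in the hold reason.

    The layout is an assumption read off the hold messages:
    <spool>\<cluster>\<proc>\cluster<cluster>.proc<proc>.subproc0.tmp
    """
    return (PureWindowsPath(spool_root) / str(cluster_id) / str(proc_id)
            / f"cluster{cluster_id}.proc{proc_id}.subproc0.tmp")


def missing_spool_dirs(spool_root, jobs, exists=os.path.isdir):
    """Return the expected directories that do not exist right now.

    `jobs` is an iterable of (cluster, proc) pairs; `exists` can be
    swapped out, e.g. when trying this away from the submit machine.
    """
    expected = [str(spool_tmp_dir(spool_root, c, p)) for c, p in jobs]
    return [path for path in expected if not exists(path)]


# A few of the held jobs, written as (cluster, proc) pairs.
held = [(6, 209), (6, 236), (8, 11)]
for path in missing_spool_dirs(r"C:\condor\spool", held):
    print("missing:", path)
```

Running this while jobs are still executing would only show whether the directories are present at that instant; it says nothing about why they might be missing.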
Here's the submit file:
-------------------------------------
universe = vanilla
executable = HALF_3D_barreira_abso.exe
should_transfer_files = YES
When_To_Transfer_Output = ON_EXIT_OR_EVICT
transfer_input_files = data
input = in.$(Process)
output = out.$(Process)
error = error.txt
log = log.txt
#Requirements = (OpSys == "WINNT50") || \
# (OpSys == "WINNT51")
notify_user = never
queue 256
--------------------------------------------------------------------------
And below is the output of condor_q -hold.
The job does have an error log file, but it is empty (zero bytes),
and the nodes where the held jobs were running are completely random.
Sometimes it's 4, 9, 11,
...
Thanks for your help!
Alexandre
Here's the output of condor_q -hold:
-----------------------------------------------------------------------------------------------------
C:\Users\Administrator>condor_q -hold
-- Submitter: condor : <10.2.0.70:49200> : condor
ID OWNER HELD_SINCE HOLD_REASON
6.209 julieta 10/3 19:29 Error from slot15@cluster05: STARTER
at 10.2.0.55 failed to send file(s) to <10.2.0.70:54887>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\6\209\cluster6.proc209.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
6.236 julieta 10/3 19:29 Error from slot24@cluster05: STARTER
at 10.2.0.55 failed to send file(s) to <10.2.0.70:54964>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\6\236\cluster6.proc236.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
6.248 julieta 10/3 19:29 Error from slot5@cluster02: STARTER
at 10.2.0.52 failed to send file(s) to <10.2.0.70:54828>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\6\248\cluster6.proc248.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.11 julieta 10/3 19:28 Error from slot9@cluster03: STARTER
at 10.2.0.53 failed to send file(s) to <10.2.0.70:61012>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\11\cluster8.proc11.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.35 julieta 10/3 19:29 Error from slot17@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:58240>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\35\cluster8.proc35.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.50 julieta 10/3 19:28 Error from slot2@cluster03: STARTER
at 10.2.0.53 failed to send file(s) to <10.2.0.70:61135>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\50\cluster8.proc50.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.67 julieta 10/3 19:28 Error from slot20@cluster03: STARTER
at 10.2.0.53 failed to send file(s) to <10.2.0.70:61426>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\67\cluster8.proc67.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.71 julieta 10/3 19:29 Error from slot23@cluster03: STARTER
at 10.2.0.53 failed to send file(s) to <10.2.0.70:61990>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\71\cluster8.proc71.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.77 julieta 10/3 19:29 Error from slot11@cluster08: STARTER
at 10.2.0.58 failed to send file(s) to <10.2.0.70:58267>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\77\cluster8.proc77.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.79 julieta 10/3 19:29 Error from slot13@cluster08: STARTER
at 10.2.0.58 failed to send file(s) to <10.2.0.70:58364>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\79\cluster8.proc79.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.83 julieta 10/3 19:29 Error from slot17@cluster08: STARTER
at 10.2.0.58 failed to send file(s) to <10.2.0.70:58483>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\83\cluster8.proc83.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.95 julieta 10/3 19:19 Error from slot17@cluster02: STARTER
at 10.2.0.52 failed to send file(s) to <10.2.0.70:62057>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\95\cluster8.proc95.subproc0.tmp\_condor_stdout: (errno
2) No such file or directory
8.110 julieta 10/3 19:28 Error from slot9@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:62137>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\110\cluster8.proc110.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.114 julieta 10/3 19:28 Error from slot10@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:62142>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\114\cluster8.proc114.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.116 julieta 10/3 19:28 Error from slot12@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:62145>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\116\cluster8.proc116.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.121 julieta 10/3 19:29 Error from slot7@cluster11: STARTER
at 10.2.0.61 failed to send file(s) to <10.2.0.70:58438>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\121\cluster8.proc121.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.123 julieta 10/3 19:29 Error from slot9@cluster11: STARTER
at 10.2.0.61 failed to send file(s) to <10.2.0.70:58468>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\123\cluster8.proc123.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.143 julieta 10/3 19:28 Error from slot20@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:62418>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\143\cluster8.proc143.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.146 julieta 10/3 19:29 Error from slot8@cluster06: STARTER
at 10.2.0.56 failed to send file(s) to <10.2.0.70:58610>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\146\cluster8.proc146.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.162 julieta 10/3 19:28 Error from slot2@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:62499>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\162\cluster8.proc162.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.163 julieta 10/3 19:29 Error from slot7@cluster10: STARTER
at 10.2.0.60 failed to send file(s) to <10.2.0.70:62504>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\163\cluster8.proc163.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.168 julieta 10/3 19:28 Error from slot3@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62526>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\168\cluster8.proc168.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.178 julieta 10/3 19:29 Error from slot6@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62562>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\178\cluster8.proc178.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.186 julieta 10/3 19:29 Error from slot24@cluster09: STARTER
at 10.2.0.59 failed to send file(s) to <10.2.0.70:59167>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\186\cluster8.proc186.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.194 julieta 10/3 19:29 Error from slot8@cluster01: STARTER
at 10.2.0.51 failed to send file(s) to <10.2.0.70:58938>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\194\cluster8.proc194.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.199 julieta 10/3 19:29 Error from slot10@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62579>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\199\cluster8.proc199.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.202 julieta 10/3 19:29 Error from slot5@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62607>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\202\cluster8.proc202.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.215 julieta 10/3 19:29 Error from slot15@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62699>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\215\cluster8.proc215.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.218 julieta 10/3 19:29 Error from slot18@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62744>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\218\cluster8.proc218.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.222 julieta 10/3 19:29 Error from slot22@cluster07: STARTER
at 10.2.0.57 failed to send file(s) to <10.2.0.70:62807>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\222\cluster8.proc222.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.235 julieta 10/3 19:29 Error from slot7@cluster04: STARTER
at 10.2.0.54 failed to send file(s) to <10.2.0.70:62749>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\235\cluster8.proc235.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.237 julieta 10/3 19:29 Error from slot9@cluster04: STARTER
at 10.2.0.54 failed to send file(s) to <10.2.0.70:62780>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\237\cluster8.proc237.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.242 julieta 10/3 19:29 Error from slot14@cluster04: STARTER
at 10.2.0.54 failed to send file(s) to <10.2.0.70:62873>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\242\cluster8.proc242.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.243 julieta 10/3 19:29 Error from slot15@cluster04: STARTER
at 10.2.0.54 failed to send file(s) to <10.2.0.70:62880>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\243\cluster8.proc243.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.249 julieta 10/3 19:29 Error from slot21@cluster04: STARTER
at 10.2.0.54 failed to send file(s) to <10.2.0.70:62937>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\249\cluster8.proc249.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.252 julieta 10/3 19:29 Error from slot2@cluster11: STARTER
at 10.2.0.61 failed to send file(s) to <10.2.0.70:62860>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\252\cluster8.proc252.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
8.255 julieta 10/3 19:29 Error from slot12@cluster11: STARTER
at 10.2.0.61 failed to send file(s) to <10.2.0.70:62822>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\8\255\cluster8.proc255.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
37 jobs; 0 completed, 0 removed, 0 idle, 0 running, 37 held, 0 suspended
-----------------------------------------------------------------------------------------------------
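
To check whether the held jobs really are spread evenly over the nodes, the listing above can be tallied per execute host. The following is only a rough Python sketch: it assumes the hold reasons keep the wording shown above ("Error from slotN@host: STARTER at ... SHADOW at ..."), and the names HOLD_RE, SHADOW_RE, held_per_host and shadow_hosts are invented for this example.

```python
import re
from collections import Counter

# Line wraps from the mail client are tolerated by matching \s+ between words.
HOLD_RE = re.compile(
    r"Error from (slot\d+)@(\S+): STARTER\s+at\s+(\d+(?:\.\d+){3})")
SHADOW_RE = re.compile(r"SHADOW\s+at\s+(\d+(?:\.\d+){3})")


def held_per_host(text):
    """Count held jobs per execute host named in the hold reasons."""
    return Counter(m.group(2) for m in HOLD_RE.finditer(text))


def shadow_hosts(text):
    """Collect the distinct addresses where the SHADOW failed to write."""
    return set(SHADOW_RE.findall(text))


# Three entries copied from the listing above, wrapped as in the mail.
SAMPLE = r"""
6.209 julieta 10/3 19:29 Error from slot15@cluster05: STARTER
at 10.2.0.55 failed to send file(s) to <10.2.0.70:54887>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\6\209\cluster6.proc209.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
6.236 julieta 10/3 19:29 Error from slot24@cluster05: STARTER
at 10.2.0.55 failed to send file(s) to <10.2.0.70:54964>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\6\236\cluster6.proc236.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
6.248 julieta 10/3 19:29 Error from slot5@cluster02: STARTER
at 10.2.0.52 failed to send file(s) to <10.2.0.70:54828>; SHADOW
at 10.2.0.70 failed to write to file
C:\condor\spool\6\248\cluster6.proc248.subproc0.tmp\_condor_stdout:
(errno 2) No such file or directory
"""

for host, count in held_per_host(SAMPLE).most_common():
    print(host, count)
print("shadow side:", sorted(shadow_hosts(SAMPLE)))
```

Feeding it the full saved output of condor_q -hold instead of SAMPLE would give the per-node counts for all 37 held jobs. In every entry above the write failure is reported by the SHADOW at 10.2.0.70, so the execute nodes may matter less than the machine holding the spool directory.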