After failing to submit a DAGMan job from Windows to Linux, I tried using
SSH to start the job directly on the scheduler machine.
This almost works, but DAGMan's submission of the jobs fails and I can't
find out why. This is the only part of the DAGMan log that looks relevant,
but I don't really understand what it means:
03/10/11 17:48:15 FileLock::obtain(2) - @1299775695.807126 lock on /tmp/condorLocks/31/99/338834.lockc now UNLOCKED
03/10/11 17:48:15 init: Opening file /U/projects/shots_comp/435/1299582026/logs/_dagLog.log
03/10/11 17:48:15 Opening log file #0 '/U/projects/shots_comp/435/1299582026/logs/_dagLog.log'(is_lock_cur=false,seek=false,read_header=true)
03/10/11 17:48:15 Error, apparently invalid user log file
03/10/11 17:48:15 Error, apparently invalid user log file
03/10/11 17:48:15 ReadUserLogHeader::Read(): readEvent() failed
03/10/11 17:48:15 /U/projects/shots_comp/435/1299582026/logs/_dagLog.log: Failed to read file header
03/10/11 17:48:15 ReadMultipleUserLogs: added log file /U/projects/shots_comp/435/1299582026/logs/_dagLog.log (21:404744540) to active list
03/10/11 17:48:15 Submitting Condor Node Nuke_0559_0562 job(s)...
03/10/11 17:48:15 TmpDir(99)::TmpDir()
03/10/11 17:48:15 TmpDir(99)::Cd2TmpDir()
03/10/11 17:48:15 submitting: condor_submit -a dag_node_name' '=' 'Nuke_0559_0562 -a +DAGManJobId' '=' '75 -a DAGManJobId' '=' '75 -a submit_event_notes' '=' 'DAG' 'Node:' 'Nuke_0559_0562 -a +DAGParentNodeNames' '=' '"" /U/projects/shots_comp/435/1299582026/submitscripts/Nuke_0559_0562.submit
03/10/11 17:48:15 failed while reading from pipe.
03/10/11 17:48:15 Read so far:
03/10/11 17:48:15 ERROR: submit attempt failed
03/10/11 17:48:15 submit command was: condor_submit -a dag_node_name' '=' 'Nuke_0559_0562 -a +DAGManJobId' '=' '75 -a DAGManJobId' '=' '75 -a submit_event_notes' '=' 'DAG' 'Node:' 'Nuke_0559_0562 -a +DAGParentNodeNames' '=' '"" /U/projects/shots_comp/435/1299582026/submitscripts/Nuke_0559_0562.submit
What does "Error, apparently invalid user log file" mean? What can I do to
solve the issue or find the root of the problem?
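
In case it helps anyone looking at a similar log, here is a minimal sketch of
one way to pull the error-looking entries out of a DAGMan log like the excerpt
above. It assumes every entry starts with the "MM/DD/YY HH:MM:SS" timestamp
seen there and treats any line without one as a wrapped continuation of the
previous entry. The function name and the keyword list are my own
(hypothetical), not part of Condor.

```python
import re

# Matches the "MM/DD/YY HH:MM:SS " prefix seen in the log excerpt.
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")

# Hypothetical keyword list; extend it as needed.
KEYWORDS = ("error", "failed")


def suspicious_lines(log_text):
    """Return log entries that mention a keyword, with wrapped lines rejoined."""
    entries = []
    for line in log_text.splitlines():
        if TIMESTAMP.match(line) or not entries:
            # A timestamped line starts a new entry.
            entries.append(line.rstrip())
        else:
            # No timestamp: treat it as a continuation of the previous entry.
            entries[-1] += " " + line.strip()
    return [e for e in entries if any(k in e.lower() for k in KEYWORDS)]
```

Calling suspicious_lines() on the text of the DAGMan output file would return
only the entries containing "error" or "failed" (case-insensitive), which
makes it easier to see them next to each other.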