Below is the DAGMan log file. DAGMan failed to detect a node's status, apparently because it could not read the node's job log. I searched the user mailing list and found that this may be caused by NFS, so I set NFS=YES in the global configuration. However, this directory is not exported over NFS, and the problem persists. Any hints?
Thanks.
12/20 22:01:15 Pending DAG nodes:
12/20 22:01:15 Node A6, Condor ID 391, status STATUS_SUBMITTED
12/20 22:10:55 Currently monitoring 1 Condor log file(s)
12/20 22:11:01 Currently monitoring 1 Condor log file(s)
12/20 22:11:02 ReadMultipleUserLogs: read error on log /media/DawnBook2/072809_s36d5fab_burned/msa_dawnsong/runabc6-tight.sh.log
12/20 22:11:02 ERROR: failure to read job log
A log event may be corrupt. DAGMan will skip the event and try to
continue, but information may have been lost. If DAGMan exits
unfinished, but reports no failed jobs, re-submit the rescue file
to complete the DAG.
12/20 22:21:03 602 seconds since last log event
12/20 22:21:03 Pending DAG nodes:
12/20 22:21:03 Node A6, Condor ID 391, status STATUS_SUBMITTED