[Condor-users] DAGMan Hangs Near End
- Date: Fri, 28 Sep 2012 13:27:10 -0500
- From: Oren Livne <livne@xxxxxxxxxxxx>
- Subject: [Condor-users] DAGMan Hangs Near End
Dear All,
I have a DAGMan pipeline that starts fine, but never completes, because 
the last few jobs are queued but never run. A down-scaled version of it 
works, so I doubt that it's a programming error. There are many 
available nodes; why won't those jobs run? How can I analyze the 
individual job within the DAGMan that says "Queued"?
Thank you so much,
Oren
-- Submitter: ibicluster.uchicago.cc : <172.16.0.149:42470> : ibicluster.uchicago.cc
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
 904.0   livne           9/28 13:09   0+00:15:40 R  0   7.3  condor_dagman -f -
1 jobs; 0 idle, 1 running, 0 held
===================================================================================
                     Total Owner Claimed Unclaimed Matched Preempting Backfill
        X86_64/LINUX   728   108       0       620       0          0        0
               Total   728   108       0       620       0          0        0
===================================================================================
9/28 13:23:33 Event: ULOG_EXECUTE for Condor Node D_chr10 (1009.0)
9/28 13:23:33 Number of idle job procs: 1
9/28 13:23:43 Event: ULOG_JOB_TERMINATED for Condor Node D_chr10 (1009.0)
9/28 13:23:43 Node D_chr10 job proc (1009.0) completed successfully.
9/28 13:23:43 Node D_chr10 job completed
9/28 13:23:43 Number of idle job procs: 1
9/28 13:23:43 Of 107 nodes total:
9/28 13:23:43  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
9/28 13:23:43   ===     ===      ===     ===     ===        ===      ===
9/28 13:23:43   104       0        1       0       0          2        0
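
To make the stuck state easier to read, the node status summary above can be tallied with a short script. This is only a rough sketch, assuming the dagman.out layout shown in this message (a timestamp, an "Of N nodes total:" line, then a row of seven counts). The function name parse_node_status and the column list are made up for illustration and are not part of Condor.

```python
import re

# Column order as it appears in the summary above (assumed stable for this log).
STATUS_COLUMNS = ["Done", "Pre", "Queued", "Post", "Ready", "Un-Ready", "Failed"]

def parse_node_status(log_text):
    """Return (total, counts) for the last node status summary in log_text.

    Returns None if no complete summary is found.
    """
    result = None
    total = None
    for line in log_text.splitlines():
        m = re.search(r"Of (\d+) nodes total:", line)
        if m:
            total = int(m.group(1))
            continue
        # Drop the leading "M/D HH:MM:SS" timestamp before splitting into fields.
        body = re.sub(r"^\s*\d+/\d+ \d+:\d+:\d+\s*", "", line)
        fields = body.split()
        is_count_row = len(fields) == len(STATUS_COLUMNS) and all(
            f.isdigit() for f in fields
        )
        if total is not None and is_count_row:
            result = (total, dict(zip(STATUS_COLUMNS, map(int, fields))))
            total = None
    return result

sample = """\
9/28 13:23:43 Number of idle job procs: 1
9/28 13:23:43 Of 107 nodes total:
9/28 13:23:43  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
9/28 13:23:43   ===     ===      ===     ===     ===        ===      ===
9/28 13:23:43   104       0        1       0       0          2        0
"""

total, counts = parse_node_status(sample)
# Keep only the non-Done states that still hold nodes.
unfinished = {k: v for k, v in counts.items() if k != "Done" and v}
print(total, sum(counts.values()), unfinished)
# prints: 107 107 {'Queued': 1, 'Un-Ready': 2}
```

For the log above this shows the counts add up to all 107 nodes: one is Queued and two are Un-Ready, with nothing Failed.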
--
A person is just about as big as the things that make him angry.