
[Condor-users] Child dag jobs not submitted



Hi,


Sometimes my DAG files do not submit jobs that are children of an already completed parent job.
The dag's log file says the following:


              Done     Pre   Queued    Post   Ready   Un-Ready   Failed
6/3 15:52:16   ===     ===      ===     ===     ===        ===      ===
6/3 15:52:16   212       0       23       0       0          5        1
6/3 15:52:26 Event: ULOG_EXECUTE for Condor Job RIBGen.0136 (3580.0)
6/3 15:52:26 EVENT ERROR: job 3580.0.0 executing; total end count != 0 (1)
6/3 15:52:26 WARNING: bad event here may indicate a serious bug in Condor -- beware!
6/3 15:52:26 Aborting DAG because of bad event (EVENT ERROR: job 3580.0.0 executing; total end count != 0 (1))
6/3 15:52:26 Aborting DAG...
6/3 15:52:26 Writing Rescue DAG to //sv/directory/file.dag.rescue...
6/3 15:52:26 Removing submitted jobs...
6/3 15:52:26 Removing any/all submitted Condor/Stork jobs...
6/3 15:52:26 Executing: condor_rm -const 'DAGManJobID == "3502"'
6/3 15:52:26 Running: condor_rm -const 'DAGManJobID == "3502"'
6/3 15:52:26 WARNING: failure: condor_rm -const 'DAGManJobID == "3502"'
6/3 15:52:26 	(pclose() returned 1)
6/3 15:52:26 Error removing DAGMan jobs
6/3 15:52:26 **** condor_scheduniv_exec.3502.0 (condor_DAGMAN) EXITING WITH STATUS 1
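
To pick the relevant lines out of a long log like this one, a small script along the following lines could be used. This is only a sketch and not part of Condor: the function name find_event_errors is made up for illustration, and the regular expression assumes the exact "EVENT ERROR: job ..." line format shown in the excerpt above.

```python
import re

# Assumed line format, taken from the log excerpt above:
# 6/3 15:52:26 EVENT ERROR: job 3580.0.0 executing; total end count != 0 (1)
EVENT_ERROR = re.compile(
    r"^(?P<date>\d+/\d+) (?P<time>\d+:\d+:\d+) "
    r"EVENT ERROR: job (?P<job>[\d.]+) (?P<detail>.*)$"
)


def find_event_errors(lines):
    """Return (timestamp, job_id, detail) for each line that starts an EVENT ERROR."""
    found = []
    for line in lines:
        m = EVENT_ERROR.match(line.strip())
        if m:
            timestamp = m["date"] + " " + m["time"]
            found.append((timestamp, m["job"], m["detail"]))
    return found


# Three lines copied from the log above; only the second one should match,
# because the "Aborting DAG" line merely quotes the error in parentheses.
sample = [
    "6/3 15:52:26 Event: ULOG_EXECUTE for Condor Job RIBGen.0136 (3580.0)",
    "6/3 15:52:26 EVENT ERROR: job 3580.0.0 executing; total end count != 0 (1)",
    "6/3 15:52:26 Aborting DAG because of bad event (EVENT ERROR: job 3580.0.0 executing; total end count != 0 (1))",
]

for timestamp, job_id, detail in find_event_errors(sample):
    print(timestamp, job_id, detail)
```

The same function can be fed an open file object instead of the sample list, since it only iterates over lines.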


I'm not sure what the EVENT ERROR and WARNING lines mean, but the DAG is aborted and the remaining child jobs are never submitted.
I'm using Condor 6.7.7 on Windows.


Cheers,
Horvatth Szabolcs