Re: [Condor-users] Bad event error in condor DAG
- Date: Thu, 1 Sep 2005 15:02:38 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Bad event error in condor DAG
On Thu, 25 Aug 2005, Alexander Dietz wrote:
> I ran a DAG on a cluster with Fedora Core 3 (Heidelberg) and using
> condor version 6.7.10, but I always get a bad event error:
>
> 8/23 09:14:29 EVENT ERROR: job 1782278.0.0 ended; total end count != 1 (2)
> 8/23 09:14:29 WARNING: bad event here may indicate a serious bug in
> Condor -- beware!
> 8/23 09:14:29 Continuing with DAG in spite of bad event (EVENT ERROR:
> job 1782278.0.0 ended; total end count != 1 (2)) because of allow_events
> setting
I took a look at your dagman.out file, and I now know what the problem
is. Sometimes, when a node job aborts, Condor writes both a terminated
and an aborted event to the job log, so DAGMan sees two end events for
the same job (hence the end count of 2 in the error). This is a bug in
Condor, but in this case it isn't hurting anything, so you can ignore
the warnings. (If you see a ULOG_JOB_TERMINATED event followed
immediately by a ULOG_JOB_ABORTED event for the same job, don't worry
about the warnings.)
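
To make the check above concrete, here is a minimal Python sketch of
it. It assumes the job log has already been parsed into a sequence of
(event name, job id) pairs; the function name, the tuple layout, and
the sample data are hypothetical illustrations, not part of Condor or
DAGMan, and "immediately" is taken to mean adjacent entries in that
sequence.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: (event name, job id), in log order.
Event = Tuple[str, str]


def benign_double_end_jobs(events: Iterable[Event]) -> List[str]:
    """Return ids of jobs whose ULOG_JOB_TERMINATED event is followed
    immediately by a ULOG_JOB_ABORTED event for the same job."""
    matches: List[str] = []
    prev = None
    for event in events:
        if (
            prev is not None
            and prev[0] == "ULOG_JOB_TERMINATED"
            and event[0] == "ULOG_JOB_ABORTED"
            and prev[1] == event[1]
        ):
            matches.append(event[1])
        prev = event
    return matches


# Illustrative sample only; not taken from a real log.
sample = [
    ("ULOG_SUBMIT", "1782278.0.0"),
    ("ULOG_EXECUTE", "1782278.0.0"),
    ("ULOG_JOB_TERMINATED", "1782278.0.0"),
    ("ULOG_JOB_ABORTED", "1782278.0.0"),
]
print(benign_double_end_jobs(sample))
```

Any job id this returns matches the harmless pattern described above;
a "total end count" error for a job that is not in the list would
still deserve a closer look.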
Kent Wenger
Condor Team