
Re: [Condor-users] Lazy jobs that never really start running



On 7/7/05, Horvatth Szabolcs <szabolcs@xxxxxxxxxxxxx> wrote:
> 7/7 13:16:03 Shadow pid 3788 for job 667.0 exited with status 108
> 7/7 13:16:03 Scheduler::Relinquish - mrec is NULL, can't relinquish
> 7/7 13:16:03 Null parameter --- match not deleted

Status 108 is (from http://www.cs.wisc.edu/~adesmet/status.html):

108  JOB_NOT_STARTED  Can't connect to startd or request refused  

If you look at the startd log on the machine that job 667.0 was
matched to (it should say in the job's user log or further up the
schedd log), it might say why this was the case.
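
If it helps to find every job affected this way, here is a rough Python
sketch that pulls the shadow exit lines out of a schedd log. It assumes
the lines look exactly like the ones you quoted; the function name and
the one-entry status table are my own, made up for illustration, and
only code 108 is filled in because that is the only one confirmed in
this thread.

```python
import re

# Only the code discussed in this thread; add others from the status page.
SHADOW_EXIT_CODES = {108: "JOB_NOT_STARTED"}

# Assumed line shape, taken from the quoted log excerpt:
# "7/7 13:16:03 Shadow pid 3788 for job 667.0 exited with status 108"
SHADOW_EXIT_RE = re.compile(
    r"^(?P<when>\d+/\d+ \d+:\d+:\d+) Shadow pid (?P<pid>\d+) "
    r"for job (?P<job>\d+\.\d+) exited with status (?P<status>\d+)"
)


def find_shadow_exits(lines):
    """Return (timestamp, job id, status, status name) per shadow exit line."""
    results = []
    for line in lines:
        match = SHADOW_EXIT_RE.match(line.strip())
        if match is None:
            continue  # not a shadow exit line
        status = int(match.group("status"))
        name = SHADOW_EXIT_CODES.get(status, "UNKNOWN")
        results.append((match.group("when"), match.group("job"), status, name))
    return results


if __name__ == "__main__":
    sample = [
        "7/7 13:16:03 Shadow pid 3788 for job 667.0 exited with status 108",
        "7/7 13:16:03 Null parameter --- match not deleted",
    ]
    for when, job, status, name in find_shadow_exits(sample):
        print(when, job, status, name)
```

Feeding it the lines of the real schedd log instead of the sample gives
a list of job ids to look up in the startd logs.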

I don't know what the subsequent NULL / relinquish messages mean, sorry.

It is worth noting that you have 5 jobs all finishing at the same
time. How rapidly do you churn through jobs?

Matt