Re: [Condor-users] DAG condor_schedd crash on windows
- Date: Thu, 22 Sep 2005 11:45:28 -0500
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [Condor-users] DAG condor_schedd crash on windows
On Sep 22, 2005, at 6:47 AM, Horvatth Szabolcs wrote:
> I constantly receive condor_schedd crash error emails when a dagman
> scheduler job that had been set to stay in queue is removed from the
> queue. (On a Windows computer.)
>
> I use the following command to remove the whole DAG:
> {
> // Set scheduler task "removeable"
> condor_qedit $dagjobid LeaveJobInQueue FALSE")
> // Set all tasks "removeable"
> condor_qedit -const "DAGManJobId == $dagjobid" LeaveJobInQueue FALSE
> condor_rm $dagjobid
>
> The crash happens every time, but the jobs are removed nicely.

It looks like your job queue log is being corrupted. The stack trace
you posted is from when the schedd attempted to restart. Can you
email the stack trace from the initial crash?
It looks like the commands above are being executed inside a script.
Can you email the exact code and the value of $dagjobid? The exact
parsing of the arguments is important in debugging a problem like this.
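
As a rough illustration of why the exact parsing matters, the sketch below builds the three commands from the message as explicit argument lists and prints them, so the arguments that condor_qedit and condor_rm would receive can be inspected without running anything. It is written in Python and is not the poster's actual script, which does not appear in this thread. The function name build_removal_commands and the job id "42" are hypothetical placeholders.

```python
import subprocess


def build_removal_commands(dag_job_id):
    """Return the three argument vectors from the message, in order.

    Each command is a list, so every element reaches the program as
    exactly one argument, with no shell re-parsing in between.
    """
    # The constraint must arrive as ONE argument, spaces included.
    constraint = "DAGManJobId == {}".format(dag_job_id)
    return [
        ["condor_qedit", dag_job_id, "LeaveJobInQueue", "FALSE"],
        ["condor_qedit", "-const", constraint, "LeaveJobInQueue", "FALSE"],
        ["condor_rm", dag_job_id],
    ]


if __name__ == "__main__":
    # "42" is a placeholder; substitute the real value of $dagjobid.
    for argv in build_removal_commands("42"):
        # Show the argument list and the single command line that
        # Windows-style quoting rules would produce from it.
        print(repr(argv))
        print("  ->", subprocess.list2cmdline(argv))
```

Printing the lists this way makes stray characters (such as a leftover quote or parenthesis from the surrounding script) visible as part of an argument. Nothing is executed; the commands could later be passed to subprocess.run once they look right.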
+----------------------------------+---------------------------------+
| Jaime Frey | Public Split on Whether |
| jfrey@xxxxxxxxxxx | Bush Is a Divider |
| http://www.cs.wisc.edu/~jfrey/ | -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+