Re: [Condor-users] DAG condor_schedd crash on windows
- Date: Fri, 23 Sep 2005 17:10:15 +0200
- From: "Horvatth Szabolcs" <szabolcs@xxxxxxxxxxxxx>
- Subject: Re: [Condor-users] DAG condor_schedd crash on windows
>It looks like your job queue log is being corrupted. The stack trace
>you posted is from when the schedd attempted to restart. Can you
>email the stack trace from the initial crash?
Sure, hopefully it's at the bottom of this mail.
>It looks like the commands above are being executed inside a script.
>Can you email the exact code and the value of $dagjobid? The exact
>parsing of the arguments is important in debugging a problem like this.
The value of the $clusterID variable is an integer.
This code snippet was run from Maya's scripting language:
system ("condor_qedit " + $clusterID + " LeaveJobInQueue FALSE");
system ("condor_qedit -const \"DAGManJobId == \\\"" + $clusterID + "\\\" LeaveJobInQueue FALSE");
system ("condor_rm " + $clusterID);
The strange thing is that the commands execute without problems; the crash happens
afterwards.
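Since the exact argument parsing matters here, the strings that the three system() calls hand to the shell can be rebuilt outside Maya. The following is only a minimal Python sketch: it assumes the script language uses C-style escapes (\" for a quote, \\ for a backslash), the cluster ID 42 is a made-up value, and the helper names build_commands and count_unescaped_quotes are hypothetical.

```python
def build_commands(cluster_id):
    """Rebuild the three command strings using the same concatenation as the snippet."""
    cid = str(cluster_id)
    return [
        "condor_qedit " + cid + " LeaveJobInQueue FALSE",
        "condor_qedit -const \"DAGManJobId == \\\"" + cid + "\\\" LeaveJobInQueue FALSE",
        "condor_rm " + cid,
    ]


def count_unescaped_quotes(command):
    """Count double quotes that are not preceded by a backslash."""
    count = 0
    escaped = False
    for ch in command:
        if escaped:
            # This character was escaped by the previous backslash; skip it.
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            count += 1
    return count


if __name__ == "__main__":
    # 42 is a hypothetical cluster ID used only for illustration.
    for cmd in build_commands(42):
        print(cmd, "| unescaped quotes:", count_unescaped_quotes(cmd))
```

Under these assumptions the second string contains a single unescaped double quote, so the quoted constraint that opens before DAGManJobId is never closed. How the Windows command interpreter and condor_qedit treat such a string is not something this sketch determines.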
Cheers,
Szabolcs
//=====================================================
Exception code: C0000005 ACCESS_VIOLATION
Fault address: 0040A14E 01:0000914E C:\Condor\bin\condor_schedd.exe
Registers:
EAX:00B7C3D4
EBX:00000000
ECX:0000119D
EDX:7C90EB94
ESI:0000119D
EDI:0000119D
CS:EIP:001B:0040A14E
SS:ESP:0023:0012FC08 EBP:0012FC0C
DS:0023 ES:0023 FS:003B GS:0000
Flags:00010206
Call stack:
Address Frame
0040A14E 0012FC0C DestroyProc+1EB
0040A0D4 0012FD34 DestroyProc+171
004116BA 0012FD68 jobIsFinishedDone+3B
0041C4C4 0012FDA0 Scheduler::jobIsFinishedHandler+73
00478442 0012FDB8 SelfDrainingQueue::timerHandler+6C
00485C92 0012FDF4 TimerManager::Timeout+14D
0046FED3 0012FE30 DaemonCore::Driver+B5
00477FA6 0012FF68 dc_main+A44
004780B5 0012FF80 main+CE
0049B9BD 00000001 mainCRTStartup+C5