[Condor-users] DAG condor_schedd crash on windows
- Date: Thu, 22 Sep 2005 13:47:43 +0200
- From: "Horvatth Szabolcs" <szabolcs@xxxxxxxxxxxxx>
- Subject: [Condor-users] DAG condor_schedd crash on windows
I receive a condor_schedd crash email every time a DAGMan scheduler job
that was set to stay in the queue is removed from the queue. This is on a Windows machine.
I use the following command to remove the whole DAG:
// Make the scheduler job removable
condor_qedit $dagjobid LeaveJobInQueue FALSE
// Make all of its jobs removable
condor_qedit -const "DAGManJobId == $dagjobid" LeaveJobInQueue FALSE
condor_rm $dagjobid
The crash happens every time, but the jobs are removed nicely.
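For anyone trying to reproduce this, the three commands above can be wrapped in a small script so that they always run in the same order. This is only a minimal Python sketch, not part of the original setup: the names dag_removal_commands and remove_dag and the run parameter are hypothetical, and it assumes the Condor command-line tools are on the PATH.

```python
import subprocess


def dag_removal_commands(dag_job_id):
    """Build the three commands from the post, in order, as argument lists."""
    job = str(dag_job_id)
    return [
        # Allow the DAGMan scheduler job itself to leave the queue
        ["condor_qedit", job, "LeaveJobInQueue", "FALSE"],
        # Allow every job that belongs to that DAG to leave the queue
        ["condor_qedit", "-const", f"DAGManJobId == {job}",
         "LeaveJobInQueue", "FALSE"],
        # Remove the DAG
        ["condor_rm", job],
    ]


def remove_dag(dag_job_id, run=subprocess.run):
    """Run each command in turn; check=True stops at the first failure."""
    for cmd in dag_removal_commands(dag_job_id):
        run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems with the constraint expression. Calling remove_dag with the cluster id of the DAGMan job would issue the same sequence as the commands in the post.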
Cheers,
Szabolcs
---
An example of the crash email:
This is an automated email from the Condor system
on machine "snoopy.digicpictures.local". Do not reply.
"C:\Condor/bin/condor_schedd.exe" on "snoopy.digicpictures.local" died due to exception ACCESS_VIOLATION.
Condor will automatically restart this process in 17 seconds.
*** Last 20 line(s) of file SchedLog:
8/15 10:08:40 ** $CondorPlatform: INTEL-WINNT50 $
8/15 10:08:40 ** PID = 3564
8/15 10:08:40 ******************************************************
8/15 10:08:40 Using config file: C:\Condor\condor_config
8/15 10:08:40 Using local config files: C:\Condor/condor_config.local
8/15 10:08:40 DaemonCore: Command Socket at <192.168.0.71:1122>
8/15 10:08:41 "C:\Condor/bin/condor_shadow.pvm -classad" did not produce any output, ignoring
8/15 10:08:41 "C:\Condor/bin/condor_shadow.std -classad" did not produce any output, ignoring
8/15 10:09:09 ******************************************************
8/15 10:09:09 ** condor_schedd.exe (CONDOR_SCHEDD) STARTING UP
8/15 10:09:09 ** C:\Condor\bin\condor_schedd.exe
8/15 10:09:09 ** $CondorVersion: 6.7.9 Jul 14 2005 $
8/15 10:09:09 ** $CondorPlatform: INTEL-WINNT50 $
8/15 10:09:09 ** PID = 3264
8/15 10:09:09 ******************************************************
8/15 10:09:09 Using config file: C:\Condor\condor_config
8/15 10:09:09 Using local config files: C:\Condor/condor_config.local
8/15 10:09:09 DaemonCore: Command Socket at <192.168.0.71:1150>
8/15 10:09:09 "C:\Condor/bin/condor_shadow.pvm -classad" did not produce any output, ignoring
8/15 10:09:09 "C:\Condor/bin/condor_shadow.std -classad" did not produce any output, ignoring
*** End of file SchedLog
*** Last entry in core file core.SCHEDD.WIN32
================================
Exception code: C0000005 ACCESS_VIOLATION
Fault address: 0049B018 01:0009A018 C:\Condor\bin\condor_schedd.exe
Registers:
EAX:000000FF
EBX:00000000
ECX:018AF5D0
EDX:0052F880
ESI:29300030
EDI:00971410
CS:EIP:001B:0049B018
SS:ESP:0023:0012F420 EBP:0012F430
DS:0023 ES:0023 FS:003B GS:0000
Flags:00010286
Call stack:
Address Frame
0049B018 0012F430 stricmp+88
0046242F 0012F444 AttrList::Lookup+1F
0046240D 0012F44C AttrList::Lookup+9
004624F2 0012F454 AttrList::Lookup+C
0046216B 0012F474 AttrList::Insert+34
0046211C 0012F48C AttrList::Insert+2E
004496F6 0012F4BC LogSetAttribute::Play+8F
00448861 0012F4E4 ClassAdLog::ClassAdLog+CD
00446D42 0012F52C ClassAdCollection::ClassAdCollection+22
00409343 0012FB90 InitJobQueue+60
0041F321 0012FE28 main_init+131
004778B6 0012FF5C dc_main+A26
004B6C26 00000001 EnumProcessModules+3D02
*** End of file core.SCHEDD.WIN32
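When comparing several of these crash emails, it can help to pull the call stack out of the core file text. The following is a rough Python sketch under stated assumptions: it assumes each frame line has the layout seen above (code address, frame pointer, then symbol+offset) and that the offsets are hexadecimal. The names FRAME_RE and parse_call_stack are hypothetical.

```python
import re

# One call-stack line: 8-digit code address, 8-digit frame pointer,
# then "symbol+offset" (offset assumed to be hexadecimal)
FRAME_RE = re.compile(
    r"^([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+(.+)\+([0-9A-Fa-f]+)$"
)


def parse_call_stack(text):
    """Return one dict per frame line; lines that do not match are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            address, frame, symbol, offset = match.groups()
            frames.append({
                "address": int(address, 16),
                "frame": int(frame, 16),
                "symbol": symbol,
                "offset": int(offset, 16),
            })
    return frames
```

Applied to the stack above, the first entry would be stricmp and the later entries include InitJobQueue and main_init, which suggests that this particular trace was taken while the schedd was starting up and reading its job queue.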