[Condor-users] ntdll.dll access_violation on windows xp sp2 + data execution prevention
- Date: Fri, 11 May 2007 19:30:41 +0200
- From: Rob de Graaf <rob@xxxxxxxxxxxxxxxxxx>
- Subject: [Condor-users] ntdll.dll access_violation on windows xp sp2 + data execution prevention
Hello,
I've set up a test environment consisting of a central manager running
Slackware Linux and an execute-only node running Windows XP Service Pack
2, both using Condor 6.8.4. Both machines authenticate using Kerberos,
and I have successfully run some test jobs.
However, on the Windows execute node, the Condor daemons don't always
start properly. Sometimes the condor_master fails, sometimes the
condor_startd fails, and sometimes both fail. When they go down
they leave a core file in the log directory. This one is from when the
condor_master failed to start:
core.MASTER.win32
//=====================================================
Exception code: C0000005 ACCESS_VIOLATION
Fault address: 7C93426D 01:0003326D C:\WINDOWS\system32\ntdll.dll
Registers:
EAX:FFFFFFFF
EBX:00000362
ECX:000320F0
EDX:00000000
ESI:000303A8
EDI:000305D8
CS:EIP:001B:7C93426D
SS:ESP:0023:00CDF154 EBP:00CDF374
DS:0023 ES:0023 FS:003B GS:0000
Flags:00010246
Call stack:
Address Frame Logical addr Module
7C93426D 00CDF374 0001:0003326D C:\WINDOWS\system32\ntdll.dll
77C2C3C9 00CDF3B4 0001:0001B3C9 C:\WINDOWS\system32\msvcrt.dll
77C2C3E7 00CDF3C0 0001:0001B3E7 C:\WINDOWS\system32\msvcrt.dll
77C2C42E 00CDF3D0 0001:0001B42E C:\WINDOWS\system32\msvcrt.dll
0034E510 00CDF3EC 0001:0000D510 C:\condor\bin\krb5_32.dll
00342FD2 00CDF420 0001:00001FD2 C:\condor\bin\krb5_32.dll
003824C7 00CDF638 0001:000414C7 C:\condor\bin\krb5_32.dll
00464C76 00CDF684 0001:00063C76 C:\condor\bin\condor_master.exe
0046440F 00CDF698 0001:0006340F C:\condor\bin\condor_master.exe
0045FDCC 00CDF70C 0001:0005EDCC C:\condor\bin\condor_master.exe
0045FB45 00CDF728 0001:0005EB45 C:\condor\bin\condor_master.exe
00458951 00CDF758 0001:00057951 C:\condor\bin\condor_master.exe
004589A7 00CDF9F8 0001:000579A7 C:\condor\bin\condor_master.exe
00439747 00CDFE58 0001:00038747 C:\condor\bin\condor_master.exe
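A note for anyone reading the dump: the "Logical addr" column looks like a section:offset pair, so subtracting the offset from the absolute address should give the same section base for every frame in a given module. The sketch below checks that on the call stack above. It assumes that reading of the column is correct, and the names `STACK`, `LINE` and `section_bases` are made up for illustration.

```python
import re
from collections import defaultdict

# Call-stack lines copied verbatim from the dump above.
STACK = r"""
7C93426D 00CDF374 0001:0003326D C:\WINDOWS\system32\ntdll.dll
77C2C3C9 00CDF3B4 0001:0001B3C9 C:\WINDOWS\system32\msvcrt.dll
77C2C3E7 00CDF3C0 0001:0001B3E7 C:\WINDOWS\system32\msvcrt.dll
77C2C42E 00CDF3D0 0001:0001B42E C:\WINDOWS\system32\msvcrt.dll
0034E510 00CDF3EC 0001:0000D510 C:\condor\bin\krb5_32.dll
00342FD2 00CDF420 0001:00001FD2 C:\condor\bin\krb5_32.dll
003824C7 00CDF638 0001:000414C7 C:\condor\bin\krb5_32.dll
00464C76 00CDF684 0001:00063C76 C:\condor\bin\condor_master.exe
0046440F 00CDF698 0001:0006340F C:\condor\bin\condor_master.exe
0045FDCC 00CDF70C 0001:0005EDCC C:\condor\bin\condor_master.exe
0045FB45 00CDF728 0001:0005EB45 C:\condor\bin\condor_master.exe
00458951 00CDF758 0001:00057951 C:\condor\bin\condor_master.exe
004589A7 00CDF9F8 0001:000579A7 C:\condor\bin\condor_master.exe
00439747 00CDFE58 0001:00038747 C:\condor\bin\condor_master.exe
"""

# address, frame pointer, section:offset, module path
LINE = re.compile(
    r"^([0-9A-F]{8}) ([0-9A-F]{8}) ([0-9A-F]{4}):([0-9A-F]{8}) (.+)$"
)


def section_bases(stack_text):
    """Map each module name to the sorted section bases implied by its frames."""
    bases = defaultdict(set)
    for line in stack_text.strip().splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a stack frame line
        address, _frame, _section, offset, module = match.groups()
        name = module.rsplit("\\", 1)[-1]
        # Assumption: absolute address = section base + logical offset.
        bases[name].add(int(address, 16) - int(offset, 16))
    return {name: sorted(values) for name, values in bases.items()}


if __name__ == "__main__":
    for name, values in section_bases(STACK).items():
        print(name, [f"{value:08X}" for value in values])
```

Each module yields exactly one base here, so the offsets are consistent and could be looked up in a linker map file for krb5_32.dll or condor_master.exe if one is available.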
The core file that the condor_startd creates on failure has the same entries.
The Condor log files show nothing unusual.
After some digging I learned about a feature called Data Execution
Prevention (System Properties -> Advanced -> Performance Settings ->
Data Execution Prevention), which was apparently added to Windows XP
with Service Pack 2. Adding the Condor daemons to the exception list
appears to solve the problem; however, I would prefer not to have to
do that.
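The system-wide DEP mode can also be read programmatically instead of through the dialog. The sketch below calls the Win32 function `GetSystemDEPPolicy` from kernel32 through `ctypes`. Microsoft documents that function as available from Windows XP SP3 onward, so on an SP2 machine like this one the export may be missing, in which case the helper returns None. The names `DEP_POLICIES` and `system_dep_policy` are illustrative only.

```python
import ctypes
import sys

# Values of the DEP_SYSTEM_POLICY_TYPE enumeration returned by GetSystemDEPPolicy.
DEP_POLICIES = {0: "AlwaysOff", 1: "AlwaysOn", 2: "OptIn", 3: "OptOut"}


def system_dep_policy():
    """Return the system-wide DEP policy name, or None if it cannot be queried."""
    if sys.platform != "win32":
        return None  # DEP policy is a Windows concept
    try:
        get_policy = ctypes.windll.kernel32.GetSystemDEPPolicy
    except AttributeError:
        return None  # export not present on this Windows version
    get_policy.restype = ctypes.c_int
    return DEP_POLICIES.get(get_policy(), "Unknown")


if __name__ == "__main__":
    print(system_dep_policy())
```

If the exception list in the dialog is editable, the machine is presumably in the mode reported as "OptOut", where DEP applies to every program except those listed.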
As far as I can tell, the problem only occurs when Condor is configured
to use Kerberos authentication. The central manager is, for testing
purposes, also the KDC, using MIT Kerberos version krb5-1.6.1. The
execute node runs a fully patched Windows XP SP2 and MIT Kerberos for
Windows version kfw-3.2.0. Both use condor-6.8.4.
Has anyone encountered this problem? What could be triggering Windows'
Data Execution Prevention? How can I avoid adding the Condor daemons to
the exception list?
Thanks in advance,
Rob de Graaf