[Condor-users] Job fails to run / Job leaves around unkillable processes
- Date: Fri, 29 Oct 2010 07:31:41 -0700
- From: Torrin Jones <tjones.job@xxxxxxxxx>
- Subject: [Condor-users] Job fails to run / Job leaves around unkillable processes
Using Condor 7.4.4 on Windows XP.
Any idea what would cause an error 267?
From StarterLog.slot1 . . .
10/28 08:35:33 Create_Process: CreateProcess failed, errno=267
10/28 08:35:33 ERROR "Create_Process(C:\condor\execute\dir_6136\condor_exec.exe,, ...) failed: " at line 530 in file ..\src\condor_starter.V6.1\os_proc.cpp
MSDN says 267 means "The directory name is invalid." However, the directory is there. Here is the scenario: I submit a small job (condor_dummy.job, attached). All condor_dummy.exe does is print a line like this . . .
Run by DOMAIN\USER on COMPUTERNAME at DATE TIME.
It's basically a quick condor test.
Anyway, I submit the job and Condor tries to run it. It fails, and I get the above message in StarterLog.slot1. Here is the kicker: it retries and fails again, but if I leave the job in the queue long enough, it eventually succeeds. When I ran the job yesterday, it took 28 attempts, and only the final one succeeded.
Here is the other thing I'm seeing. After the job succeeded, I looked in Process Explorer and saw 27 condor_exec.exe processes still running, one for each failed attempt. They were unkillable. I tried every approach I could think of: killing them as Admin, as NT AUTHORITY\SYSTEM, even attaching a debugger and killing them that way. Nothing works.
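In case it helps anyone compare attempt counts, here is a rough Python sketch of how the failed attempts could be tallied from the starter log. It assumes the failure lines always look like the "CreateProcess failed, errno=..." line quoted above. The function name and the sample list are made up for illustration and are not part of Condor.

```python
import re
from collections import Counter

# Assumed line format, taken from the StarterLog.slot1 excerpt above:
#   10/28 08:35:33 Create_Process: CreateProcess failed, errno=267
FAILURE_RE = re.compile(
    r"^\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2} "
    r"Create_Process: CreateProcess failed, errno=(?P<errno>\d+)"
)


def count_create_process_failures(lines):
    """Count CreateProcess failures per errno value (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.match(line.strip())
        if match:
            counts[int(match.group("errno"))] += 1
    return counts


# The two log lines quoted in this message, used as sample input.
sample = [
    "10/28 08:35:33 Create_Process: CreateProcess failed, errno=267",
    '10/28 08:35:33 ERROR "Create_Process(C:\\condor\\execute\\dir_6136'
    '\\condor_exec.exe,, ...) failed: " at line 530 in file '
    "..\\src\\condor_starter.V6.1\\os_proc.cpp",
]

failures = count_create_process_failures(sample)
for errno, count in sorted(failures.items()):
    print(f"errno={errno}: {count} failed attempt(s)")
```

To run it against a real log, pass an open file object for StarterLog.slot1 instead of the sample list. Only the "CreateProcess failed" lines are counted, so the follow-up ERROR line for the same attempt is not counted twice.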
So I have 2 issues.
1. The job fails to run.
2. The job leaves around unkillable processes.
Any ideas? Has anybody seen anything like this?
Attachment:
condor_dummy.job
Description: Binary data