
Re: [Condor-users] Condor 7.4.2 on Windows - condor-reuse-slot1 logon problems



Thanks for the responses. The StarterLog shows the same error on both an XP and a Vista machine, e.g.
(I have blanked out parts of the IP addresses)

05/10 09:53:33 Using config source: C:\condor\condor_config
05/10 09:53:33 Using local config sources: 
05/10 09:53:33    C:\condor/condor_config.local
05/10 09:53:33 DaemonCore: Command Socket at <155.*.*.*:1977>
05/10 09:53:33 GLEXEC_JOB not supported on this platform; ignoring
05/10 09:53:33 Setting resource limits not implemented!
05/10 09:53:33 Communicating with shadow <155.*.*.*:1610>
05/10 09:53:33 Submitting machine is "cs**.essex.ac.uk"
05/10 09:53:33 setting the orig job name in starter
05/10 09:53:33 setting the orig job iwd in starter
05/10 09:53:34 LogonUser(condor-reuse-slot1, ... ) failed with status 1385
05/10 09:53:34 ERROR "Failed to create a user nobody" at line 450 in file ..\src\condor_c++_util\uids.cpp
05/10 09:53:34 ERROR "LocalUserLog::logStarterError(Failed to create a user nobody) called before init()" at line 222 in file ..\src\condor_starter.V6.1\local_user_log.cpp
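
For reference, status 1385 in the LogonUser line is the Win32 error ERROR_LOGON_TYPE_NOT_GRANTED, which is the same condition the Security event quoted further down reports as NTSTATUS 0xc000015b. Below is a minimal Python sketch for pulling the status out of a StarterLog line and naming it. The function name, the regular expression, and the short code table are illustrative assumptions, not part of Condor; only a few common logon codes are listed.

```python
import re

# A few Win32 error codes that LogonUser can return (values from winerror.h).
# This table is deliberately short; it is not a complete list.
WIN32_LOGON_ERRORS = {
    1326: "ERROR_LOGON_FAILURE",
    1331: "ERROR_ACCOUNT_DISABLED",
    1385: "ERROR_LOGON_TYPE_NOT_GRANTED",
    1909: "ERROR_ACCOUNT_LOCKED_OUT",
}

# The NTSTATUS form of the same condition, as shown in the Security event log.
NTSTATUS_LOGON_ERRORS = {
    0xC000015B: "STATUS_LOGON_TYPE_NOT_GRANTED",
}

# Hypothetical pattern matching the StarterLog line format shown above.
LOGON_LINE = re.compile(
    r"LogonUser\((?P<account>[^,]+),.*\) failed with status (?P<status>\d+)"
)


def explain_logon_failure(line):
    """Return (account, status, symbolic name), or None if the line does not match."""
    match = LOGON_LINE.search(line)
    if match is None:
        return None
    status = int(match.group("status"))
    name = WIN32_LOGON_ERRORS.get(status, "unknown")
    return match.group("account"), status, name


if __name__ == "__main__":
    log_line = "05/10 09:53:34 LogonUser(condor-reuse-slot1, ... ) failed with status 1385"
    print(explain_logon_failure(log_line))
    print(NTSTATUS_LOGON_ERRORS[int("0xc000015b", 16)])
```

Both codes say the account was refused the requested logon type, not that the password was wrong or the account was locked out.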

I can recheck, but I think this was an error on the older Condor version too. 

It has been many months since Condor began failing, and I was under the impression that the move to Vista was the cause. This is apparently not the case, as even an XP build is now failing.
The Condor service runs under the Local System account, and condor-reuse-slot1 is not locked out.

Kevan


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum
Sent: 07 May 2010 15:44
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor 7.4.2 on Windows - condor-reuse-slot1 logon problems

Timothy St. Clair wrote:
> inline
> 
> On Wed, 2010-05-05 at 16:09 +0100, Wilding, Kevan A wrote:
>> Dear all,
>>
>> I am struggling to get a Condor pool running using a
>> variety of Windows machines. A lot of the recent postings have cleared
>> up many of the problems.
>>
>> As regards the latest Vista build, and version 7.4.2 installed, I am
>> getting condor-reuse-slot1 logon errors. This was also very similar on
>> XP. A simple Hello java program will submit to the pool, and sit idle,
>> even though it is matched to a machine in the pool. Eventually I
>> remove the job, and go through all Logs on Master & Local machines,
>> and also check the Event Viewer. 
>>
> 
> 2 ?'s:
> 1.) Are the machines idle?
> 
> 2.) What is the result of:
> condor_config_val START
> condor_config_val SUSPEND
> 
> for testing purposes *only* you may want to set START=TRUE and
> SUSPEND=FALSE and see if your jobs run.  If they do, then the kbdd is
> suspect. 
> 

What makes you think the kbdd has anything to do w/ this?

I think Kevan was on the right track, and the big clue here is the log 
below, which implies that the condor_starter failed to log in to the 
condor-reuse-slot1 account.

When Condor attempts to start a job on Windows, if it is not configured 
to run the job as the submitting user, it will run the job as a "condor 
slot account", e.g. condor-reuse-slot1.  If the account does not exist, 
the condor_starter will create it and place it into the USERS group. If 
the account does exist, it will enable it. Next the starter will assign 
a new randomly generated password to the account, and then use that 
password to log in so it can run the job as condor-reuse-slot1 (instead 
of running the job as local system!) - the event log snippets below 
imply that some policy is preventing this login from succeeding. When the 
job completes, the starter will disable the account condor-reuse-slot1.
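
To make the sequence above concrete, here is a small Python model of it. This is only a sketch of the steps as described in this message, not the condor_starter source: the Account class, the LogonDenied exception, run_job_as_slot_user, and the logon_allowed callback are all hypothetical names, and the accounts live in a plain dict, not in the Windows account database.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical in-memory stand-in for a local Windows account."""
    name: str
    enabled: bool = False
    password: str = ""
    groups: set = field(default_factory=set)


class LogonDenied(Exception):
    """Raised when the (simulated) machine policy refuses the login."""


def run_job_as_slot_user(accounts, name, logon_allowed, job):
    # Step 1: create the slot account in the Users group if it is missing,
    # otherwise re-enable the existing one.
    account = accounts.get(name)
    if account is None:
        account = Account(name=name, groups={"Users"})
        accounts[name] = account
    account.enabled = True

    # Step 2: assign a freshly generated random password.
    account.password = secrets.token_urlsafe(16)

    # Step 3: log in with that password. A local policy can still refuse the
    # login even though the account exists and the password is correct.
    if not logon_allowed(account):
        # What the real starter does with the account at this point is not
        # described in the thread, so the model leaves it untouched.
        raise LogonDenied(f"policy refused login for {name}")

    # Step 4: run the job as the slot user, then disable the account.
    result = job(account)
    account.enabled = False
    return result


if __name__ == "__main__":
    accounts = {}
    print(run_job_as_slot_user(accounts, "condor-reuse-slot1",
                               lambda acct: True,
                               lambda acct: f"job ran as {acct.name}"))
```

Passing a logon_allowed callback that returns False reproduces the situation in the logs: steps 1 and 2 succeed, and the failure happens at the login itself.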

One guess is that some policy pushed out to your Vista boxes is 
preventing the login.  A quick Google search found several possibilities 
and solutions, such as
http://www.eventid.net/display.asp?eventid=534&eventno=10&source=Security&phase=1

Another guess - are you running the Condor service as local SYSTEM (the 
default when using the Condor MSI installer), and not as some domain user?

Another thought - is the StarterLog on the execute machine reporting 
(maybe helpful) errors when attempting to start the job?

Another thought - take a peek on one of your execute nodes, and examine 
the local users using the Computer Management console.  Is there a user 
"condor-reuse-slot1" on the system?  Look at the properties of this 
account.  It is ok if the account is disabled (see above), but 
definitely NOT ok if the account is locked out.  Is the account a member 
of group USERS?
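
If you prefer the command line to the Computer Management console, the same two checks can be made from the text printed by "net user condor-reuse-slot1". The sketch below is an assumption-laden helper, not a Condor tool: it expects English-language output in which each label and its value are separated by two or more spaces, and the function names are made up for illustration. It does not detect a locked-out account; check that in the console.

```python
import re


def parse_net_user(text):
    """Split 'label   value' lines of `net user <name>` output into a dict."""
    fields = {}
    for line in text.splitlines():
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields


def summarize_account(fields):
    """Report whether the account is active and whether it is in group Users."""
    raw_groups = fields.get("Local Group Memberships", "")
    groups = [g.strip() for g in raw_groups.split("*") if g.strip()]
    return {
        "active": fields.get("Account active", "").lower() == "yes",
        "in_users": "Users" in groups,
    }


if __name__ == "__main__":
    # Hand-written sample in the assumed format, not captured from a real machine.
    sample = (
        "User name                    condor-reuse-slot1\n"
        "Account active               No\n"
        "Local Group Memberships      *Users\n"
    )
    print(summarize_account(parse_net_user(sample)))
```

An inactive (disabled) account between jobs is expected, as noted above; a missing Users membership would be worth investigating.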

hope the above helps,
Todd



> Cheers,
> Tim
> 
>> This has been a problem for some months now, and even
>> after re-installing the various Condor versions, on a variety of
>> Windows builds, a pool never appears to work.
>>
>> This seems to give the biggest clues, e.g.
>>
>> Any clues are very welcome.
>>
>> Thanks
>>
>> Kevan
>>
>> An account failed to log on.
>>
>> Subject:
>>     Security ID:        SYSTEM
>>     Account Name:       CSEELAB151$
>>     Account Domain:     CAMPUS
>>     Logon ID:           0x3e7
>>
>> Logon Type:             2
>>
>> Account For Which Logon Failed:
>>     Security ID:        NULL SID
>>     Account Name:       condor-reuse-slot1
>>     Account Domain:     cseelab151
>>
>> Failure Information:
>>     Failure Reason:     The user has not been granted the requested
>>                         logon type at this machine.
>>     Status:             0xc000015b
>>     Sub Status:         0x0
>>
>> Process Information:
>>     Caller Process ID:    0xd58
>>     Caller Process Name:  C:\condor\bin\condor_starter.exe
>>
>> Network Information:
>>     Workstation Name:        CSEELAB151
>>     Source Network Address:  -
>>     Source Port:             -
>>
>> Detailed Authentication Information:
>>     Logon Process:             Advapi
>>     Authentication Package:    Negotiate
>>     Transited Services:        -
>>     Package Name (NTLM only):  -
>>     Key Length:                0
>>
>> This event is generated when a logon request fails. It is generated on
>> the computer where access was attempted.
>>
>> The Subject fields indicate the account on the local system which
>> requested the logon. This is most commonly a service such as the
>> Server service, or a local process such as Winlogon.exe or
>> Services.exe.
>>
>> The Logon Type field indicates the kind of logon that was requested.
>> The most common types are 2 (interactive) and 3 (network).
>>
>> The Process Information fields indicate which account and process on
>> the system requested the logon.
>>
>> The Network Information fields indicate where a remote logon request
>> originated. Workstation name is not always available and may be left
>> blank in some cases.
>>
>> The authentication information fields provide detailed information
>> about this specific logon request.
>>     - Transited services indicate which intermediate services have
>>       participated in this logon request.
>>     - Package name indicates which sub-protocol was used among the
>>       NTLM protocols.
>>     - Key length indicates the length of the generated session key.
>>       This will be 0 if no session key was requested.
>>
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
> 


-- 
Todd Tannenbaum                       University of Wisconsin-Madison
Center for High Throughput Computing  Department of Computer Sciences
tannenba@xxxxxxxxxxx                  1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                 Madison, WI 53706-1685
