Re: [Condor-users] credd issues: heterogenous system MAC-central; WIN-execute + EC2 (win) when this works
- Date: Tue, 23 Aug 2011 08:46:44 -0500
- From: "Timothy St. Clair" <tstclair@xxxxxxxxxx>
- Subject: Re: [Condor-users] credd issues: heterogenous system MAC-central; WIN-execute + EC2 (win) when this works
If your VM session exists only to run jobs, have you tried setting your
START expression to TRUE?
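
For a dedicated execute node, that could look something like the following in condor_config.local. This is only a sketch: it assumes the VM should accept any job at any time, and the SUSPEND/PREEMPT/KILL lines are optional additions of mine, not something taken from your configuration.

```
# condor_config.local on the execute VM (sketch; assumes a dedicated node)
# Always willing to start jobs, regardless of keyboard or CPU activity.
START = TRUE
# Optional, assumed: never suspend or evict jobs on this node.
SUSPEND = FALSE
PREEMPT = FALSE
KILL = FALSE
```

Run condor_reconfig on that machine afterwards so the startd picks up the change.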
You should not need a credd unless you are running jobs as the submitting
owner (run_as_owner), which is not the default.
Also, your CREDD_HOST *must be* a Windows machine. It may be too early in
the a.m., but I can't discern from the logs below whether that is the case.
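
As a rough sketch of what I mean, assuming a hypothetical Windows host named winhost.example.com that is meant to run the credd (the hostname is a placeholder, substitute your own machine):

```
# On the Windows machine chosen to run the credd (hostname is a placeholder)
DAEMON_LIST = $(DAEMON_LIST), CREDD

# On every machine that needs to reach the credd
CREDD_HOST = winhost.example.com
```

This only matters if you keep STARTER_ALLOW_RUNAS_OWNER = True; if you do not need run_as_owner jobs, you can leave the credd out entirely.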
Cheers,
Tim
On Thu, 2011-08-18 at 19:19 -0400, Jason Herman wrote:
> > hi-
> >
> > Here are the machines i'm setting up:
> >
> > 1) Mac (intel osx) - as condor central server
> > 2) Parallels VM running Windows within the mac as execute machine
> > 3) separate windows desktop
> > 4) after everything else works: EC2 windows machines - i suppose
> > running as a cluster that attaches as a flock. (perhaps with
> > cyclecomputing)
> >
> > I have tried (for days):
> > * playing with various configurations of condor_config &
> > condor_config.local on both machines.
> > * taken down firewalls on both sides.
> > * read manuals, googled, etc..
> > * running condor_store_cred with various settings on both sides
> >
> > STATUS:
> > So far I have Condor up and running on the MAC as an execute,
> > submit, manage installation. I successfully ran a test job. The
> > windows execute node is up but i can't test it until i get credd
> > security working properly (i think that's the problem). I can see
> > the windows and mac slots from both sides (see below).
> >
> > When i submit a job from MAC that has windows requirements it
> > doesn't run. Presently, condor_q -analyze says "not yet been
> > considered by the matchmaker" and "match but reject the job for
> > unknown reasons." Under a previously attempted configuration it was
> > "reject your job because of their own requirements", the Windows
> > slot would go to 'Matched', but the job would be Idle and the logs
> > would suggest a security issue.
> >
> > I can't even condor_rm the Idle jobs on the MAC side. I'm guessing
> > their being matched to Windows ceded control of them:
> > ------
> > jimi:~ root# condor_q
> >
> >
> > -- Submitter: jimi.westell.com : <169.254.177.117:49371> : jimi.westell.com
> >  ID      OWNER   SUBMITTED     RUN_TIME   ST PRI SIZE CMD
> >
> >  11.0    Jason   8/17 22:10   0+01:46:05  I  0   0.0  sample-job 60
> >  13.0    Jason   8/18 01:12   0+01:24:43  I  0   0.0  sample-job 60
> >  14.0    Jason   8/18 01:24   0+00:02:49  I  0   0.0  sample-job 60
> >  15.0    Jason   8/18 01:53   0+00:00:00  I  0   0.0  sample-job 60
> >
> > 4 jobs; 4 idle, 0 running, 0 held
> >
> > jimi:~ root# condor_rm 11.0
> > AUTHENTICATE:1003:Failed to authenticate with any method
> > No result found for job 11.0
> > ------
> >
> >
> > CONFIGURATIONS:
> >
> >
> > -------- condor_config.local on MAC:
> > --------
> > CREDD_HOST = 10.211.55.10
> > STARTER_ALLOW_RUNAS_OWNER = True
> > CREDD_CACHE_LOCALLY = True
> > ALLOW_CONFIG = root@$(CONDOR_HOST), *
> > SEC_CONFIG_NEGOTIATION = REQUIRED
> > SEC_CONFIG_AUTHENTICATION = REQUIRED
> > SEC_CONFIG_ENCRYPTION = REQUIRED
> > SEC_CONFIG_INTEGRITY = REQUIRED
> > SEC_PASSWORD_FILE = /usr/local/condor/etc/pool_password
> >
> > -------- condor_config.local on Windows:
> > --------
> > CREDD_HOST = xx.xxx.55.10
> > STARTER_ALLOW_RUNAS_OWNER = True
> > CREDD_CACHE_LOCALLY = True
> > SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
> > ALLOW_CONFIG = *
> > SEC_CONFIG_NEGOTIATION = REQUIRED
> > SEC_CONFIG_AUTHENTICATION = REQUIRED
> > SEC_CONFIG_ENCRYPTION = REQUIRED
> > SEC_CONFIG_INTEGRITY = REQUIRED
> >
> > ------- condor_config on Windows
> > ------- i made this low security just to try to get it working:
> > -------
> > ALLOW_WRITE = *
> > ALLOW_READ = *
> > #... not sure what else you need to see
> >
> >
> > LOG FILES:
> >
> > --------- CredLog - on windows
> > --------- this is after turning MAC & WIN firewalls off - not a perm
> > solution, but not working anyway:
> > ---------
> > 08/18/11 14:42:18 Failed to start non-blocking update to
> > <xxx.xxx.1.21:9618>.
> > 08/18/11 14:42:18 Return from Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
> > 0.0000s
> > 08/18/11 14:47:18 Calling Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
> > 08/18/11 14:47:18 Return from Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
> > 0.0000s
> > 08/18/11 14:47:18 Calling Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
> > 08/18/11 14:47:18 SECMAN: required authentication with
> > <xxx.xxx.1.21:9618> failed, so aborting command UPDATE_AD_GENERIC.
> > 08/18/11 14:47:18 ERROR: SECMAN:2004:Failed to create security
> > session to <xxx.xxx.1.21:9618> with TCP.
> > |AUTHENTICATE:1003:Failed to authenticate with any method
> > 08/18/11 14:47:18 Failed to start non-blocking update to
> > <xxx.xxx.1.21:9618>.
> > 08/18/11 14:47:18 Return from Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
> > 0.0000s
> > 08/18/11 14:52:39 attempt to connect to <xxx.xxx.1.21:9618> failed:
> > timed out after 20 seconds.
> > 08/18/11 14:52:39 Calling Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
> > 08/18/11 14:52:39 ERROR: SECMAN:2004:Failed to create security
> > session to <xxx.xxx.1.21:9618> with TCP.
> > |SECMAN:2003:TCP connection to <xxx.xxx.1.21:9618> failed.
> > 08/18/11 14:52:39 Failed to start non-blocking update to
> > <xxx.xxx.1.21:9618>.
> > 08/18/11 14:52:39 Return from Handler
> > <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
> > 0.0000s
> >
> > --------- MasterLog - on windows
> > ---------
> > ---------
> > 08/18/11 14:51:50 condor_read(): timeout reading 21 bytes from
> > <10.211.55.10:53043>.
> > 08/18/11 14:51:50 IO: Failed to read packet header
> > 08/18/11 14:51:50 store_pool_cred: failed to receive all parameters
> >
> >
> > COMMAND LINE OUTPUT:
> >
> > ---------- condor_status - on windows
> > ---------- Manual says to run this when you are done, doesn't
> > mention the command
> > ---------- only works on the windows side:
> > C:\Users\Administrator>condor_status -f "%s\t" Name -f "%s\n"
> > ifThenElse(isUndefined(LocalCredd),\"UNDEF\",LocalCredd)
> > slot1@JASONHERMANB752 UNDEF
> > slot1@xxxxxxxxxxxxxxxx UNDEF
> > slot2@JASONHERMANB752 UNDEF
> > slot2@xxxxxxxxxxxxxxxx UNDEF
> > slot3@xxxxxxxxxxxxxxxx UNDEF
> > slot4@xxxxxxxxxxxxxxxx UNDEF
> > slot5@xxxxxxxxxxxxxxxx UNDEF
> > slot6@xxxxxxxxxxxxxxxx UNDEF
> > slot7@xxxxxxxxxxxxxxxx UNDEF
> > slot8@xxxxxxxxxxxxxxxx UNDEF
> >
> >
> > ------- condor_status - MAC (identical on windows)
> > -------
> > -------
> > jimi:log root# condor_status
> >
> > Name               OpSys   Arch   State     Activity LoadAv Mem  ActvtyTime
> >
> > slot1@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.210  1024 0+19:09:01
> > slot2@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 1+11:24:12
> > slot3@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 1+03:18:37
> > slot4@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 0+23:14:03
> > slot5@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 0+15:05:52
> > slot6@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 0+11:04:54
> > slot7@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 0+06:59:54
> > slot8@xxxxxxxxxxxx OSX     X86_64 Unclaimed Idle     0.000  1024 1+15:27:42
> > slot1@JASONHERMANB WINNT60 INTEL  Unclaimed Idle     0.120  1023 0+00:00:04
> > slot2@JASONHERMANB WINNT60 INTEL  Unclaimed Idle     0.100  1023 0+00:00:02
> >               Total Owner Claimed Unclaimed Matched Preempting Backfill
> >
> > INTEL/WINNT60     2     0       0         2       0          0        0
> > X86_64/OSX        8     0       0         8       0          0        0
> >
> >         Total    10     0       0        10       0          0        0
> >
> >
> > -------- condor_store_cred on Windows:
> > --------
> > --------
> > C:\Users\Administrator>condor_store_cred -c add
> > Account: condor_pool@JASONHERMANB752
> >
> > Enter password:
> >
> > Operation failed.
> > Make sure you have CONFIG access to the target Master.
> >
> >
> > thanks kindly for any assistance, jason
> >
> >
> >