If your VM session exists only to run jobs, have you tried setting your
START expression to TRUE?
You should not need a credd unless you are running as owner, which is
not the default.
Also, your CREDD_HOST *must be* a Windows machine. It may be too early in
the a.m., but I can't tell from the logs below whether that is the case.
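As a rough sketch of the two suggestions above (not a tested
configuration), the condor_config.local on the Windows execute node might
look something like the following. It assumes the VM is dedicated to
running jobs and that jobs do not need to run as the submitting owner. The
hostname in the commented-out line is a hypothetical placeholder, not a
machine from this thread.

```
# Dedicated execute node: always willing to start jobs
START = TRUE

# Assumption: jobs run under Condor's default accounts, not as the owner,
# so no credd is needed
STARTER_ALLOW_RUNAS_OWNER = False

# Only if run-as-owner is really required: point CREDD_HOST at a Windows
# machine running the credd (placeholder hostname)
# CREDD_HOST = windows-credd.example.com
```

After editing the file, running condor_reconfig on that node should pick
up the new settings.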
Cheers,
Tim
On Thu, 2011-08-18 at 19:19 -0400, Jason Herman wrote:
hi-
Here are the machines i'm setting up:
1) Mac (intel osx) - as condor central server
2) Parallels VM running Windows within the mac as execute machine
3) separate windows desktop
4) after everything else works: EC2 windows machines - i suppose
running as a cluster that attaches as a flock. (perhaps with
cyclecomputing)
I have tried (for days):
* playing with various configurations of condor_config &
condor_config.local on both machines.
* taking down firewalls on both sides.
* reading manuals, googling, etc.
* running condor_store_cred with various settings on both sides
STATUS:
So far I have Condor up and running on the MAC as an execute,
submit, manage installation. I successfully ran a test job. The
windows execute node is up but i can't test it until i get credd
security working properly (i think that's the problem). I can see
the windows and mac slots from both sides (see below).
When i submit a job from MAC that has windows requirements it
doesn't run. Presently, condor_q -analyze says "not yet been
considered by the matchmaker" and "match but reject the job for
unknown reasons." Under a previously attempted configuration it was
"reject your job because of their own requirements" , the Windows
slot would go to 'Matched', but the job would be Idle and the logs
would suggest a security issue.
I can't even condor_rm the Idle jobs on the MAC side. I'm guessing
their being matched to Windows ceded their control:
------
jimi:~ root# condor_q
-- Submitter: jimi.westell.com : <169.254.177.117:49371> : jimi.westell.com
 ID     OWNER  SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
 11.0   Jason  8/17 22:10   0+01:46:05  I   0    0.0   sample-job 60
 13.0   Jason  8/18 01:12   0+01:24:43  I   0    0.0   sample-job 60
 14.0   Jason  8/18 01:24   0+00:02:49  I   0    0.0   sample-job 60
 15.0   Jason  8/18 01:53   0+00:00:00  I   0    0.0   sample-job 60
4 jobs; 4 idle, 0 running, 0 held
jimi:~ root# condor_rm 11.0
AUTHENTICATE:1003:Failed to authenticate with any method
No result found for job 11.0
------
CONFIGURATIONS:
-------- condor_config.local on MAC:
--------
CREDD_HOST = 10.211.55.10
STARTER_ALLOW_RUNAS_OWNER = True
CREDD_CACHE_LOCALLY = True
ALLOW_CONFIG = root@$(CONDOR_HOST), *
SEC_CONFIG_NEGOTIATION = REQUIRED
SEC_CONFIG_AUTHENTICATION = REQUIRED
SEC_CONFIG_ENCRYPTION = REQUIRED
SEC_CONFIG_INTEGRITY = REQUIRED
SEC_PASSWORD_FILE = /usr/local/condor/etc/pool_password
-------- condor_config.local on Windows:
--------
CREDD_HOST = xx.xxx.55.10
STARTER_ALLOW_RUNAS_OWNER = True
CREDD_CACHE_LOCALLY = True
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
ALLOW_CONFIG = *
SEC_CONFIG_NEGOTIATION = REQUIRED
SEC_CONFIG_AUTHENTICATION = REQUIRED
SEC_CONFIG_ENCRYPTION = REQUIRED
SEC_CONFIG_INTEGRITY = REQUIRED
------- condor_config on Windows
------- i made this low security just try to get it working:
-------
ALLOW_WRITE = *
ALLOW_READ = *
#... not sure what else you need to see
LOG FILES:
--------- CredLog - on windows
--------- this is after turning MAC & WIN firewalls off - not a perm
solution, but not working anyway:
---------
08/18/11 14:42:18 Failed to start non-blocking update to
<xxx.xxx.1.21:9618>.
08/18/11 14:42:18 Return from Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
0.0000s
08/18/11 14:47:18 Calling Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:47:18 Return from Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
0.0000s
08/18/11 14:47:18 Calling Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:47:18 SECMAN: required authentication with
<xxx.xxx.1.21:9618> failed, so aborting command UPDATE_AD_GENERIC.
08/18/11 14:47:18 ERROR: SECMAN:2004:Failed to create security
session to <xxx.xxx.1.21:9618> with TCP.
|AUTHENTICATE:1003:Failed to authenticate with any method
08/18/11 14:47:18 Failed to start non-blocking update to
<xxx.xxx.1.21:9618>.
08/18/11 14:47:18 Return from Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
0.0000s
08/18/11 14:52:39 attempt to connect to <xxx.xxx.1.21:9618> failed:
timed out after 20 seconds.
08/18/11 14:52:39 Calling Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:52:39 ERROR: SECMAN:2004:Failed to create security
session to <xxx.xxx.1.21:9618> with TCP.
|SECMAN:2003:TCP connection to <xxx.xxx.1.21:9618> failed.
08/18/11 14:52:39 Failed to start non-blocking update to
<xxx.xxx.1.21:9618>.
08/18/11 14:52:39 Return from Handler
<SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC>
0.0000s
--------- MasterLog - on windows
---------
---------
08/18/11 14:51:50 condor_read(): timeout reading 21 bytes from
<10.211.55.10:53043>.
08/18/11 14:51:50 IO: Failed to read packet header
08/18/11 14:51:50 store_pool_cred: failed to receive all parameters
COMMAND LINE OUTPUT:
---------- condor_status - on windows
---------- Manual says to run this when you are done, doesn't
mention the command
---------- only works on the windows side:
C:\Users\Administrator>condor_status -f "%s\t" Name -f "%s\n"
ifThenElse(isUndefined(LocalCredd),\"UNDEF"\",LocalCredd)
slot1@JASONHERMANB752 UNDEF
slot1@xxxxxxxxxxxxxxxx UNDEF
slot2@JASONHERMANB752 UNDEF
slot2@xxxxxxxxxxxxxxxx UNDEF
slot3@xxxxxxxxxxxxxxxx UNDEF
slot4@xxxxxxxxxxxxxxxx UNDEF
slot5@xxxxxxxxxxxxxxxx UNDEF
slot6@xxxxxxxxxxxxxxxx UNDEF
slot7@xxxxxxxxxxxxxxxx UNDEF
slot8@xxxxxxxxxxxxxxxx UNDEF
------- condor_status - MAC (identical on windows)
-------
-------
jimi:log root# condor_status
Name                OpSys    Arch    State      Activity  LoadAv  Mem   ActvtyTime
slot1@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.210   1024  0+19:09:01
slot2@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  1+11:24:12
slot3@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  1+03:18:37
slot4@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  0+23:14:03
slot5@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  0+15:05:52
slot6@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  0+11:04:54
slot7@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  0+06:59:54
slot8@xxxxxxxxxxxx  OSX      X86_64  Unclaimed  Idle      0.000   1024  1+15:27:42
slot1@JASONHERMANB  WINNT60  INTEL   Unclaimed  Idle      0.120   1023  0+00:00:04
slot2@JASONHERMANB  WINNT60  INTEL   Unclaimed  Idle      0.100   1023  0+00:00:02

               Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
INTEL/WINNT60      2      0        0          2        0           0         0
   X86_64/OSX      8      0        0          8        0           0         0
        Total     10      0        0         10        0           0         0
-------- condor_store_cred on Windows:
--------
--------
C:\Users\Administrator>condor_store_cred -c add
Account: condor_pool@JASONHERMANB752
Enter password:
Operation failed.
Make sure you have CONFIG access to the target Master.
thanks kindly for any assistance, jason
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/