I'm trying out Condor 7.6.1 (installed via the rhap.stripped.tar.gz)
and I get the following in my GAHP log:
06/22/11 09:33:37 Command(AMAZON_VM_STATUS_ALL) got error(code:Client,
msg:End of file or no input: Operation interrupted or timed out
06/22/11 09:38:38 Call to DescribeInstances failed: SOAP 1.1 fault:
SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out"
Detail: [no detail]
06/22/11 09:38:38 Command(AMAZON_VM_STATUS_ALL) got error(code:Client,
msg:End of file or no input: Operation interrupted or timed out
06/22/11 09:42:08 EOF reached on pipe 0
06/22/11 09:42:08 stdin buffer closed, exiting
06/22/11 09:47:19 Call to DescribeInstances failed: SOAP 1.1 fault:
SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out"
Detail: [no detail]
06/22/11 09:47:19 Command(AMAZON_VM_STATUS_ALL) got error(code:Client,
msg:End of file or no input: Operation interrupted or timed out
06/22/11 09:48:33 EOF reached on pipe 0
06/22/11 09:48:33 stdin buffer closed, exiting
06/22/11 09:49:18 Call to DescribeInstances failed: SOAP 1.1 fault:
SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out"
Detail: [no detail]
06/22/11 09:49:18 Command(AMAZON_VM_STATUS_ALL) got error(code:Client,
msg:End of file or no input: Operation interrupted or timed out
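To see how often these failures recur, here is a rough Python sketch that tallies the timestamped entries in a log like the one above. The function name summarize_gahp_log and the category labels are made up for illustration; only the message prefixes and the MM/DD/YY HH:MM:SS timestamp layout are taken from the log itself.

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp layout as it appears in the log above: MM/DD/YY HH:MM:SS
STAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def summarize_gahp_log(lines):
    """Count timestamped entries by kind and return gaps between failures.

    Returns (counts, gaps) where gaps is a list of seconds between
    consecutive "Call to DescribeInstances failed" entries.
    """
    counts = Counter()
    failure_times = []
    for line in lines:
        match = STAMP.match(line.strip())
        if not match:
            continue  # untimestamped continuation of a multi-line message
        when = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
        text = match.group(2)
        if text.startswith("Call to DescribeInstances failed"):
            counts["describe_failed"] += 1
            failure_times.append(when)
        elif text.startswith("Command(AMAZON_VM_STATUS_ALL) got error"):
            counts["status_all_error"] += 1
        elif "stdin buffer closed, exiting" in text:
            counts["gahp_exit"] += 1
    gaps = [
        (later - earlier).total_seconds()
        for earlier, later in zip(failure_times, failure_times[1:])
    ]
    return counts, gaps
```

It takes any iterable of lines, so an open file handle on whatever AMAZON_GAHP_LOG expands to can be passed in directly.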
The submission file is simple:
universe = grid
grid_resource = amazon https://ec2.amazonaws.com/
periodic_release = NumHolds < 3
+NumHolds = 0
periodic_remove = NumHolds >= 3 || (JobStatus == 2 && time()-ShadowBday > 1*60*60)
executable = RunEC2VM
amazon_keypair_file = keypair.$(Process)
amazon_ami_id = ami-4ed12d27
amazon_instance_type = m1.large
amazon_user_data = condor:landphil.rocksclusters.org:40000:50000
amazon_private_key = /home/phil/.ec2/pk.pem
amazon_public_key = /home/phil/.ec2/cert.pem
queue 1
And the condor_config_val output (the salient settings, I think):
$ condor_config_val -dump | grep -i amazon
AMAZON_GAHP = $(SBIN)/amazon_gahp
AMAZON_GAHP_LOG = /tmp/AmazonGahpLog.$(USERNAME)
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_AMAZON = 20
and
$ condor_config_val -dump | grep -i ssl
SOAP_SSL_CA_FILE = /etc/pki/tls/cert.pem
SOAP_SSL_SKIP_HOST_CHECK = True
I've tried both with and without SOAP_SSL_SKIP_HOST_CHECK.
The file that SOAP_SSL_CA_FILE points to exists.
If I try WITHOUT the
SOAP_SSL_CA_FILE = /etc/pki/tls/cert.pem
then I get
Call to DescribeInstances failed: SOAP 1.1 fault: SOAP-ENV:Client [no subcode]
"SSL_ERROR_SSL
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed"
Detail: SSL connect failed in tcp_connect()
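One way to separate a CA-bundle problem from a Condor problem is to try the same TLS handshake outside the GAHP. The following is a minimal Python sketch, not anything Condor does internally: the helper names build_context and check_endpoint are hypothetical, and Python's ssl module may be linked against a different OpenSSL than amazon_gahp, so a pass or failure here is only a hint.

```python
import socket
import ssl


def build_context(ca_file=None):
    """Return a verifying TLS context.

    With ca_file=None the system default trust store is used;
    otherwise only the given PEM bundle is loaded.
    """
    return ssl.create_default_context(cafile=ca_file)


def check_endpoint(host, port=443, ca_file=None, timeout=10.0):
    """Attempt a TLS handshake and return (ok, message)."""
    try:
        context = build_context(ca_file)
    except (OSError, ssl.SSLError) as err:
        # Missing or unreadable bundle, or a file with no usable certificates
        return False, "could not load CA file: " + str(err)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, "handshake ok, protocol " + str(tls.version())
    except ssl.SSLCertVerificationError as err:
        # Server certificate did not chain to anything in the trust store
        return False, "certificate verify failed: " + str(err.verify_message)
    except ssl.SSLError as err:
        return False, "TLS error: " + str(err)
    except OSError as err:
        # DNS failure, refused connection, or timeout
        return False, "connection error: " + str(err)
```

Calling check_endpoint("ec2.amazonaws.com", 443, "/etc/pki/tls/cert.pem") and then again with ca_file=None would show whether that bundle can verify the endpoint's certificate chain at all.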
Right now I'm flummoxed.
Thanks,
Phil
--
Philip Papadopoulos, PhD
University of California, San Diego
858-822-3628 (Ofc)
619-331-2990 (Fax)