Hi Zach!

Thanks for that explanation, I was obviously looking in the wrong direction there.

I changed the permissions like you said, but I am still getting the same error from my collector about reading the file. Is there a debug setting I can add to get it to tell me more about why the read_secure_file call failed?

-Wes

Wesley Taylor - Cluster Manager
Numerica Corporation (www.numerica.us)
5042 Technology Parkway #100
Fort Collins, Colorado 80528
(970) 207 2232
wesley.taylor@xxxxxxxxxxx

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Zach Miller
Sent: Monday, June 22, 2020 10:02 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] - Re: [HTCondor-users] Debugging HTCondor Authentication Errors

Hi Wes,

(First, no problem sending email any time you have a question!)

You wrote:
> It looks like for some reason condor can't read the POOL file, even
> though the file (and its parent directory) are owned by the user and
> group condor:condor, and everyone has execute permissions on /etc and
> /etc/condor. I also made selinux permissive just in case that was the
> issue.

Clearly the error message here should be better. When HTCondor says, "read_secure_file(/etc/condor/password.d/POOL) failed!", in this case it's not because it couldn't read the file but that the file was TOO permissive. The file for PASSWORD authentication should be chmod 600 and owned by root:root. Try fixing that and let us know if that did the trick.

Cheers,
-zach

On 6/22/20, 7:11 PM, "HTCondor-users on behalf of wesley.taylor@xxxxxxxxxxx" <htcondor-users-bounces@xxxxxxxxxxx on behalf of wesley.taylor@xxxxxxxxxxx> wrote:

Hello,

I feel a little bad for emailing everyone twice in the same day, but I am still getting familiar with HTCondor.
I tested a minicondor last week and was having a ball of a time, but now I am looking to scale up and have hit some hiccups. I am following the "Setting up an HTCondor Pool" guide (https://htcondor.readthedocs.io/en/latest/admin-manual/quick-start-condor-pool.html) but with some minor modifications to try and make the test setup better match the production system's architecture. I simply changed the configuration so one machine had the roles of "Submit" and "Central Manager" and I have two "Execute" machines located on the same network.

I went through the guide, but when I started everything up they weren't authenticating with one another. On both the "Execute" machines I am getting the following from my StartLog (I set STARTD_DEBUG = D_SECURITY:2 in my config):

________________________________________________________________________
06/22/20 17:33:14 SECMAN: new session, doing initial authentication.
06/22/20 17:33:14 SECMAN: authenticating RIGHT NOW.
06/22/20 17:33:14 SECMAN: AuthMethodsList: PASSWORD
06/22/20 17:33:14 SECMAN: Auth methods: PASSWORD
06/22/20 17:33:14 AUTHENTICATE: setting timeout for <192.168.0.69:9618> to 20.
06/22/20 17:33:14 AUTHENTICATE: in authenticate( addr == '<192.168.0.69:9618>', methods == 'PASSWORD')
06/22/20 17:33:14 AUTHENTICATE: can still try these methods: PASSWORD
06/22/20 17:33:14 HANDSHAKE: in handshake(my_methods = 'PASSWORD')
06/22/20 17:33:14 HANDSHAKE: handshake() - i am the client
06/22/20 17:33:14 HANDSHAKE: sending (methods == 512) to server
06/22/20 17:33:14 HANDSHAKE: server replied (method = 512)
06/22/20 17:33:14 AUTHENTICATE: will try to use 512 (PASSWORD)
06/22/20 17:33:14 AUTHENTICATE: do_authenticate is 1.
06/22/20 17:33:14 PW.
06/22/20 17:33:14 PW: getting name.
06/22/20 17:33:14 PW: Generating ra.
06/22/20 17:33:14 PW: Client sending.
06/22/20 17:33:14 Client sending: 0, 19(condor_pool@worker1), 256
06/22/20 17:33:14 PW: Client receiving.
06/22/20 17:33:14 Server sent status indicating not OK.
06/22/20 17:33:14 PW: Client received ERROR from server, propagating
06/22/20 17:33:14 PW: CLient sending two.
06/22/20 17:33:14 In client_send_two.
06/22/20 17:33:14 Can't send null for random string.
06/22/20 17:33:14 Client sending: 0() 0 0
06/22/20 17:33:14 Sent ok.
06/22/20 17:33:14 AUTHENTICATE: method 512 (PASSWORD) failed.
06/22/20 17:33:14 AUTHENTICATE: can still try these methods:
06/22/20 17:33:14 HANDSHAKE: in handshake(my_methods = '')
06/22/20 17:33:14 HANDSHAKE: handshake() - i am the client
06/22/20 17:33:14 HANDSHAKE: sending (methods == 0) to server
06/22/20 17:33:14 HANDSHAKE: server replied (method = 0)
06/22/20 17:33:14 AUTHENTICATE: no available authentication methods succeeded!
06/22/20 17:33:14 SECMAN: required authentication with collector 192.168.0.69 failed, so aborting command DC_START_TOKEN_REQUEST.
06/22/20 17:33:14 Failed to request a new token: DAEMON:1:failed to start command for token request with remote daemon at '<192.168.0.69:9618>'.|AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using PASSWORD
________________________________________________________________________

So then I went and looked at the CollectorLog on the Manager:

________________________________________________________________________
06/22/20 18:03:15 DC_AUTHENTICATE: required authentication of 192.168.0.70 failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using PASSWORD
06/22/20 18:03:15 read_password_from_filename(): read_secure_file(/etc/condor/password.d/POOL) failed!
06/22/20 18:03:15 read_password_from_filename(): read_secure_file(/etc/condor/password.d/POOL) failed!
________________________________________________________________________

(Don't pay attention to the fact the timestamps are really far apart, I have just been trying some more things in the past little bit)

It looks like for some reason condor can't read the POOL file, even though the file (and its parent directory) are owned by the user and group condor:condor, and everyone has execute permissions on /etc and /etc/condor. I also made selinux permissive just in case that was the issue.

Does anyone have any further steps I can take to figure out why this read is failing?

Thank you!
-Wes

Wesley Taylor - Cluster Manager
Numerica Corporation (http://www.numerica.us/)
5042 Technology Parkway #100
Fort Collins, Colorado 80528
(970) 207 2232
wesley.taylor@xxxxxxxxxxx

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
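
Zach's reply above says the PASSWORD file should be chmod 600 and owned by root:root. One way to confirm that from outside HTCondor is a small standalone check like the sketch below. The helper name check_secure_file and its parameters expected_uid and expected_mode are hypothetical and chosen only for illustration. This does not reproduce whatever HTCondor's own read_secure_file does internally. It only compares the file's owner and permission bits against the values from Zach's reply.

```python
import os
import stat


def check_secure_file(path, expected_uid=0, expected_mode=0o600):
    """Return a list of problems found; an empty list means the file looks OK.

    Hypothetical helper: it checks only the owner uid and the permission
    bits, using the values from Zach's reply (root owner, mode 600) as
    defaults. It is not HTCondor's actual check.
    """
    problems = []
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # keep only the permission bits

    if st.st_uid != expected_uid:
        problems.append(f"owner uid is {st.st_uid}, expected {expected_uid}")

    # Any bit set outside expected_mode means the file is too permissive.
    if mode & ~expected_mode:
        problems.append(
            f"mode is {mode:04o}, which grants more than {expected_mode:04o}"
        )
    return problems
```

Calling check_secure_file("/etc/condor/password.d/POOL") on the central manager and printing the returned list would show whether the owner or the mode still differs from root and 600. An empty list means both match, in which case the cause of the remaining error lies somewhere this sketch does not look.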