On 05/20/2013 02:53 PM, Brian Candler wrote:
> On Mon, May 20, 2013 at 02:38:31PM -0400, Dan Shea wrote:
> > Adding STARTD to the gatekeeper node caused all jobs queued to be
> > executed on the gatekeeper. It seems the gatekeeper machine can not
> > see the execute-only nodes? I'm not sure what I have missed in the
> > configuration to cause this behaviour? Network wise they all see each
> > other just fine, hostnames resolved via /etc/hosts entries.
>
> Have you set ALLOW_WRITE, if so to what?

Currently, I am attempting to limit things to the local network, perhaps
this is not the correct way to wildcard a subnet?

    ALLOW_WRITE = 10.11.114.*

> > SchedLog:
> > 05/17/13 13:41:21 (pid:9037) WARNING: forward resolution of
> > localhost.localdomain doesn't match 10.11.114.220!
>
> This does look like a problem. What does "hostname" show on all the
> nodes? Do you have a "localhost.localdomain" entry in /etc/hosts?
> Normally it would be for 127.0.0.1, don't be tempted to set it to the
> external IP of your machine.

hostname will return node00 - node09 depending upon which node you are on.
The /etc/hosts localhost.localdomain entry has not been modified, it still
points to loopback. I think I do see the issue however.

    127.0.0.1       node00 localhost localhost.localdomain
    10.11.114.220   node00

Thanks Brian, let me correct the /etc/hosts entries and see if it fixes
things a bit.

Regards,

Dan

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

-- 
Dan Shea - daniel_shea2@xxxxxxxxxxxxxxx
Senior Systems Administrator, West Quad Computing Group
Harvard Medical School

"Charlie was a chemist,
But Charlie is no more.
For what he thought was H2O,
Was H2SO4."
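
A note on the /etc/hosts entries quoted above: node00 is listed both on the 127.0.0.1 line and on the 10.11.114.220 line, which is the conflict Dan says he spotted. As a minimal sketch for catching the same pattern on the other nodes, the Python below scans hosts-file text for names mapped to both a loopback and a non-loopback address. The function name loopback_conflicts is made up for illustration and is not part of HTCondor; the sample text is just the two lines from this thread.

```python
import ipaddress


def loopback_conflicts(hosts_text):
    """Return names mapped to both a loopback and a non-loopback address.

    Hypothetical helper for illustration only; not an HTCondor tool.
    """
    loopback, routable = set(), set()
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            # Skip lines that do not start with a parseable IP address.
            continue
        target = loopback if addr.is_loopback else routable
        target.update(fields[1:])
    return sorted(loopback & routable)


# The two entries quoted in this thread.
sample = """\
127.0.0.1 node00 localhost localhost.localdomain
10.11.114.220 node00
"""

print(loopback_conflicts(sample))  # ['node00']
```

To check a real node, one could pass in the contents of /etc/hosts instead of the sample string, e.g. loopback_conflicts(open("/etc/hosts").read()). An empty list only means this particular pattern was not found; it does not prove name resolution is otherwise correct.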