[HTCondor-users] Inconsistent output of "condor_q -glo"?
- Date: Thu, 18 Nov 2021 09:50:23 +0100
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
- Subject: [HTCondor-users] Inconsistent output of "condor_q -glo"?
Good morning,
After a major reconfiguration of our Hypatia cluster (a number of jobs had been
put on hold beforehand), I'm now getting inconsistent output from condor_q:
root@condormaster:.# condor_status -schedd
Name                                       Machine             RunningJobs  IdleJobs  HeldJobs
hypatia1.hypatia.local@xxxxxxxxxxxxxxxxxx  hypatia1.my.domain            0         0         0
hypatia2.hypatia.local@xxxxxxxxxxxxxxxxxx  hypatia2.my.domain            0         0       183
hypatia3.hypatia.local@xxxxxxxxxxxxxxxxxx  hypatia3.my.domain            0         0         0

       TotalRunningJobs  TotalIdleJobs  TotalHeldJobs
Total                 0              0            183
root@condormaster:.# condor_q -schedd hypatia1.my.domain
All queues are empty
root@condormaster:.# condor_q -schedd hypatia2.my.domain
All queues are empty
root@condormaster:.# condor_q -schedd hypatia3.my.domain
All queues are empty
(The result is the same if I use the "hypatia*.hypatia.local" names instead.)
root@condormaster:.# condor_q -glo
-- Failed to fetch ads from: <10.150.100.102:4597?addrs=10.150.100.102-4597&alias=hypatia2.my.domain> : hypatia2.my.domain
AUTHENTICATE:1003:Failed to authenticate with any method
AUTHENTICATE:1004:Failed to authenticate using FS
root@condormaster:.#
I have compared the output of "condor_config_val -dump" for hypatia1 and hypatia2
and see no differences apart from a few machine- and IP-specific lines.
What's behind those AUTHENTICATE:100{3,4} failures?
In the ScheddLog, I see:
DC_AUTHENTICATE: reason for authentication failure: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to lstat(/tmp/FS_XXXvkEMCP)
Since /tmp has permissions 1777, what causes the lstat() error?
Why does this only happen on one of three submit nodes?
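For reference, my understanding of FS authentication is that the client creates an object under /tmp and the daemon then lstat()s that path to check who owns it. Below is a minimal sketch of that kind of check, written under this assumption. It is not HTCondor's actual code; the function names and the "FS_XXX" prefix are hypothetical and only illustrative. It shows that lstat() fails whenever the path the client created is not visible to the process doing the check, regardless of the permissions on the parent directory.

```python
import os
import tempfile


def client_create_proof(base_dir):
    """Client side (hypothetical): create a uniquely named directory
    owned by the calling user and return its path."""
    return tempfile.mkdtemp(prefix="FS_XXX", dir=base_dir)


def server_verify_proof(path):
    """Server side (hypothetical): lstat the path and return the owner's
    uid, or None if the path cannot be seen from this process."""
    try:
        st = os.lstat(path)
    except OSError as err:
        print(f"Unable to lstat({path}): {err.strerror}")
        return None
    return st.st_uid


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        proof = client_create_proof(base)
        # Same file system: the check sees the directory and its owner.
        print(server_verify_proof(proof) == os.geteuid())
        # Path missing from the checker's point of view: lstat fails.
        os.rmdir(proof)
        print(server_verify_proof(proof))
```

If that model is right, the check can only succeed when the client and the daemon see the same /tmp.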
# condor_version
$CondorVersion: 9.0.7 Nov 03 2021 BuildID: Debian-9.0.7-1+deb10u0 PackageID: 9.0.7-1+deb10u0 Debian-9.0.7-1+deb10u0 $
$CondorPlatform: X86_64-Debian_10 $
Thanks,
Steffen
--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)my.domain
~~~