Subject: Re: [HTCondor-users] Shared port daemon fails to start after reboot on Dual Stack CentOS7 nodes
That is what I read, too. Although it also leads me to believe that the default behavior when you enable either network service is to also enable its wait-online.service. Either way, it should be checked. Chris: the "cut the xxxx!" section is to the point.
That said, the first line reads to me like it's trying to find the IPv6 address of the host the master is running on. You could/should consider adding an IPv6 entry to /etc/hosts. You probably have an IPv4 entry.
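A quick way to check whether /etc/hosts already has an IPv6 line for the node is sketched below. This is only an illustration: the function name `has_ipv6_hosts_entry` is made up for this example, and it simply treats any address containing a colon as IPv6.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the given hosts file maps an IPv6 address
# to the given host name. The function name is hypothetical.
has_ipv6_hosts_entry() {
    hosts_file="$1"
    host_name="$2"
    awk -v h="$host_name" '
        /^[ \t]*#/ { next }              # skip comment lines
        $1 ~ /:/ {                       # assumption: a colon means IPv6
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break     # stop at a trailing comment
                if ($i == h) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}
```

On a node you might run `has_ipv6_hosts_entry /etc/hosts "$(hostname -f)" || echo "no IPv6 entry"`. If nothing is found, the line to add has the usual form of address followed by names, using the node's real IPv6 address.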
Tom
On Wed, Jun 9, 2021 at 10:26 AM Brian Lin <blin@xxxxxxxxxxx> wrote:
It looks like network-online.target should be sufficient by itself
in After/Wants but requires the correct *-wait-online.service to be
enabled, based on the service that is used to manage the host's
network (i.e. NetworkManager, systemd-networkd, etc.).
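As a rough sketch of the above, a drop-in for condor.service could be written like this. The function name and the file name `10-network-online.conf` are arbitrary choices for illustration, not something shipped by the HTCondor packaging.

```shell
#!/bin/sh
# Sketch: write a drop-in that makes condor.service want, and start after,
# network-online.target. Function and file names are hypothetical.
write_network_online_dropin() {
    dropin_dir="$1"   # normally /etc/systemd/system/condor.service.d
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/10-network-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

As root this would be called as `write_network_online_dropin /etc/systemd/system/condor.service.d`, followed by `systemctl daemon-reload`. The target only helps if the matching wait service is enabled, e.g. `systemctl enable NetworkManager-wait-online.service` on a NetworkManager host, or `systemd-networkd-wait-online.service` with systemd-networkd.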
I don't know if it goes all the way back to CentOS 7. I
am also less familiar with CentOS generally and whether the
NetworkManager way works with this. This is what you'd do in
Debian.
On Wed, Jun 9, 2021 at 9:23 AM Brian Lin <blin@xxxxxxxxxxx> wrote:
If the culprit is DNS, you could try adding
nss-lookup.target [1] to the condor unit's "After" list
- Brian
[1] from the systemd.special man page:
    nss-lookup.target
        A target that should be used as synchronization point for all
        host/network name service lookups. Note that this is independent of
        UNIX user/group name lookups for which nss-user-lookup.target should
        be used. All services for which the availability of full host/network
        name resolution is essential should be ordered after this target, but
        not pull it in. systemd automatically adds dependencies of type
        After= for this target unit to all SysV init script service units
        with an LSB header referring to the "$named" facility.
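Going by the man page text, the ordering for the DNS case could be sketched as the drop-in below. The function name and the file name `20-nss-lookup.conf` are made up for illustration. Note that it uses only After= and no Wants=, since the excerpt says services should be ordered after the target but not pull it in.

```shell
#!/bin/sh
# Sketch: write a drop-in that orders condor.service after nss-lookup.target
# without pulling the target in. Function and file names are hypothetical.
write_nss_lookup_dropin() {
    dropin_dir="$1"   # normally /etc/systemd/system/condor.service.d
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/20-nss-lookup.conf" <<'EOF'
[Unit]
After=nss-lookup.target
EOF
}
```

After calling it with `/etc/systemd/system/condor.service.d` as root and running `systemctl daemon-reload`, `systemctl show -p After condor.service` should list nss-lookup.target.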
On 6/9/21 9:09 AM, Thomas Hartmann wrote:
Hi Chris,
you could probably list all your services with
 systemd-analyze plot > service_chain.svg
and check whether there is a unit after networking etc.
that you could attach to as a dependency for condor.service.
Unfortunately, on a quick check here, the condor unit
is already one of the last units to be started
(needing a lot of scrolling)
Cheers,
 Thomas
On 09/06/2021 11.24, Chris Brew - STFC UKRI wrote:
Thanks All,
I feared that might be the case. Before I go off and
add `ExecStartPre=/bin/sleep 30` to the condor
systemd service file (dirty), or add a systemd
`.timer` file to control the start-up (slightly less
dirty), does anyone have an idea of a late-stage
networking service I can add a dependency on
(cleaner)?
> If I run `sudo systemctl restart condor` after I can log into the node
> everything comes up cleanly so I'm wondering if the Master is coming up
> before something that it needs.
It's typical, despite our best efforts in the packaging, for
various bits and pieces of the network, particularly
including DNS/DHCP,
not to be up when the condor master starts.
- ToddM