Hi Mary,
that could work, but it is probably not the most elegant solution. What you would rather do is alter the START expression and add a machine ClassAd condition such as 'StartJobs' that you can then make remotely controllable.
We also add a second condition: the output of a local script, triggered by the startd cron mechanism, that checks the health of the machine (uptime, disk space, NFS mounts, etc.).
The START expression would then look like this:
START = (StartJobs =?= True) && (NODE_IS_HEALTHY =?= true)
At install time you set StartJobs to false, so the machine will not start any jobs until someone sets 'StartJobs' to true. To do this remotely, you need to enable remote setting of the ClassAd attribute:
STARTD.SETTABLE_ATTRS_ADMINISTRATOR = StartJobs
STARTD_ATTRS = StartJobs
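will do the trick. You can then use
condor_config_val -startd -name <host> -rset "StartJobs = true"
condor_reconfig -startd <host>
to enable the host to run jobs, and of course you can set it back to false the same way. (-rset changes the running configuration only and requires ENABLE_RUNTIME_CONFIG = True on the node; use -set, which requires ENABLE_PERSISTENT_CONFIG = True and a PERSISTENT_CONFIG_DIR, if the change should survive a restart.)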
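To flip the switch on several machines at once, the two commands can be wrapped in a small loop. The following is only a sketch: the helper names run and set_start_jobs and the host names are hypothetical, it assumes the runtime (-rset) form of condor_config_val is allowed on the nodes, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: set StartJobs on a list of hosts and reconfigure their startds.
# Assumptions: -rset is permitted (ENABLE_RUNTIME_CONFIG = True on each node)
# and StartJobs is listed in STARTD.SETTABLE_ATTRS_ADMINISTRATOR.
# DRY_RUN=1 (the default) only prints the commands; the printed form drops
# the quotes around the "StartJobs = ..." argument.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: set_start_jobs true|false host [host ...]
set_start_jobs() {
    value=$1
    shift
    for host in "$@"; do
        run condor_config_val -startd -name "$host" -rset "StartJobs = $value"
        run condor_reconfig -startd "$host"
    done
}

# Hypothetical host names, for illustration only.
set_start_jobs true wn01.example.org wn02.example.org
```

Run it with DRY_RUN=0 once the printed commands look right.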
In our case we run a small script periodically that checks a few minor things and outputs either
NODE_IS_HEALTHY = true
or
NODE_IS_HEALTHY = false
In the second case the node will not start any jobs either, as it considers itself unfit.
Here is the configuration you need on the worker node, in addition to the START expression above:
STARTD_CRON_JOBLIST = NODEHEALTH
STARTD_CRON_NODEHEALTH_EXECUTABLE = /etc/condor/tests/healthcheck_wn_condor.sh
STARTD_CRON_NODEHEALTH_PERIOD = 180s
STARTD_CRON_NODEHEALTH_MODE = Periodic
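For illustration, a health check of this kind could look roughly like the sketch below. This is not our actual script: the thresholds, the checked path, the mount list and all function names are assumptions, and it relies on Linux /proc. The only thing the startd cares about is the single "NODE_IS_HEALTHY = ..." line printed on stdout.

```shell
#!/bin/sh
# Sketch of a startd cron health check; NOT the script used at DESY.
# All thresholds, paths and function names are illustrative assumptions.

MIN_UPTIME_SECONDS=300      # assumed: let the node settle after boot
MAX_DISK_USED_PERCENT=95    # assumed: refuse jobs when the disk is nearly full
CHECK_PATH=/                # assumed: file system to check
REQUIRED_MOUNTS=""          # assumed: space-separated mount points, e.g. "/nfs/data"

# Succeeds if the machine has been up for at least $1 seconds (Linux /proc).
uptime_ok() {
    up=$(cut -d. -f1 /proc/uptime 2>/dev/null)
    [ -n "$up" ] && [ "$up" -ge "$1" ]
}

# Succeeds if the file system holding $1 is at most $2 percent full.
disk_ok() {
    used=$(df -P "$1" 2>/dev/null | awk 'NR==2 { sub("%", "", $5); print $5 }')
    [ -n "$used" ] && [ "$used" -le "$2" ]
}

# Succeeds if $1 appears as a mount point in /proc/mounts.
mount_ok() {
    grep -qs " $1 " /proc/mounts
}

# Prints the attribute line for the machine ClassAd; $1 = number of failed checks.
health_line() {
    if [ "$1" -eq 0 ]; then
        echo "NODE_IS_HEALTHY = true"
    else
        echo "NODE_IS_HEALTHY = false"
    fi
}

failed=0
uptime_ok "$MIN_UPTIME_SECONDS" || failed=$((failed + 1))
disk_ok "$CHECK_PATH" "$MAX_DISK_USED_PERCENT" || failed=$((failed + 1))
for m in $REQUIRED_MOUNTS; do
    mount_ok "$m" || failed=$((failed + 1))
done

health_line "$failed"
```

Because the START expression uses =?=, a node whose script has not yet run (NODE_IS_HEALTHY undefined) will not start jobs either.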
Hope this helps
Best
Christoph
--
Christoph Beyer
DESY Hamburg
IT-Department
Notkestr. 85
Building 02b, Room 009
22607 Hamburg
phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx
From: "Mary Romelfanger" <mary@xxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Monday, 6 May 2019 21:07:26
Subject: Re: [HTCondor-users] start just master at boot time
I think I found the knob.
Everything is in the manual; the trick is finding it.
START_DAEMONS = False
Is this what I am looking for?
Mary
From: Mary Romelfanger <mary@xxxxxxxxx>
Date: Monday, May 6, 2019 at 2:49 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: start just master at boot time
Hi Everyone,
Is there a way to start JUST the master daemon at boot time, without it then starting up the rest of the processes in DAEMON_LIST?
We have an operational HTCondor pool that the operators want to have full control of, so we do NOT want it to start processing at boot time. We have to check and make sure some other processes are running
(and maybe do any cleanup from any problems) before we start data processing in this HTCondor pool.
We have been just not starting at boot time, and then doing systemd full starts on each machine when ready, but this pool is about to grow and we would like the operators to be able to use the condor_on command
to get the startd processes running across the entire pool with one command when ready, but this requires that the master daemon is already running on each machine. How do we get the master daemon up without it starting the rest of the DAEMON_LIST processes
immediately?
$CondorVersion: 8.8.2 Apr 11 2019 BuildID: 465890 PackageID: 8.8.2-1 $
$CondorPlatform: x86_64_RedHat7 $
Mary
Mary Romelfanger
Deputy Branch Manager
Data Systems Branch
.___.
{o,o} Phone 410-338-6708
/)__) Cell 443-244-0191
-"-"- mary@xxxxxxxxx
Space Telescope Science Institute
3700 San Martin Drive
Baltimore, MD 21218