Re: [HTCondor-users] does condor_off -peaceful -daemon startd node; works for vanilla jobs?
- Date: Thu, 18 Aug 2016 11:29:04 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] does condor_off -peaceful -daemon startd node; works for vanilla jobs?
As another data point, it also seemed to work for me running a
pre-release of HTCondor v8.5.7 on Scientific Linux 6.8.
Behold the simple test below; note the node went from Claimed/Busy to
Claimed/Retiring, which is expected. "Retiring" activity is defined in
the Manual (from https://is.gd/mi7mVk ):
Retiring
When an active claim is about to be preempted for any reason, it enters retirement,
while it waits for the current job to finish. The MaxJobRetirementTime expression determines
how long to wait (counting since the time the job started). Once the job finishes or the
retirement time expires, the Preempting state is entered.
Perhaps you have a MaxJobRetirementTime defined?
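To make the retirement rule quoted above concrete, here is a minimal sketch of the arithmetic it describes. It only illustrates the Manual text, not HTCondor's actual implementation; the function name `remaining_retirement` is hypothetical, and it assumes all times are plain seconds.

```python
def remaining_retirement(max_retirement_s, job_start_s, now_s):
    """Seconds left before the claim would leave Retiring for Preempting.

    Hypothetical helper: the allowance is counted from the time the job
    started, as the Manual excerpt says, not from the time of the shutdown.
    """
    elapsed = now_s - job_start_s
    return max(0, max_retirement_s - elapsed)


# A job that started 600 s ago, with a 3600 s retirement allowance
print(remaining_retirement(3600, 1000, 1600))  # 3000

# With an allowance of 0 the job has no time left at all
print(remaining_retirement(0, 1000, 1600))  # 0
```

Under this reading, a job that has already run longer than the allowance has no retirement time left, which is worth ruling out when jobs appear to be killed right away.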
regards,
Todd
[tannenba@localhost test]$ condor_status
Name             OpSys  Arch    State      Activity  LoadAv  Mem  ActvtyTime
slot1@localhost  LINUX  X86_64  Claimed    Busy      0.000   330  0+00:00:04
slot2@localhost  LINUX  X86_64  Unclaimed  Idle      0.000   330  0+00:00:05
slot3@localhost  LINUX  X86_64  Unclaimed  Idle      0.000   330  0+00:00:06

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill  Drain
X86_64/LINUX      3      0        1          2        0           0         0      0
       Total      3      0        1          2        0           0         0      0
[tannenba@localhost test]$ condor_off -peaceful -daemon startd
Sent "Set-Peaceful-Shutdown" command to local startd
Sent "Kill-Daemon-Peacefully" command to local master
[tannenba@localhost test]$ condor_status
Name             OpSys  Arch    State      Activity  LoadAv  Mem  ActvtyTime
slot1@localhost  LINUX  X86_64  Claimed    Retiring  0.000   330  0+00:00:03
slot2@localhost  LINUX  X86_64  Unclaimed  Idle      0.000   330  0+00:02:49
slot3@localhost  LINUX  X86_64  Unclaimed  Idle      0.000   330  0+00:00:06

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill  Drain
X86_64/LINUX      3      0        1          2        0           0         0      0
       Total      3      0        1          2        0           0         0      0
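
If you want to script the same check rather than eyeball it, a rough sketch follows. It assumes the default eight-column condor_status layout shown above and works on captured text only; the helper name `slots_by_activity` and the `SAMPLE` string are made up for illustration and are not part of HTCondor.

```python
def slots_by_activity(status_text, activity):
    """Return the slot names whose Activity column equals `activity`."""
    names = []
    for line in status_text.splitlines():
        fields = line.split()
        # Slot rows have 8 whitespace-separated fields and a name containing
        # '@'; this skips the header row and the totals table.
        if len(fields) == 8 and "@" in fields[0] and fields[4] == activity:
            names.append(fields[0])
    return names


# Text copied from the second condor_status run above
SAMPLE = """\
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
slot1@localhost LINUX X86_64 Claimed Retiring 0.000 330 0+00:00:03
slot2@localhost LINUX X86_64 Unclaimed Idle 0.000 330 0+00:02:49
slot3@localhost LINUX X86_64 Unclaimed Idle 0.000 330 0+00:00:06
"""

print(slots_by_activity(SAMPLE, "Retiring"))  # ['slot1@localhost']
```

A slot that shows up as Retiring after the condor_off is still running its job; an empty list right after the command would suggest the jobs were evicted instead.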
On 8/18/2016 11:11 AM, Bob Ball wrote:
> Just as a data point, I do, from our central manager machine,
> condor_off -peaceful -daemon startd -name $publicName
> and it runs just fine. All our jobs are vanilla. HTCondor is version
> 8.4.6 on Scientific Linux.
>
> bob
>
> On 8/18/2016 11:54 AM, Harald van Pee wrote:
>>
>> Hi,
>>
>> I want to take a node that is running jobs offline, but only after all
>> running jobs have finished. Of course, until then no new jobs should be
>> accepted on that node.
>>
>> I tried the command:
>>
>> condor_off -peaceful -daemon startd node
>>
>> and got the message:
>>
>> Sent "Set-Peaceful-Shutdown" command to startd node
>>
>> Sent "Kill-Daemon-Peacefully" command to master node
>>
>> On the node I see in StartLog:
>>
>> 08/18/16 17:20:49 Got SIGTERM. Performing graceful shutdown.
>> 08/18/16 17:20:49 shutdown graceful
>>
>> And indeed all jobs running in the vanilla universe (we have no others)
>> are killed directly and restarted from the beginning. This is what a
>> graceful shutdown will do with vanilla jobs. But I want to have a
>> peaceful shutdown.
>>
>> Is a peaceful shutdown not possible for vanilla jobs?
>>
>> Do I have to change the configuration? We use:
>>
>> PREEMPT = FALSE
>> PREEMPTION_REQUIREMENTS = False
>> KILL = FALSE
>> WANT_SUSPEND = false
>> WANT_VACATE = false
>>
>> Or can I just use a different command?
>>
>> We use condor 8.4.8 on debian 8 (AMD64).
>>
>> Thanks
>>
>> Harald
>>
>>
>>
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/
>
--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing Department of Computer Sciences
HTCondor Technical Lead 1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132 Madison, WI 53706-1685