Re: [Condor-users] Persistent offline machines ads when using a third party power management tool?
- Date: Wed, 1 Feb 2012 08:27:11 -0500
- From: Ian Chesal <ichesal@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] Persistent offline machines ads when using a third party power management tool?
Ah. Reading the ticket, I thought it had made it into 7.6.4. Thanks.
- Ian
On 2012-02-01, at 6:44 AM, Lukas Slebodnik <slebodnik@xxxxxxxx> wrote:
> Just one note: the patch from ticket
> https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2564
> is not included in any Condor release.
>
> Regards,
> Lukas
>
> On Tue, Jan 31, 2012 at 03:07:51PM -0500, Ian Chesal wrote:
>> Hi Lukas,
>>
>> Maybe. I'm not sure OFFLINE=True is ever being set. But I never considered that it could be getting set and then unset before hibernation kicked in.
>>
>> I'll try extending HIBERNATION_WAIT_INTERVAL to see if it helps the situation.
>>
>> Regards,
>> - Ian
>>
>>
>> ---
>> Ian Chesal
>>
>> Cycle Computing, LLC
>> Leader in Open Compute Solutions for Clouds, Servers, and Desktops
>> Enterprise Condor Support and Management Tools
>>
>> http://www.cyclecomputing.com
>> http://www.cyclecloud.com
>> http://twitter.com/cyclecomputing
>>
>>
>> On Tuesday, 31 January, 2012 at 2:56 PM, slebodnik wrote:
>>
>>> Are the symptoms similar to those in this ticket?
>>> https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2564
>>>
>>> Regards,
>>> Lukas
>>>
>>> On Tue, 31 Jan 2012 13:39:26 -0500, Ian Chesal
>>> <ichesal@xxxxxxxxxxxxxxxxxx (mailto:ichesal@xxxxxxxxxxxxxxxxxx)> wrote:
>>>> It seems like it should work, but in my 7.6.5 pool any machine that's
>>>> hibernated by third-party (external to Condor) power management
>>>> software fails to end up with an OFFLINE=True attribute in the machine
>>>> ad at the collector and subsequently disappears from my list of
>>>> machines, so it cannot be woken up by Rooster.
>>>>
>>>> Condor appears to know that it's being hibernated. I see the
>>>> following in the MasterLog for the machine when the third-party tool
>>>> starts the machine hibernation procedure:
>>>>
>>>> 01/26/12 09:17:13 PowerEventHander: Some driver/application is asking
>>>> if we can enter hibernation
>>>> 01/26/12 09:17:15 PowerEventHander: Machine entering hibernation
>>>>
>>>> Looking through the source, it doesn't appear that the event handler
>>>> for this event does anything. There's no sign that it's updating the
>>>> machine ad to let the collector know it's going offline. When the
>>>> collector-side OfflineCollectorPlugin runs, the ad is purged, not
>>>> offlined. If I set the offline attribute on the machine ad to true
>>>> before hibernating the machine by hand, everything works.
>>>> Unfortunately I don't seem to be able to run a script from the
>>>> hibernation tool that's in use, so I can't (at least not without
>>>> great difficulty) follow this approach in the third-party tool.
>>>>
>>>> Is it not possible to have this third party hibernation offline the
>>>> machine ad when the hibernate signal is trapped?
>>>>
>>>> (This is on Windows BTW…)
>>>>
>>>> Regards,
>>>> - Ian
>>>
>>>
>>>
>>
>>
>
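The workaround described in the original message (setting the offline attribute on the machine ad before the machine hibernates) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `mark_offline` is hypothetical, the ad is assumed to be available as plain "Attribute = value" text (for example, saved from `condor_status -l`), and the thread does not say how the attribute was actually set.

```python
import re


def mark_offline(ad_text: str) -> str:
    """Return a copy of a machine ad (as text) with Offline = True set.

    Hypothetical helper for illustration only. Assumes one
    "Attribute = value" pair per line. Attribute names are matched
    case-insensitively, so an existing OFFLINE/Offline line is replaced.
    """
    kept = [
        line
        for line in ad_text.splitlines()
        # Drop any existing Offline attribute; "OfflineFoo = ..." is not matched.
        if not re.match(r"\s*offline\s*=", line, re.IGNORECASE)
    ]
    kept.append("Offline = True")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = 'Name = "slot1@host"\nOffline = False\nState = "Unclaimed"\n'
    print(mark_offline(sample), end="")
```

The edited ad would then still need to be sent to the collector, for instance with `condor_advertise UPDATE_STARTD_AD` and the file holding the modified ad. That step is an assumption about a typical workflow and is not taken from this thread.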