dev@shoal.java.net

Re: [Shoal-Dev] About sailfin issue #484

From: Bongjae Chang <carryel_at_korea.com>
Date: Fri, 5 Jun 2009 09:55:37 +0900

Hi Joe,

I understand now.

Thank you very much!

--
Bongjae Chang


----- Original Message -----
From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
To: <dev_at_shoal.dev.java.net>
Sent: Friday, June 05, 2009 4:40 AM
Subject: Re: [Shoal-Dev] About sailfin issue #484


> Bongjae Chang wrote:
>> Hi Joe,
>>
>>
>>
> <deleted resolved issue>
>> I agree. So now I understand why the changes give the master special treatment. I can also see why WATCH_DOG was needed after reading glassfish issue #8308. :-)
>>
>> About the glassfish issue #8308, I have a question.
>>
>> If the server that uses Shoal doesn't have a node agent that supports WATCH_DOG, the FAILURE event could be lost in this case, couldn't it?
>>
> The FAILURE event is never sent because the instance is restarted in a
> shorter period of time than GMS heartbeat failure detection can detect
> the failure.
>> Then the new master ought to send the notification of the old master's FAILURE. Is that right?
>>
> Here is the dilemma.
>
> If an instance fails and restarts, the Shoal system detects the
> RESTARTED instance independently of WATCHDOG.
> At the time one detects that an instance has RESTARTED, it seems
> confusing to announce that
> a previous instantiation of the instance had failed in the past.
>
> Once an instance has restarted, one could easily confuse the FAILURE
> event with the restarted instance.
> The Shoal internals differentiate between the two
> instantiations by consulting the
> start time of the instance. (Each heartbeat has a START TIME that
> records the time the instance joined the group.)
> However, that is not the typical Shoal protocol for users of the Shoal API.
> Thus it is better to miss the FAILURE event and just log that the
> FAILURE event was missed
> (as documented under glassfish issue 8308).
>
> Here is the sequence that is occurring.
>
> Instance A (started at time XX)
> Instance A (fails at time YY)
> Instance A (restarted at time ZZ)
>
>
> Timeline
> ---------+---------------------------+-------------------------------+--------
>          XX                          YY                              ZZ
>
>
> It is not ambiguous to send a FAILURE notification between times YY and ZZ.
> However, once one hits time ZZ, it is ambiguous whether the FAILURE
> applies to Instance A started at XX
> or Instance A started at ZZ. Also, what benefit is there in knowing that
> Instance A started at XX failed if Instance A
> has restarted at time ZZ? The above occurs any time the interval
> ZZ - YY is less than the amount
> of time GMS heartbeat failure detection needs to detect the failure of an
> instance. As of this writing, only the Glassfish
> NodeAgent is known to restart an instance of a cluster in a shorter period
> of time than the glassfish
> default GMS heartbeat failure detection time.
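
The timing condition above can be written down as a one-line check. This is only a minimal sketch: the class and method names are hypothetical and are not part of the Shoal API, the 7.5 second value is the Glassfish default quoted elsewhere in this thread, and the model is simplified (real detection counts missed heartbeats rather than measuring from the exact moment of failure).

```java
// Hypothetical helper for illustration only; not part of the Shoal API.
final class FastRestartCheck {
    // Default GMS heartbeat failure detection time in Glassfish, as quoted in this thread.
    static final long DEFAULT_DETECTION_MILLIS = 7_500L;

    /**
     * Returns true when the instance was restarted (time ZZ) sooner after its
     * failure (time YY) than heartbeat failure detection needs, in which case
     * no FAILURE event is sent.
     */
    static boolean failureEventMissed(long failedAtMillis, long restartedAtMillis,
                                      long detectionMillis) {
        long downtimeMillis = restartedAtMillis - failedAtMillis; // ZZ - YY
        return downtimeMillis < detectionMillis;
    }

    public static void main(String[] args) {
        long yy = 100_000L; // arbitrary failure time
        // Restart after 3 seconds: faster than detection, FAILURE is missed.
        System.out.println(failureEventMissed(yy, yy + 3_000L, DEFAULT_DETECTION_MILLIS));  // true
        // Restart after 20 seconds: detection had time to report FAILURE.
        System.out.println(failureEventMissed(yy, yy + 20_000L, DEFAULT_DETECTION_MILLIS)); // false
    }
}
```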
>
> Here are the log messages reporting that the FAILURE event was never
> sent due to fast restart.
> These log messages are recorded in the MasterNode (typically the DAS) when it
> receives a GMS heartbeat STARTING from Instance
> A started at time ZZ and the system realizes the heartbeat is from a
> different Instance A than the
> last recorded one (that had started at time XX).
>
>
> [#|...|WARNING|sun-glassfish-comms-server1.5|ShoalLogger|...;
> Instance n2c1m4 was restarted at 4:13:19 PM PST on Feb 4, 2009.|#]
>
> [#|...|WARNING|sun-glassfish-comms-server1.5|ShoalLogger|...;
> Note that there was no Failure notification sent out for this instance that was
> previously started at 4:11:31 PM PST on Feb 4, 2009|#]
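
The start-time comparison behind these warnings could look roughly like the sketch below. It assumes a simple map from instance name to the last recorded START TIME; `RestartDetector`, `Observation`, and the method names are hypothetical and do not reflect the actual MasterNode code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the real Shoal MasterNode logic is not shown in this thread.
final class RestartDetector {
    enum Observation { FIRST_SEEN, SAME_INSTANTIATION, RESTARTED_WITHOUT_FAILURE }

    // Last START TIME recorded for each instance name.
    private final Map<String, Long> lastStartTime = new HashMap<>();

    /** Called for each heartbeat; every heartbeat carries the sender's START TIME. */
    Observation onHeartbeat(String instanceName, long startTimeMillis) {
        Long previous = lastStartTime.put(instanceName, startTimeMillis);
        if (previous == null) {
            return Observation.FIRST_SEEN;
        }
        if (previous == startTimeMillis) {
            return Observation.SAME_INSTANTIATION;
        }
        // Same name, different start time, and no FAILURE was processed in
        // between: this is the case that is only logged as a warning.
        return Observation.RESTARTED_WITHOUT_FAILURE;
    }

    /** Called when a FAILURE was detected in time; forgets the old instantiation. */
    void onFailure(String instanceName) {
        lastStartTime.remove(instanceName);
    }

    public static void main(String[] args) {
        RestartDetector detector = new RestartDetector();
        System.out.println(detector.onHeartbeat("n2c1m4", 1_000L)); // FIRST_SEEN
        System.out.println(detector.onHeartbeat("n2c1m4", 1_000L)); // SAME_INSTANTIATION
        System.out.println(detector.onHeartbeat("n2c1m4", 9_000L)); // RESTARTED_WITHOUT_FAILURE
    }
}
```

If `onFailure` had been called between the second and third heartbeat, the third one would be reported as `FIRST_SEEN`, that is, an ordinary join after a properly reported failure.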
>
> Complete description at
> https://glassfish.dev.java.net/issues/show_bug.cgi?id=8308
>
> Hope this explains why a late FAILURE notification is only logged
> rather than sent, and why only the
> WATCHDOG capability is able to send a timely FAILURE event for an
> instance that dies
> and is quickly restarted by an external agent (in glassfish's case, the
> NodeAgent).
>
> -Joe
>
>> Thanks in advance.
>>
>> PS) Did you attend JavaOne with Shreedhar? Unfortunately, I couldn't attend this year, but I hope to attend the next JavaOne and meet you and many Shoal users and devs next year!
>>
>> --
>> Bongjae Chang
>>
>>
>> ----- Original Message -----
>> From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
>> To: <dev_at_shoal.dev.java.net>
>> Sent: Tuesday, June 02, 2009 6:03 AM
>> Subject: Re: [Shoal-Dev] About sailfin issue #484
>>
>>
>>
>>> Bongjae,
>>>
>>> See my comments inline below.
>>>
>>>
>>> Bongjae Chang wrote:
>>>
>>>> Hi,
>>>> I have a question about sailfin issue #484 relating to the changes in
>>>> MasterNode#processMasterNodeQuery().
>>>> I tried to test the master's failure.
>>>> This test is like sailfin issue #484,
>>>> i.e. the master dies and comes back up quickly.
>>>> It seems that the policy and behavior regarding the failed master have
>>>> changed since sailfin issue #484.
>>>> The changes select a new master and deliver a join notification about
>>>> the old master only on the new master.
>>>> This result was not what I expected, because the other members never
>>>> saw the old master in a failure state.
>>>>
>>> Please see the following glassfish issue concerning fast restart of a
>>> failed instance.
>>> https://glassfish.dev.java.net/issues/show_bug.cgi?id=8308
>>>
>>> To summarize, GMS heartbeat detection (default of 7.5 seconds in
>>> Glassfish) is not able to detect
>>> and report a FAILURE event when the glassfish NodeAgent automatically
>>> restarts an instance in less than
>>> 7.5 seconds. The instance has truly failed regardless of whether it is reported
>>> by a GMS failure event.
>>> It is not possible to send out a GMS FAILURE event once the instance has
>>> already restarted.
>>> Glassfish issue 8308 discusses this in detail, along with the
>>> ability to augment GMS failure
>>> detection when an external agent is restarting failed instances faster
>>> than GMS heartbeat detection.
>>>
>>> The restarted instance is missing all the state that the previous Master
>>> instance had. The bug in sailfin 484 was that the failure went
>>> undetected.
>>> The change was not a policy change but a bug fix.
>>>
>>> Here is how GMS failure detection works at a high level.
>>> - The MasterNode monitors all other instance heartbeats in a cluster for
>>> failure.
>>> - All other instances in the cluster monitor the MasterNode heartbeats
>>> to check if it failed.
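
The two monitoring roles above can be summarized in a few lines. This is a sketch of the relationship only, with hypothetical names (`HeartbeatRoles`, `peersToWatch`); it is not how Shoal structures its health monitor.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of who watches whose heartbeats; not Shoal code.
final class HeartbeatRoles {
    /**
     * The master watches every other member; every non-master member
     * watches only the master.
     */
    static Set<String> peersToWatch(String self, String master, Collection<String> members) {
        if (self.equals(master)) {
            Set<String> others = new TreeSet<>(members);
            others.remove(self);
            return others;
        }
        return Collections.singleton(master);
    }

    public static void main(String[] args) {
        Collection<String> members = Arrays.asList("A", "B", "C");
        System.out.println(peersToWatch("A", "A", members)); // [B, C]
        System.out.println(peersToWatch("C", "A", members)); // [A]
    }
}
```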
>>>
>>> Once the MasterNode is killed and comes back up quickly, ALL other
>>> instances in the cluster
>>> (not just the master node) will see a MasterNodeQuery. ALL OTHER
>>> INSTANCES recognize that the
>>> former master node has restarted and that there is a need to recalculate
>>> who is the new Master from the surviving cluster instances, since the
>>> newly restarted former master is missing all state
>>> (which instances make up the cluster).
>>> Only the surviving instances of the cluster have been keeping that
>>> information and are qualified to be the new Master.
>>> Because all instances apply the same algorithm to their list of
>>> instances making up the cluster,
>>> all instances will agree on the new Master.
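
The point about agreement can be illustrated with any deterministic rule. The thread does not say which algorithm Shoal applies, so the sketch below assumes, purely for illustration, that the survivor with the lexicographically smallest name wins; `MasterElection` and `electNewMaster` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; the selection rule is an assumption, not Shoal's algorithm.
final class MasterElection {
    /**
     * Picks a new master from the surviving members. The restarted former
     * master is excluded because it has lost its view of the cluster.
     * Every survivor running this on the same member list gets the same answer.
     */
    static String electNewMaster(Collection<String> knownMembers, String restartedMaster) {
        List<String> candidates = new ArrayList<>(knownMembers);
        candidates.remove(restartedMaster);
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no surviving instance to elect");
        }
        return Collections.min(candidates); // assumed rule: smallest name wins
    }

    public static void main(String[] args) {
        // B and C hold the same member list (possibly in a different order)
        // and independently reach the same result.
        System.out.println(electNewMaster(Arrays.asList("A", "B", "C"), "A")); // B
        System.out.println(electNewMaster(Arrays.asList("C", "B", "A"), "A")); // B
    }
}
```

Because the rule depends only on the member list, no extra message exchange is needed for the survivors to agree, which matches the description above.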
>>>
>>> Only the newly elected Master sends out the join notification of the
>>> restarted old Master instance. That was the fix that
>>> was checked in for sailfin 484. All other instances of the cluster will
>>> receive this join notification.
>>>
>>> I hope this explains the motivation behind the fix for sailfin 484.
>>> It was not intended to be a policy change.
>>>
>>> -Joe
>>>
>>>
>>>> I thought that the old master should keep the master role if the old
>>>> master came back up quickly, before the others were aware of the old
>>>> master's failure.
>>>> And with the changes, the old master's join notification is delivered
>>>> only on the new master.
>>>> Assume that A, B and C are members and A is the master.
>>>> When A dies and comes back quickly, B becomes the new master and B
>>>> receives A's join notification. Maybe C doesn't receive A's join
>>>> notification because A is neither a failed member nor an in-doubt
>>>> member. I think that C's behavior is right.
>>>> Assume again that A, B and C are members and A is the master.
>>>> When B dies and comes back quickly, neither A nor C receives a join
>>>> notification because B is neither an in-doubt member nor a failed
>>>> member. I think that this behavior is also right.
>>>> When the old master dies and rejoins the group quickly, the old master
>>>> presumably tries to discover the group's master. But the group doesn't have a
>>>> master because the old master itself was the group master. Then
>>>> the old master that rejoins the group will wait for the discovery time.
>>>> During the discovery time, the members may not receive the group's
>>>> events properly.
>>>> So is a new master selected, instead of the old master, in order to
>>>> save the discovery time?
>>>> And should we give the old master's join notification special
>>>> treatment when the old master dies and comes back?
>>>> What do you think?
>>>> Thanks!
>>>> --
>>>> Bongjae Cha
>>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe_at_shoal.dev.java.net
>>> For additional commands, e-mail: dev-help_at_shoal.dev.java.net