Re: [Shoal-Dev] About sailfin issue #484

From: Joseph Fialli <Joseph.Fialli_at_Sun.COM>
Date: Mon, 01 Jun 2009 17:03:08 -0400

Bongjae,

See my comments inline below.

Bongjae Chang wrote:
> Hi,
> I have a question about sailfin issue #484 relating to
> MasterNode#processMasterNodeQuery()'s changes.
> I tried to test the master's failure.
> This test is like sailfin issue #484.
> i.e., the master dies and comes back up quickly.
> It seems that the policy and behavior about the failed master has been
> changed from sailfin issue #484.
> The changes select a new master, and the join notification about the
> old master is delivered only at the new master.
> This result was not what I expected, because the other members never
> saw the old master in a failure state.
Please see the following glassfish issue concerning fast restart of a
failed instance.
https://glassfish.dev.java.net/issues/show_bug.cgi?id=8308

To summarize, GMS heartbeat failure detection (7.5 seconds by default in
GlassFish) cannot detect and report a FAILURE event when the GlassFish
NodeAgent automatically restarts an instance in less than 7.5 seconds.
The instance has truly failed, regardless of whether a GMS FAILURE event
reports it.
It is not possible to send out a GMS FAILURE event once the instance has
already restarted.
GlassFish issue 8308 discusses this in detail, along with the ability to
augment GMS failure detection when an external agent restarts failed
instances faster than GMS heartbeat detection can notice.

The restarted instance is missing all the state that the previous Master
instance held. The bug in sailfin 484 was that this failure went
undetected.
The change was a bug fix, not a policy change.

Here is how GMS failure detection works at a high level.
- The MasterNode monitors all other instance heartbeats in a cluster for
failure.
- All other instances in the cluster monitor the MasterNode heartbeats
to check if it failed.
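
As a rough illustration of the monitoring described above, here is a
minimal Java sketch of timeout-based heartbeat detection. The class
HeartbeatMonitor and its methods are hypothetical names invented for this
example, not Shoal's actual classes; only the 7.5 second default comes
from the discussion above. It shows why an instance that is restarted
inside the timeout window is never reported as failed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical timeout-based heartbeat monitor (not Shoal's real API). */
final class HeartbeatMonitor {
    private final long timeoutMillis;
    // Last time (in ms) a heartbeat was seen from each instance.
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a heartbeat from the given instance. */
    void onHeartbeat(String instance, long nowMillis) {
        lastSeen.put(instance, nowMillis);
    }

    /** An instance counts as failed only after a silence longer than the timeout. */
    boolean isFailed(String instance, long nowMillis) {
        Long seen = lastSeen.get(instance);
        return seen != null && nowMillis - seen > timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(7500);
        monitor.onHeartbeat("A", 0);
        // A is killed shortly after t=0 and restarted by an external agent;
        // its next heartbeat arrives at t=4000, inside the 7.5 s window.
        monitor.onHeartbeat("A", 4000);
        System.out.println(monitor.isFailed("A", 7000));  // false: restart went unnoticed
        System.out.println(monitor.isFailed("A", 12000)); // true: 8 s of silence
    }
}
```

The monitor only sees gaps between heartbeats, so a crash followed by a
restart within the window looks the same as an instance that never died.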

Once the MasterNode is killed and comes back up quickly, ALL other
instances in the cluster
(not just the master node) will see a MasterNodeQuery. ALL OTHER
INSTANCES recognize that the
former master node has restarted and that a new Master must be
calculated from the surviving cluster instances, since the
newly restarted former master is missing all state
(which instances make up the cluster).
Only the surviving instances of the cluster have been keeping that
information and are qualified to be the new Master.
Because every instance applies the same algorithm to its list of the
instances making up the cluster, all instances agree on which one
becomes the new Master.
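
To make the agreement property concrete, here is a small Java sketch of
a deterministic election. The selection rule used here (the lowest
instance name in sort order) is only an assumption for illustration, and
MasterElectionSketch is a hypothetical class; the rule actually applied
by Shoal may differ. The point is that any rule that depends only on the
shared member list gives the same answer on every survivor.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Optional;
import java.util.TreeSet;

/** Hypothetical deterministic master election (not Shoal's real algorithm). */
final class MasterElectionSketch {

    /**
     * Picks the new master from the cluster view, excluding the restarted
     * instance because it has lost its state. Assumed rule: lowest name wins.
     */
    static Optional<String> electMaster(Collection<String> view, String restarted) {
        TreeSet<String> candidates = new TreeSet<>(view); // sorted, duplicates removed
        candidates.remove(restarted);
        return candidates.isEmpty() ? Optional.empty() : Optional.of(candidates.first());
    }

    public static void main(String[] args) {
        // B and C hold the same members, possibly in a different order.
        Optional<String> seenByB = electMaster(Arrays.asList("A", "B", "C"), "A");
        Optional<String> seenByC = electMaster(Arrays.asList("C", "A", "B"), "A");
        System.out.println(seenByB.equals(seenByC)); // true: both pick the same master
    }
}
```

No messages need to be exchanged to reach agreement in this sketch; each
survivor computes the result locally from the view it already holds.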

Only the newly elected Master sends out the join notification of the
restarted old Master instance. That was the fix that
was checked in for sailfin 484. All other instances of the cluster will
receive this join notification.
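
The rule that only the newly elected Master announces the rejoin can be
sketched as follows. JoinNotificationSketch, shouldSendJoin and
joinSenders are hypothetical names, and the lowest-name election rule is
again an assumption made for the example rather than Shoal's actual
implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: only the newly elected master announces the rejoin. */
final class JoinNotificationSketch {

    /** Assumed election rule: lowest surviving name wins; null if nobody survives. */
    static String electMaster(Collection<String> view, String restarted) {
        TreeSet<String> candidates = new TreeSet<>(view);
        candidates.remove(restarted);
        return candidates.isEmpty() ? null : candidates.first();
    }

    /** Each survivor decides locally whether it is the one that sends the join. */
    static boolean shouldSendJoin(String self, Collection<String> view, String restarted) {
        return self.equals(electMaster(view, restarted));
    }

    /** Simulates every survivor making that decision and collects the senders. */
    static List<String> joinSenders(Collection<String> view, String restarted) {
        List<String> senders = new ArrayList<>();
        for (String self : view) {
            if (self.equals(restarted)) {
                continue; // the restarted instance has no cluster view yet
            }
            if (shouldSendJoin(self, view, restarted)) {
                senders.add(self);
            }
        }
        return senders;
    }

    public static void main(String[] args) {
        // A was the master and restarted quickly; B and C survived.
        System.out.println(joinSenders(Arrays.asList("A", "B", "C"), "A")); // prints [B]
    }
}
```

In this sketch exactly one instance sends the notification, so the
remaining members each receive a single join event for the restarted
instance.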

I hope this explains the motivation behind the fix for sailfin 484.
It was not intended to be a policy change.

-Joe

> I thought that the old master should keep the master's role if it
> came back up quickly, before the others were aware of its failure.
> And with the changes, the old master's join notification is delivered
> only at the new master.
> Assume that A, B and C are members and A is the master.
> When A dies and comes back quickly, B becomes the new master and B
> receives A's join notification. Maybe C doesn't receive A's join
> notification because A is neither a failed member nor an in-doubt
> member. I think that C's behavior is right.
> Assume again that A, B and C are members and A is the master.
> When B dies and comes back quickly, neither A nor C receives a join
> notification because B is neither an in-doubt member nor a failed
> member. I think that this behavior is also right.
> When the old master dies and rejoins the group quickly, the old master
> presumably tries to discover the group's master. But the group has no
> master, because the old master itself was the group master. Then
> the old master which rejoins the group will wait for the discovery time.
> During the discovery time, the members may not receive the group's
> events properly.
> So is the new master selected, instead of the old master, in order to
> save the discovery time?
> And should we give the old master's join notification special
> treatment when the old master dies and comes back?
> What do you think?
> Thanks!
> --
> Bongjae Cha