Hi,
I have a question about the changes made to MasterNode#processMasterNodeQuery() for sailfin issue #484.
I tested a master failure scenario.
The test is like the one in sailfin issue #484,
i.e. the master dies and comes back up quickly.
It seems that the policy and behavior for a failed master have changed with the fix for sailfin issue #484.
With the changes, a new master is selected and the old master's join notification is delivered only on the new master.
This result was not what I expected, because the other members had not yet put the old master into a failure state.
I thought that the old master should keep the master role if it came back up quickly, before the others became aware of its failure.
Also, with the changes, the old master's join notification is delivered only on the new master.
Assume that A, B and C are members and A is the master.
When A dies and comes back quickly, B becomes the new master and B receives A's join notification. Maybe C doesn't receive A's join notification because, from C's point of view, A was neither a failed member nor an in-doubt member. I think that C's behavior is right.
Assume that A, B and C are members and A is the master again.
When B dies and comes back quickly, neither A nor C receives a join notification, because B was neither an in-doubt member nor a failed member. I think that this behavior is also right.
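To make the two scenarios above concrete, here is a small Java sketch of the behavior as I understand it from my test. It is not Shoal code: the class RejoinScenario and its methods are hypothetical names, and the rule that the new master is the first remaining member in sorted order is only my assumption for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of the observed behavior; not part of Shoal.
public class RejoinScenario {

    // Assumption: the new master is the first remaining member in sorted order.
    // Throws NoSuchElementException if the restarted member was the only member.
    static String selectNewMaster(List<String> members, String restarted) {
        TreeSet<String> remaining = new TreeSet<>(members);
        remaining.remove(restarted);
        return remaining.first();
    }

    // Returns the members that receive a join notification when 'restarted'
    // dies and comes back before anyone has marked it in-doubt or failed.
    static List<String> joinNotificationReceivers(List<String> members,
                                                  String master,
                                                  String restarted) {
        if (!restarted.equals(master)) {
            // A non-master restarted: nobody saw a failure, so nobody is notified.
            return Collections.emptyList();
        }
        // The master restarted: only the newly selected master is notified.
        List<String> receivers = new ArrayList<>();
        receivers.add(selectNewMaster(members, restarted));
        return receivers;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("A", "B", "C");
        // Master A restarts: only B (the new master) gets A's join notification.
        System.out.println(joinNotificationReceivers(members, "A", "A"));
        // Non-master B restarts: no member gets a join notification.
        System.out.println(joinNotificationReceivers(members, "A", "B"));
    }
}
```

The first call models the case where A is the master and restarts, the second the case where B restarts while A stays master.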
When the old master dies and rejoins the group quickly, it perhaps tries to discover the group's master. But the group has no master at that point, because the old master itself was the group's master. The rejoining old master will then wait for the discovery timeout. During that discovery time, the members may not receive the group's events properly.
So is a new master selected, instead of keeping the old master, in order to save that discovery time?
And should we give the old master's join notification special treatment when the old master dies and comes back?
What do you think?
Thanks!
--
Bongjae Chang