dev@shoal.java.net

about cluster view ID in ViewWindow

From: Bongjae Chang <carryel_at_korea.com>
Date: Tue, 17 Jun 2008 13:33:00 +0900

Hi.
While testing Shoal's Join and JoinedAndReady events, I occasionally saw some curious results.
When ViewWindow received ClusterViews, they did not arrive in sequence: their view IDs were out of order.

I could confirm this issue from the following log. Please look only at each ClusterView's ID.
The "seq" value is ClusterView.getClusterViewId().
--------------------
...
2008. 6. 17 AM 10:54:11 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=0 (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503

2008. 6. 17 AM 10:54:11 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : MASTER_CHANGE_EVENT
2008. 6. 17 AM 10:54:16 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=1 (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503

2008. 6. 17 AM 10:54:16 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : MASTER_CHANGE_EVENT

2008. 6. 17 AM 10:54:19 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=4 (aa7bffab-719d-4312-bf3f-d6cce8d516bc) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

2008. 6. 17 AM 10:54:19 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
JMS processJoinedAndReady:5ad203f8-c469-4239-86a2-da9b026677d6
***JoinNotification received: state = READY,GroupLeader = true, Signal.getMemberToken() = 5ad203f8-c469-4239-86a2-da9b026677d6, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6
JMS processJoinedAndReady:aa7bffab-719d-4312-bf3f-d6cce8d516bc
JMS processJoinedAndReady:e9fa5e76-9518-4149-b415-b46a7f16111a
2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=6 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=3 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
2: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=11 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
***JoinNotification received: state = ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() = aa7bffab-719d-4312-bf3f-d6cce8d516bc, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6
***JoinNotification received: state = ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() = e9fa5e76-9518-4149-b415-b46a7f16111a, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6
2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=13 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : JOINED_AND_READY_EVENT
2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group JEUS,seq=14 (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for (before change analysis) are :
1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : JOINED_AND_READY_EVENT
***JoinNotification received: state = ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() = aa7bffab-719d-4312-bf3f-d6cce8d516bc, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6

--------------------

As you know, the cluster view manager calls getLocalView(), and each call to getLocalView() increments the ClusterView's ID before the cluster view manager notifies listeners.
So this log means that, under concurrency, ViewWindow can receive an older view after a newer one. I can understand this, because the cluster view manager can notify listeners from separate threads, e.g. the MasterNode and HealthMonitor threads.
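
The interleaving described above can be forced deterministically in a small sketch. This is not Shoal code: ViewRaceDemo, nextViewId() and notifyListener() are hypothetical stand-ins for "increment the ID in getLocalView()" and "notify ViewWindow", and the two threads only play the roles of the MasterNode and HealthMonitor threads.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the view manager; not Shoal's actual code. */
class ViewRaceDemo {
    private final AtomicLong viewId = new AtomicLong();
    final List<Long> delivered = new CopyOnWriteArrayList<>();

    /** Step 1: take the next view ID. */
    long nextViewId() { return viewId.incrementAndGet(); }

    /** Step 2: hand the view to the listener, in whatever order callers arrive. */
    void notifyListener(long id) { delivered.add(id); }

    /** Forces the bad interleaving: the slow thread takes ID 1 but notifies last. */
    static List<Long> run() {
        ViewRaceDemo mgr = new ViewRaceDemo();
        CountDownLatch slowHasId = new CountDownLatch(1);
        CountDownLatch fastNotified = new CountDownLatch(1);

        Thread slow = new Thread(() -> {
            long id = mgr.nextViewId();            // takes ID 1
            slowHasId.countDown();
            try {
                fastNotified.await();              // stalled between the two steps
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            mgr.notifyListener(id);                // delivers 1 after 2
        });
        Thread fast = new Thread(() -> {
            try {
                slowHasId.await();                 // make sure ID 1 is already taken
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            mgr.notifyListener(mgr.nextViewId());  // takes ID 2, delivers at once
            fastNotified.countDown();
        });

        slow.start();
        fast.start();
        try {
            slow.join();
            fast.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mgr.delivered;                      // [2, 1]
    }
}
```

Because taking the ID and notifying the listener are two separate steps, the listener sees ID 2 before ID 1 even though the IDs themselves were handed out in order.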

But the newViewObserved() and getMemberTokens() methods in ViewWindow always update the cluster view without checking its ID.
If the cluster view is always updated, unexpected results can occur. For example, in the log above, a join notification can be duplicated, because we sometimes use the view history when notifying joins.
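
As an illustration of what "checking the ID" could look like, here is a minimal sketch. ViewIdGuard and tryAccept() are hypothetical names and do not exist in ViewWindow; the sketch only assumes that the view ID is a monotonically increasing number, as returned by ClusterView.getClusterViewId().

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical guard; not part of Shoal's ViewWindow. */
final class ViewIdGuard {
    // ID of the newest view accepted so far; -1 means none yet.
    private final AtomicLong latestAccepted = new AtomicLong(-1);

    /** Returns true only if viewId is newer than every ID accepted so far. */
    boolean tryAccept(long viewId) {
        while (true) {
            long current = latestAccepted.get();
            if (viewId <= current) {
                return false;                      // stale or duplicate view
            }
            if (latestAccepted.compareAndSet(current, viewId)) {
                return true;                       // this view is now the newest
            }
            // Another thread advanced the ID concurrently; re-check.
        }
    }

    long latest() { return latestAccepted.get(); }
}
```

Replaying the seq values from the log above (0, 1, 4, 6, 3, 11, 13, 14) through tryAccept() would reject only seq=3, the view that arrived after seq=6. What to do with a rejected view is the open question below.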

When I reviewed ViewWindow to write a patch for this problem, I could not decide how to handle an old view.
case 1) The old view should not be inserted into the view history, but its signals should still be queued if necessary.
    - But in this case, when the user receives those signals, the current core members and all current members they report do not match the current view.
case 2) The old view could be inserted into the view history. But then the join-notification algorithm and the generation of failure recovery signals must take into account that the view history can be newer than the packet currently being processed.
    - But in this case, I think it is strange for the view history to be out of order.
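
One possible variant of case 2, sketched under assumptions: keep the history keyed by view ID rather than by arrival order, so a late view is stored in its proper position and never becomes the "current" view. OrderedViewHistory and its methods are hypothetical, members are reduced to plain strings, and this is not how ViewWindow stores its history.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical ID-ordered view history; not Shoal's implementation. */
final class OrderedViewHistory {
    // Sorted by view ID, so arrival order does not affect history order.
    private final TreeMap<Long, List<String>> byId = new TreeMap<>();

    /** Stores the view; returns true if it is now the newest view in the history. */
    synchronized boolean add(long viewId, List<String> members) {
        byId.put(viewId, new ArrayList<>(members));
        return byId.lastKey() == viewId;
    }

    /** Members of the view with the highest ID, or an empty list if none. */
    synchronized List<String> current() {
        Map.Entry<Long, List<String>> e = byId.lastEntry();
        return e == null ? Collections.<String>emptyList() : new ArrayList<>(e.getValue());
    }

    /** Members of the closest older view, for diffing joins and failures. */
    synchronized List<String> previousOf(long viewId) {
        Map.Entry<Long, List<String>> e = byId.lowerEntry(viewId);
        return e == null ? Collections.<String>emptyList() : new ArrayList<>(e.getValue());
    }

    /** View IDs in ascending order. */
    synchronized List<Long> ids() {
        return new ArrayList<>(byId.keySet());
    }
}
```

With this layout, a caller that gets false from add() knows the view is old and can decide separately whether its signals should still be queued. A late view also changes what previousOf() returns for newer IDs, so any join notification already computed from the earlier history might still need to be reconciled.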

I would like to hear your opinions and any possible solutions.
Please advise.

Thanks.

--
Bongjae Chang