dev@shoal.java.net

Re: [Shoal-Dev] about cluster view ID in ViewWindow

From: Shreedhar Ganapathy <Shreedhar.Ganapathy_at_Sun.COM>
Date: Mon, 16 Jun 2008 23:45:38 -0700

The view id should just be the sequence id that the master node used
when sending the view. Incrementing the view id locally, as is done
today, is not a good solution.
When the master changes, the view id will need to be reset to whatever
the new master sets.
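
As a rough sketch of that idea, assuming hypothetical names
(MasterViewIdSketch, onView) rather than the actual Shoal classes: the
receiver keeps the sequence id supplied by the master instead of counting
locally, and resets it when a different master sends a view. Ignoring a
view whose id is not newer than the last one accepted is an additional
assumption that has not been settled in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a view receiver such as ViewWindow; not the Shoal API. */
class MasterViewIdSketch {
    private String currentMaster;   // master that issued the last accepted view
    private long lastViewId = -1;   // last accepted master sequence id
    private final List<Long> accepted = new ArrayList<>();

    /** Returns true if the view was accepted, false if it was treated as stale. */
    synchronized boolean onView(String master, long masterSeqId) {
        if (!master.equals(currentMaster)) {
            // Master changed (or first view): reset to whatever the new master sets.
            currentMaster = master;
            lastViewId = masterSeqId;
            accepted.add(masterSeqId);
            return true;
        }
        if (masterSeqId <= lastViewId) {
            return false;           // assumption: views that are not newer are ignored
        }
        lastViewId = masterSeqId;
        accepted.add(masterSeqId);
        return true;
    }

    synchronized List<Long> acceptedIds() {
        return new ArrayList<>(accepted);
    }

    public static void main(String[] args) {
        MasterViewIdSketch window = new MasterViewIdSketch();
        window.onView("A", 3);
        window.onView("A", 6);
        window.onView("A", 4);      // stale: ignored
        window.onView("B", 0);      // new master: ids restart
        System.out.println(window.acceptedIds()); // prints [3, 6, 0]
    }
}
```

The sketch does not try to detect a delayed view from the previous master
arriving after the change; that would need extra information from the
master.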

Bongjae Chang wrote:
> I think another simple solution is to make notifyListeners()
> synchronized, if there is no performance issue.
> Like this:
> synchronized void notifyListeners() {
> ...
> }
> Thanks.
> --
> Bongjae Chang
>
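
For comparison, a minimal sketch of the synchronized variant suggested
above, using hypothetical names rather than the real ClusterViewManager.
It assumes the id is taken and delivered inside the same synchronized
method, so listeners see ids in increasing order. The id stays local,
though, so this would not make ids agree across members.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reduction of the notifier; not the real ClusterViewManager. */
class SynchronizedNotifySketch {
    interface Listener { void onView(long viewId); }

    private final List<Listener> listeners = new ArrayList<>();
    private long viewId = 0;

    synchronized void addListener(Listener l) { listeners.add(l); }

    // Taking the snapshot id and delivering it happen under one lock,
    // so no other thread can deliver a newer id in between.
    synchronized void notifyListeners() {
        long id = viewId++;
        for (Listener l : listeners) {
            l.onView(id);
        }
    }

    /** Runs several notifier threads and returns the ids in delivery order. */
    static List<Long> run(int threads, int perThread) {
        SynchronizedNotifySketch manager = new SynchronizedNotifySketch();
        List<Long> seen = new ArrayList<>();
        // The listener is only invoked while the manager's lock is held.
        manager.addListener(id -> seen.add(id));
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) manager.notifyListeners();
            });
            workers.add(w);
            w.start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Long> seen = run(4, 1000);
        for (int i = 0; i < seen.size(); i++) {
            if (seen.get(i) != i) throw new AssertionError("out of order at " + i);
        }
        System.out.println("delivered " + seen.size() + " views in order");
    }
}
```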
> ----- Original Message -----
> *From:* Bongjae Chang <mailto:carryel_at_korea.com>
> *To:* dev_at_shoal.dev.java.net <mailto:dev_at_shoal.dev.java.net>
> *Sent:* Tuesday, June 17, 2008 2:43 PM
> *Subject:* Re: [Shoal-Dev] about cluster view ID in ViewWindow
>
> Hi Shreedhar.
> Yes, this ID is generated locally when cluster view manager
> notifies listeners.
> In ClusterViewManager.java
> void notifyListeners( ClusterViewEvent event ){
> for(ClusterViewEventListener elem:cvListeners) {
> elem.clusterViewEvent(event, *getLocalView()*);
> }
> }
> public ClusterView getLocalView() {
> ...
> return new ClusterView(temp, *viewId++*);
> }
> I think that when ClusterViewManager.notifyListeners() is
> called concurrently in separate threads, ViewWindow can receive
> ClusterView.viewId values out of order.
> In other words, even though the cluster view manager's view is updated
> from the view sent by the master, this result can sometimes occur
> because the cluster view manager takes the snapshot of its own view
> locally.
> Thanks
> --
> Bongjae Chang
>
> ----- Original Message -----
> *From:* Shreedhar Ganapathy <mailto:Shreedhar.Ganapathy_at_Sun.COM>
> *To:* dev_at_shoal.dev.java.net <mailto:dev_at_shoal.dev.java.net>
> *Sent:* Tuesday, June 17, 2008 2:13 PM
> *Subject:* Re: [Shoal-Dev] about cluster view ID in ViewWindow
>
> I have a feeling this ID is spuriously generated locally
> instead of being sent as part of the master view. Could you
> check? It may not be a sequence problem.
>
> Bongjae Chang wrote:
>> Hi.
>> While I was testing Shoal's Join and Joined and Ready events, I
>> found some curious results once in a while.
>> When ViewWindow received ClusterViews, the ClusterViews were not
>> in sequential order.
>> I could confirm this issue from the following log. Please look
>> only at the ClusterView IDs.
>> The "seq" means ClusterView.getClusterViewId().
>> --------------------
>> ...
>> 2008. 6. 17 AM 10:54:11
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=0*
>> (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2008. 6. 17 AM 10:54:11
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : MASTER_CHANGE_EVENT
>> 2008. 6. 17 AM 10:54:16
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=1*
>> (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2008. 6. 17 AM 10:54:16
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : MASTER_CHANGE_EVENT
>> 2008. 6. 17 AM 10:54:19
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=4*
>> (aa7bffab-719d-4312-bf3f-d6cce8d516bc) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
>> 3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03
>> 2008. 6. 17 AM 10:54:19
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : ADD_EVENT
>> JMS processJoinedAndReady:5ad203f8-c469-4239-86a2-da9b026677d6
>> ***JoinNotification received: state = READY,GroupLeader =
>> true, Signal.getMemberToken() =
>> 5ad203f8-c469-4239-86a2-da9b026677d6, Leader =
>> 5ad203f8-c469-4239-86a2-da9b026677d6
>> JMS processJoinedAndReady:aa7bffab-719d-4312-bf3f-d6cce8d516bc
>> JMS processJoinedAndReady:e9fa5e76-9518-4149-b415-b46a7f16111a
>> *2008. 6. 17 AM 10:54:25*
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=6*
>> (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
>> 3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03
>> 2008. 6. 17 AM 10:54:25
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : ADD_EVENT
>> *2008. 6. 17 AM 10:54:25*
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=3*
>> (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03
>> *2008. 6. 17 AM 10:54:25*
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : ADD_EVENT
>> 2008. 6. 17 AM 10:54:25
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=11*
>> (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
>> 3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03
>> 2008. 6. 17 AM 10:54:25
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : ADD_EVENT
>> ****JoinNotification received: state =
>> ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() =
>> aa7bffab-719d-4312-bf3f-d6cce8d516bc, Leader =
>> 5ad203f8-c469-4239-86a2-da9b026677d6*
>> ***JoinNotification received: state =
>> ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() =
>> e9fa5e76-9518-4149-b415-b46a7f16111a, Leader =
>> 5ad203f8-c469-4239-86a2-da9b026677d6
>> 2008. 6. 17 AM 10:54:28
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=13*
>> (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
>> 3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03
>> 2008. 6. 17 AM 10:54:28
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : JOINED_AND_READY_EVENT
>> 2008. 6. 17 AM 10:54:28
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>> INFO: GMS View Change Received for group JEUS,*seq=14*
>> (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for
>> (before change analysis) are :
>> 1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
>> 2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
>> 3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a,
>> MemberType: CORE, Address:
>> urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03
>> 2008. 6. 17 AM 10:54:28
>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>> INFO: Analyzing new membership snapshot received as part of
>> event : JOINED_AND_READY_EVENT
>> ****JoinNotification received: state =
>> ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() =
>> aa7bffab-719d-4312-bf3f-d6cce8d516bc, Leader =
>> 5ad203f8-c469-4239-86a2-da9b026677d6
>> *
>> --------------------
>> As you know, the cluster view manager calls getLocalView(), and
>> each call to getLocalView() increments the ClusterView's ID
>> before the cluster view manager notifies listeners.
>> So this log means that ViewWindow can receive an old view for a
>> while in the concurrent case. I can understand this because the
>> cluster view manager can notify listeners from separate
>> threads, e.g. the MasterNode and HealthMonitor threads.
>> But the newViewObserved() and getMemberTokens() methods in
>> ViewWindow always update the cluster view without checking the ID.
>> If the cluster view is always updated, unexpected results can
>> occur. For example, in the log above, a join notification can be
>> duplicated because we sometimes use the view history when notifying
>> joins.
>> When I reviewed ViewWindow to patch this problem, I did not
>> know how to handle an old view.
>> case 1) An old view should not be inserted into the view history, but
>> its signals should be queued if necessary.
>> - But in this case, when the user receives the signals, the current
>> core members and all current members do not belong to the current view.
>> case 2) An old view could be inserted into the view history. But we
>> should then consider that the view history can be newer than the
>> current packet in the join notification algorithm and when generating
>> failure recovery signals.
>> - But in this case, I think it is strange that the view history
>> is not in sequential order.
>> I would like to hear your opinions and any possible solution.
>> Please advise me.
>> Thanks.
>> --
>> Bongjae Chang
>