dev@shoal.java.net

Re: [Shoal-Dev] about cluster view ID in ViewWindow

From: Bongjae Chang <carryel_at_korea.com>
Date: Tue, 17 Jun 2008 16:07:43 +0900

Thanks for quick reply.

OK. If I can't rely on ClusterView.viewId as in the original code, how about making notifyListeners() a synchronized method?
-------------------------------------------------------------------
getLocalView() --> local view's snapshot --> (hole) --> insert view queue
-------------------------------------------------------------------

As shown above, there is a hole between taking the local view snapshot and inserting the EventPacket into the view queue.
If this hole can be removed, ViewWindow does not need ClusterView.viewId for this issue.
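
A minimal sketch of closing that hole, assuming the snapshot and the queue insertion can share one lock. SketchViewManager, ViewSnapshot and ViewQueueDemo are hypothetical stand-ins, not the real Shoal classes, and the queue here is owned by the manager only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for ClusterView: an immutable member snapshot plus its id.
final class ViewSnapshot {
    final long viewId;
    final List<String> members;

    ViewSnapshot(long viewId, List<String> members) {
        this.viewId = viewId;
        this.members = members;
    }
}

// Hypothetical stand-in for ClusterViewManager; not the real Shoal class.
final class SketchViewManager {
    private final List<String> members = new ArrayList<>();
    private final BlockingQueue<ViewSnapshot> viewQueue = new LinkedBlockingQueue<>();
    private long viewId = 0;

    synchronized void addMember(String member) {
        members.add(member);
    }

    // Callers must hold the lock on this manager.
    private ViewSnapshot getLocalView() {
        return new ViewSnapshot(viewId++, new ArrayList<>(members));
    }

    // Snapshot and queue insertion share one lock, so no other thread can
    // take a newer snapshot and enqueue it ahead of this one.
    synchronized void notifyListeners() {
        viewQueue.add(getLocalView());
    }

    List<ViewSnapshot> queuedViews() {
        return new ArrayList<>(viewQueue);
    }
}

final class ViewQueueDemo {
    // Runs several notifier threads; returns true if ids were queued as 0, 1, 2, ...
    static boolean idsArriveInOrder(int threads, int eventsPerThread) {
        SketchViewManager manager = new SketchViewManager();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int workerId = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    manager.addMember("member-" + workerId + "-" + i);
                    manager.notifyListeners();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        long expectedId = 0;
        int previousSize = 0;
        for (ViewSnapshot view : manager.queuedViews()) {
            if (view.viewId != expectedId || view.members.size() < previousSize) {
                return false;
            }
            expectedId++;
            previousSize = view.members.size();
        }
        return expectedId == (long) threads * eventsPerThread;
    }

    public static void main(String[] args) {
        System.out.println("ids in order: " + idsArriveInOrder(4, 1000));
    }
}
```

The cost is that listeners run while the lock is held, so a slow listener would delay every other notifying thread.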

--
Bongjae Chang


  ----- Original Message -----
  From: Shreedhar Ganapathy
  To: dev_at_shoal.dev.java.net
  Sent: Tuesday, June 17, 2008 3:45 PM
  Subject: Re: [Shoal-Dev] about cluster view ID in ViewWindow


  The view id should just be the sequence id that the master node used when sending the view. Incrementing the view id locally, as is done now, is not a good solution.
  When the master changes, the view id will need to be reset to whatever the new master sets.
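
One way to read this suggestion, as a sketch only: the receiver adopts the sequence id carried by the master's view and resets it when a different master sends. MasterViewIdTracker and its parameters are hypothetical names, not part of Shoal. The sketch also assumes that a view from a different master always signals a real master change, which a late message from a former master would violate.

```java
// Hypothetical helper: the view id is dictated by the master, never incremented locally.
final class MasterViewIdTracker {
    private String currentMaster;      // master that sent the last accepted view
    private long currentViewId = -1;   // -1 means no view accepted yet

    // Returns true if the view is accepted as current, false if it is stale.
    synchronized boolean accept(String masterId, long masterSeqId) {
        if (!masterId.equals(currentMaster)) {
            // Master changed: reset to whatever the new master sets.
            currentMaster = masterId;
            currentViewId = masterSeqId;
            return true;
        }
        if (masterSeqId <= currentViewId) {
            return false; // older or duplicate view from the same master
        }
        currentViewId = masterSeqId;
        return true;
    }

    synchronized long currentViewId() {
        return currentViewId;
    }
}

final class MasterViewIdDemo {
    public static void main(String[] args) {
        MasterViewIdTracker tracker = new MasterViewIdTracker();
        System.out.println(tracker.accept("master-A", 5)); // first view from master-A
        System.out.println(tracker.accept("master-A", 4)); // older view from the same master
        System.out.println(tracker.accept("master-B", 0)); // new master resets the id
        System.out.println(tracker.currentViewId());
    }
}
```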

  Bongjae Chang wrote:
    I think another simple solution is to make notifyListeners() synchronized, if that causes no performance issue.

    Like this:
    synchronized void notifyListeners() {
        ...
    }

    Thanks.
    --
    Bongjae Chang


      ----- Original Message -----
      From: Bongjae Chang
      To: dev_at_shoal.dev.java.net
      Sent: Tuesday, June 17, 2008 2:43 PM
      Subject: Re: [Shoal-Dev] about cluster view ID in ViewWindow


      Hi Shreedhar.
      Yes, this ID is generated locally when cluster view manager notifies listeners.

      In ClusterViewManager.java

      void notifyListeners( ClusterViewEvent event ){
          for(ClusterViewEventListener elem:cvListeners) {
              elem.clusterViewEvent(event, getLocalView());
          }
      }

      public ClusterView getLocalView() {
          ...
          return new ClusterView(temp, viewId++);
      }

      I think that when ClusterViewManager.notifyListeners() is called concurrently from separate threads, ViewWindow can receive ClusterView.viewId values out of order.
      In other words, even though the cluster view manager's view is updated from the master view, the manager takes the snapshot of its own view locally,
      so this result can sometimes occur.

      Thanks

      --
      Bongjae Chang


        ----- Original Message -----
        From: Shreedhar Ganapathy
        To: dev_at_shoal.dev.java.net
        Sent: Tuesday, June 17, 2008 2:13 PM
        Subject: Re: [Shoal-Dev] about cluster view ID in ViewWindow


        I have a feeling this ID is spuriously generated locally instead of being sent as part of the master view. Could you check? It may not be a sequence problem.

        Bongjae Chang wrote:
          Hi.
          While I was testing Shoal's Join and Joined and Ready events, I found some curious results once in a while.
          When ViewWindow received ClusterViews, the ClusterViews did not arrive in order.

          I could confirm this issue from the following log. Please look only at the ClusterView ID.
          The "seq" means ClusterView.getClusterViewId().
          --------------------
          ...
          2008. 6. 17 AM 10:54:11 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=0 (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503

          2008. 6. 17 AM 10:54:11 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : MASTER_CHANGE_EVENT
          2008. 6. 17 AM 10:54:16 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=1 (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503

          2008. 6. 17 AM 10:54:16 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : MASTER_CHANGE_EVENT

          2008. 6. 17 AM 10:54:19 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=4 (aa7bffab-719d-4312-bf3f-d6cce8d516bc) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
          2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
          3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

          2008. 6. 17 AM 10:54:19 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
          JMS processJoinedAndReady:5ad203f8-c469-4239-86a2-da9b026677d6
          ***JoinNotification received: state = READY,GroupLeader = true, Signal.getMemberToken() = 5ad203f8-c469-4239-86a2-da9b026677d6, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6
          JMS processJoinedAndReady:aa7bffab-719d-4312-bf3f-d6cce8d516bc
          JMS processJoinedAndReady:e9fa5e76-9518-4149-b415-b46a7f16111a
          2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=6 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
          2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
          3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

          2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
          2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=3 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
          2: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

          2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
          2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=11 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
          2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
          3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

          2008. 6. 17 AM 10:54:25 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
          ***JoinNotification received: state = ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() = aa7bffab-719d-4312-bf3f-d6cce8d516bc, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6
          ***JoinNotification received: state = ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() = e9fa5e76-9518-4149-b415-b46a7f16111a, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6
          2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=13 (e9fa5e76-9518-4149-b415-b46a7f16111a) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
          2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
          3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

          2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : JOINED_AND_READY_EVENT
          2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
          INFO: GMS View Change Received for group JEUS,seq=14 (5ad203f8-c469-4239-86a2-da9b026677d6) : Members in view for (before change analysis) are :
          1: MemberId: 5ad203f8-c469-4239-86a2-da9b026677d6, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7C1D5E85B98D5148F2ACA6F7551C6CBE0503
          2: MemberId: aa7bffab-719d-4312-bf3f-d6cce8d516bc, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFB79C2C9C58644848A82BEE8D54099C803
          3: MemberId: e9fa5e76-9518-4149-b415-b46a7f16111a, MemberType: CORE, Address: urn:jxta:uuid-6AC033641A804B22A99AA1BD7DA33B7CFDBAE602FE0A4D4D933DEA0A2C5907FA03

          2008. 6. 17 AM 10:54:28 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
          INFO: Analyzing new membership snapshot received as part of event : JOINED_AND_READY_EVENT
          ***JoinNotification received: state = ALIVEANDREADY,GroupLeader = true, Signal.getMemberToken() = aa7bffab-719d-4312-bf3f-d6cce8d516bc, Leader = 5ad203f8-c469-4239-86a2-da9b026677d6

          --------------------

          As you know, the cluster view manager calls getLocalView(), and each call to getLocalView() increments the ClusterView ID before the cluster view manager notifies listeners.
          So this log means that ViewWindow can receive an old view for a while in the concurrent case. I can understand this case because the cluster view manager can notify listeners from separate threads, for example from the MasterNode and HealthMonitor threads.

          But the newViewObserved() and getMemberTokens() methods in ViewWindow always update the cluster view without checking the ID.
          If the cluster view is always updated, unexpected results can occur. For example, in the log above, a join notification can be duplicated because we sometimes use the view history to decide on join notifications.

          When I reviewed ViewWindow to write a patch for this problem, I could not decide how to handle an old view.
          case 1) An old view should not be inserted into the view history, but its signals should still be queued if necessary.
              - But in this case, when the user receives the signals, the current core members and all current members do not belong to the current view.
          case 2) An old view could be inserted into the view history. But then the algorithm for notifying joins and the generation of failure recovery signals must take into account that the view history can be newer than the current packet.
              - But in this case, I think it is strange that the view history is not in order.
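
Case 1 can be sketched roughly as follows, assuming an ID check is added before the history is updated. SketchViewWindow is a hypothetical simplification, not the real ViewWindow: views are reduced to their ids and signals to event names. It does not address the caveat listed under case 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of ViewWindow for case 1.
final class SketchViewWindow {
    private final List<Long> history = new ArrayList<>();    // ids of accepted views
    private final List<String> signals = new ArrayList<>();  // events queued for delivery
    private long latestViewId = -1;

    // The event is always queued, but a stale view never enters the history.
    synchronized boolean newViewObserved(long viewId, String event) {
        signals.add(event);
        if (viewId <= latestViewId) {
            return false; // old snapshot: history stays in order
        }
        latestViewId = viewId;
        history.add(viewId);
        return true;
    }

    synchronized List<Long> history() {
        return new ArrayList<>(history);
    }

    synchronized List<String> signals() {
        return new ArrayList<>(signals);
    }
}

final class ViewWindowDemo {
    public static void main(String[] args) {
        // Ids and events in the order they appear in the log above.
        long[] ids = {0, 1, 4, 6, 3, 11, 13, 14};
        String[] events = {
            "MASTER_CHANGE_EVENT", "MASTER_CHANGE_EVENT", "ADD_EVENT", "ADD_EVENT",
            "ADD_EVENT", "ADD_EVENT", "JOINED_AND_READY_EVENT", "JOINED_AND_READY_EVENT"
        };
        SketchViewWindow window = new SketchViewWindow();
        for (int i = 0; i < ids.length; i++) {
            window.newViewObserved(ids[i], events[i]);
        }
        System.out.println("history: " + window.history());
        System.out.println("signals queued: " + window.signals().size());
    }
}
```

With the ids from the log, the view with seq=3 is the only one kept out of the history, while all eight events are still queued.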

          I would like to hear your opinions and possible solutions.
          Please advise me.

          Thanks.

          --
          Bongjae Chang