Excellent investigation!
The solution sounds reasonable.
Could you send us a patch so that we can try it out in our internal test setup?
Bongjae Chang wrote:
> I reviewed the join notification logic for (b).
> /(b) When a new member joins, it doesn't receive some members'
> join notifications. (In other words, it receives only its own
> notification and the group leader's notification.)/
> This is the normal case, e.g. member "B"'s behavior:
> 1. When a new member ("C") joins the group, the group leader (master)
> sends MASTERNODERESPONSE to the group members with ADD_EVENT (about "C")
> and its own view's snapshot.
> 2. Members receive MASTERNODERESPONSE and run
> processMasterNodeResponse().
> 3. In processMasterNodeResponse(), ClusterViewManager notifies ADD_EVENT
> with the master view's snapshot.
> 4. Then ViewWindow analyzes the event packet (ADD_EVENT).
> 5. Finally, members receive a join notification about the new member
> ("C").
> But in the new member ("C"), a problem occurs: there is no logic
> for notifying other members' ADD_EVENT (about "B").
> 1. When a new member ("C") joins the group, the group leader (master)
> sends MASTERNODERESPONSE to the group members with ADD_EVENT and its own
> view's snapshot. [same as above]
> 2. "C" receives MASTERNODERESPONSE and runs
> processMasterNodeResponse(). [same as above]
> 3. In processMasterNodeResponse(), MASTER_CHANGE_EVENT is notified
> *without the master view's snapshot* because the current master is itself.
> 4. Then ViewWindow analyzes the event packet (MASTER_CHANGE_EVENT). When
> ViewWindow receives MASTER_CHANGE_EVENT, it sends join notifications
> based on the view history if the previous view doesn't have any members.
> Maybe this is the logic meant to deliver other members' join
> notifications to the new member ("C"). But *the current view based on the
> event packet (MASTER_CHANGE_EVENT) is unfortunately not the master view*.
> The current view is only "C"'s local view (which currently contains only
> the master member and "C" itself). So only the master's join
> notification occurs.
> 5. In processMasterNodeResponse(), ClusterViewManager notifies ADD_EVENT
> with the master view's snapshot. [same as above]
> 6. Then ViewWindow analyzes the event packet (ADD_EVENT). [same as above]
> 7. The new member ("C") receives its own join notification. [same as above]
> So I think this problem can be fixed *if MASTER_CHANGE_EVENT is notified
> with the master view's snapshot* in step 3 above. Then, in step 4,
> ViewWindow can find that the previous view lacks the other members as
> well as the master member. And then, in step 5, ClusterViewManager in
> processMasterNodeResponse() can notify only ADD_EVENT without the master
> view's snapshot, because the MASTER_CHANGE_EVENT that was already
> notified included the master view's snapshot.
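
To make the difference between steps 3 and 4 concrete, here is a minimal sketch in Java. It is not Shoal's code: the class JoinNotificationSketch, the method joinsOnMasterChange, and the modelling of a view as a plain list of member names are all assumptions for illustration. It only models the rule described above (on MASTER_CHANGE_EVENT, if the previous view held nobody but this member, every other member in the event's view is reported as a join), and compares "C"'s local view with the master view's snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Shoal's actual classes: a view is an ordered list of member names.
final class JoinNotificationSketch {

    // Assumed rule: when MASTER_CHANGE_EVENT arrives and the previous view
    // contained no member other than self, report every other member of the
    // view carried by the event as a join.
    static List<String> joinsOnMasterChange(List<String> previousView,
                                            List<String> eventView,
                                            String self) {
        List<String> joins = new ArrayList<>();
        for (String member : previousView) {
            if (!member.equals(self)) {
                return joins; // previous view already had other members: nothing to report
            }
        }
        for (String member : eventView) {
            if (!member.equals(self)) {
                joins.add(member);
            }
        }
        return joins;
    }

    public static void main(String[] args) {
        List<String> previous = Arrays.asList("C");
        List<String> localView = Arrays.asList("C", "A");           // what "C" has today
        List<String> masterSnapshot = Arrays.asList("C", "A", "B"); // what the proposal would pass

        System.out.println(joinsOnMasterChange(previous, localView, "C"));      // [A]
        System.out.println(joinsOnMasterChange(previous, masterSnapshot, "C")); // [A, B]
    }
}
```

With the local view only "A" is reported, which matches "C"'s log; with the master snapshot "B" is reported as well.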
> Actually, I tried applying this patch and verified
> that problem (b) is resolved.
> If issue (b) is identified as a bug and my suggestion has no
> logical mistakes, I will send the patch to the dev alias.
> Please point out any mistakes; I would appreciate your advice.
> Thanks.
> --
> Bongjae Chang
>
> ----- Original Message -----
> *From:* Bongjae Chang <mailto:carryel_at_korea.com>
> *To:* dev_at_shoal.dev.java.net <mailto:dev_at_shoal.dev.java.net>
> *Sent:* Thursday, June 05, 2008 2:38 PM
> *Subject:* [Shoal-Dev] Strange behavior about join notifications.
>
> Hi.
> When I tested join notifications, I found some problems.
> Assume that "A", "B" and "C" are members of "TestGroup".
> Sometimes, when a new member joins, it doesn't receive the join
> notifications of members that have already joined.
> The scenario is as follows.
> 1. First, "A" joined and became the group leader.
> 2. After 1, "B" joined. "B" then received "A"'s join notification
> and its own ("B") join notification. No problem.
> 3. After 2, "C" joined. At this point, "C" should receive the join
> notifications of "A", "B" and "C". But "C" didn't receive "B"'s join
> notification.
> Likewise, assuming that "A", "B", "C" and "D" are members of
> "TestGroup", "D" didn't receive "B"'s and "C"'s join notifications.
> I think there are some bugs.
> (a) In step 1 above, the group leader doesn't receive its own join
> notification.
> (b) In step 3 above, when a new member joins, it doesn't receive
> some members' join notifications. (In other words, it receives
> only its own notification and the group leader's notification.)
> You can also see this result in the following logs.
> /"A" (the group leader): member
> id="6a92713c-d83e-49a8-8aaa-ad12046a1acb"/
> /"B": member id="77ff0a1c-b9a1-417a-b04c-0028ef6da921"/
> /"C": member id="6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a"/
> /When members receive a join notification, "***JoinNotification
> received: ServerName = [MY_MEMBER_ID], Signal.getMemberToken() =
> [MEMBER_ID]" is printed./
> ["A"'s log]
> ------------------------------------------------------------------------
> 2008. 6. 5 PM 1:36:17
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Starting SimpleJoinTest....
> 2008. 6. 5 PM 1:36:18
> com.sun.enterprise.shoal.jointest.SimpleJoinTest initializeGMS
> INFO: Initializing Shoal for member:
> 6a92713c-d83e-49a8-8aaa-ad12046a1acb group:TestGroup
> 2008. 6. 5 PM 1:36:18
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Registering for group event notifications
> 2008. 6. 5 PM 1:36:18
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Joining Group TestGroup
> 2008. 6. 5 PM 1:36:18
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 2008. 6. 5 PM 1:36:18
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : MASTER_CHANGE_EVENT
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 2: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : ADD_EVENT
> 2008. 6. 5 PM 1:36:44
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 6a92713c-d83e-49a8-8aaa-ad12046a1acb, Signal.getMemberToken() =
> 77ff0a1c-b9a1-417a-b04c-0028ef6da921*
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE0F6B7D5CD8CC447180F2D059E273AD5103
> 2: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 3: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : ADD_EVENT
> 2008. 6. 5 PM 1:37:03
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 6a92713c-d83e-49a8-8aaa-ad12046a1acb, Signal.getMemberToken() =
> 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a*
> ------------------------------------------------------------------------
> "A"'s log doesn't have its own join
> notification (6a92713c-d83e-49a8-8aaa-ad12046a1acb).
> ["B"'s log]
> ------------------------------------------------------------------------
> 2008. 6. 5 PM 1:36:40
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Starting SimpleJoinTest....
> 2008. 6. 5 PM 1:36:40
> com.sun.enterprise.shoal.jointest.SimpleJoinTest initializeGMS
> INFO: Initializing Shoal for member:
> 77ff0a1c-b9a1-417a-b04c-0028ef6da921 group:TestGroup
> 2008. 6. 5 PM 1:36:40
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Registering for group event notifications
> 2008. 6. 5 PM 1:36:40
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Joining Group TestGroup
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : MASTER_CHANGE_EVENT
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 2: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : MASTER_CHANGE_EVENT
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 77ff0a1c-b9a1-417a-b04c-0028ef6da921, Signal.getMemberToken() =
> 6a92713c-d83e-49a8-8aaa-ad12046a1acb*
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 2: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : ADD_EVENT
> 2008. 6. 5 PM 1:36:41
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 77ff0a1c-b9a1-417a-b04c-0028ef6da921, Signal.getMemberToken() =
> 77ff0a1c-b9a1-417a-b04c-0028ef6da921*
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE0F6B7D5CD8CC447180F2D059E273AD5103
> 2: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 3: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : ADD_EVENT
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 77ff0a1c-b9a1-417a-b04c-0028ef6da921, Signal.getMemberToken() =
> 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a*
> ------------------------------------------------------------------------
> ["C"'s log]
> ------------------------------------------------------------------------
> 2008. 6. 5 PM 1:36:59
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Starting SimpleJoinTest....
> 2008. 6. 5 PM 1:36:59
> com.sun.enterprise.shoal.jointest.SimpleJoinTest initializeGMS
> INFO: Initializing Shoal for member:
> 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a group:TestGroup
> 2008. 6. 5 PM 1:36:59
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Registering for group event notifications
> 2008. 6. 5 PM 1:36:59
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> INFO: Joining Group TestGroup
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE0F6B7D5CD8CC447180F2D059E273AD5103
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : MASTER_CHANGE_EVENT
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE0F6B7D5CD8CC447180F2D059E273AD5103
> 2: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : MASTER_CHANGE_EVENT
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, Signal.getMemberToken() =
> 6a92713c-d83e-49a8-8aaa-ad12046a1acb*
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> INFO: GMS View Change Received for group TestGroup : Members in
> view for (before change analysis) are :
> 1: MemberId: 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE0F6B7D5CD8CC447180F2D059E273AD5103
> 2: MemberId: 6a92713c-d83e-49a8-8aaa-ad12046a1acb, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CE15F3706F0E794BF595FCEE9EEA90FCE103
> 3: MemberId: 77ff0a1c-b9a1-417a-b04c-0028ef6da921, MemberType:
> CORE, Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC9482BF0C6A44D55B407E7E3A8D1339803
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> INFO: Analyzing new membership snapshot received as part of event
> : ADD_EVENT
> 2008. 6. 5 PM 1:37:00
> com.sun.enterprise.shoal.jointest.SimpleJoinTest$JoinNotificationCallBack
> processNotification
> *INFO: ***JoinNotification received: ServerName =
> 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a, Signal.getMemberToken() =
> 6a8e7161-92ef-4b9e-a5e1-d9a8c7665b4a*
> ------------------------------------------------------------------------
> "C"'s log doesn't have "B"'s join
> notification (77ff0a1c-b9a1-417a-b04c-0028ef6da921).
> I have also attached simple test code.
> PS) When I call GMSFactory.startGMSModule()
> without the properties parameter (i.e. null), a simple NPE occurs. The
> exception is as follows.
> ------------------------------------------------------------------------
> Exception in thread "main" java.lang.NullPointerException
> at com.sun.enterprise.jxtamgmt.ClusterManager.<init>(ClusterManager.java:161)
> at com.sun.enterprise.ee.cms.impl.jxta.GroupCommunicationProviderImpl.initializeGroupCommunicationProvider(GroupCommunicationProviderImpl.java:138)
> at com.sun.enterprise.ee.cms.impl.jxta.GMSContext.join(GMSContext.java:122)
> at com.sun.enterprise.ee.cms.impl.common.GroupManagementServiceImpl.join(GroupManagementServiceImpl.java:331)
> at com.sun.enterprise.shoal.jointest.SimpleJoinTest.runSimpleSample(SimpleJoinTest.java:40)
> at com.sun.enterprise.shoal.jointest.SimpleJoinTest.main(SimpleJoinTest.java:20)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
> ------------------------------------------------------------------------
> In ClusterManager.java:161
> ------------------------------------------------------------------------
> /this.bindInterfaceAddress =
> (String)props.get(JxtaConfigConstants.BIND_INTERFACE_ADDRESS.toString());/
> ------------------------------------------------------------------------
> Maybe this is a side effect of sheetalv's "Fix for the power outage
> issue". :-)
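
One possible guard for the NPE above, as a sketch only. The type of props (taken here to be a java.util.Map), the class name NullSafePropsSketch, and the string key are assumptions; the real key comes from JxtaConfigConstants.BIND_INTERFACE_ADDRESS.toString(), and the actual fix in ClusterManager may well look different.

```java
import java.util.HashMap;
import java.util.Map;

final class NullSafePropsSketch {

    // Hypothetical stand-in for JxtaConfigConstants.BIND_INTERFACE_ADDRESS.toString().
    static final String BIND_INTERFACE_ADDRESS = "BIND_INTERFACE_ADDRESS";

    // Returns the configured bind address, or null when props is null
    // or holds no String entry under the key.
    static String bindInterfaceAddress(Map<?, ?> props) {
        if (props == null) {
            return null; // no properties were passed: skip the lookup instead of throwing
        }
        Object value = props.get(BIND_INTERFACE_ADDRESS);
        return value instanceof String ? (String) value : null;
    }

    public static void main(String[] args) {
        System.out.println(bindInterfaceAddress(null)); // null, no exception

        Map<String, String> props = new HashMap<>();
        props.put(BIND_INTERFACE_ADDRESS, "192.168.0.10");
        System.out.println(bindInterfaceAddress(props)); // 192.168.0.10
    }
}
```

Whether a null properties argument should be tolerated here or rejected earlier in GMSFactory.startGMSModule() is a design decision for the maintainers.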
> --
> Bongjae Chang
>