users@shoal.java.net

Re: [Shoal-Users] Still not sure it's working

From: Shreedhar Ganapathy <Shreedhar.Ganapathy_at_Sun.COM>
Date: Sun, 29 Jun 2008 23:28:13 -0700

Hi Bongjae
You are right about this fix being partial and I saw this earlier today
while running a simple test.

We do need to think about this a bit more in terms of duplicate joins.
I will look into the provider code tomorrow in MasterNode to see how it
treats members who are not in the previous view but for whom an Add
event has not been fired.
If you get to it sooner, let me know.

Thanks
Shreedhar

Bongjae Chang wrote:
> Hi Shreedhar and Mike.
>
> I know Shreedhar is waiting for Mike's response about Issue
> #66(https://shoal.dev.java.net/issues/show_bug.cgi?id=66)
>
> But I think maybe this patch is not enough.
>
> I expect that Server2 will receive duplicated join notification.
>
> Here is my logs.
>
> Assume that Server1's server token is
> 517a0864-1a11-4474-b40a-aff1ce79c12e and Server2's server token is
> 347f676f-5300-4f40-a409-7ee2d7882c14.
>
>
> [Server1]
> -------------------------------------------------------------------
> 2008. 6. 30 오후 1:22:57
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> 정보: Starting SimpleJoinTest....
> 2008. 6. 30 오후 1:22:58
> com.sun.enterprise.shoal.jointest.SimpleJoinTest initializeGMS
> 정보: Initializing Shoal for member:
> 517a0864-1a11-4474-b40a-aff1ce79c12e group:TestGroup
> 2008. 6. 30 오후 1:22:58
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> 정보: Registering for group event notifications
> 2008. 6. 30 오후 1:22:58
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> 정보: Joining Group TestGroup
> 2008. 6. 30 오후 1:22:59
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> 정보: GMS View Change Received for group TestGroup : Members in view
> for (before change analysis) are :
> 1: MemberId: 517a0864-1a11-4474-b40a-aff1ce79c12e, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEAE124E2EAD144600A358C3E2A889E9F203
>
> 2008. 6. 30 오후 1:22:59
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> 정보: Analyzing new membership snapshot received as part of event :
> MASTER_CHANGE_EVENT
> ****JoinNotification received: GroupLeader = false,
> Signal.getMemberToken() = 517a0864-1a11-4474-b40a-aff1ce79c12e,
> ServerName = 517a0864-1a11-4474-b40a-aff1ce79c12e, Leader =
> 517a0864-1a11-4474-b40a-aff1ce79c12e*
> 2008. 6. 30 오후 1:23:05
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> 정보: GMS View Change Received for group TestGroup : Members in view
> for (before change analysis) are :
> 1: MemberId: 517a0864-1a11-4474-b40a-aff1ce79c12e, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEAE124E2EAD144600A358C3E2A889E9F203
> 2: MemberId: 347f676f-5300-4f40-a409-7ee2d7882c14, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC059DA8E0EC34A5696DB9C670EAB932503
>
> 2008. 6. 30 오후 1:23:06
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> 정보: Analyzing new membership snapshot received as part of event :
> ADD_EVENT
> ****JoinNotification received: GroupLeader = true,
> Signal.getMemberToken() = 347f676f-5300-4f40-a409-7ee2d7882c14,
> ServerName = 517a0864-1a11-4474-b40a-aff1ce79c12e, Leader =
> 517a0864-1a11-4474-b40a-aff1ce79c12e
> -------------------------------------------------------------------
> *
>
>
> [Server2]
> -------------------------------------------------------------------
> 2008. 6. 30 오후 1:23:04
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> 정보: Starting SimpleJoinTest....
> 2008. 6. 30 오후 1:23:05
> com.sun.enterprise.shoal.jointest.SimpleJoinTest initializeGMS
> 정보: Initializing Shoal for member:
> 347f676f-5300-4f40-a409-7ee2d7882c14 group:TestGroup
> 2008. 6. 30 오후 1:23:05
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> 정보: Registering for group event notifications
> 2008. 6. 30 오후 1:23:05
> com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
> 정보: Joining Group TestGroup
> 2008. 6. 30 오후 1:23:05
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> 정보: GMS View Change Received for group TestGroup : Members in view
> for (before change analysis) are :
> 1: MemberId: 347f676f-5300-4f40-a409-7ee2d7882c14, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC059DA8E0EC34A5696DB9C670EAB932503
>
> 2008. 6. 30 오후 1:23:05
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> 정보: Analyzing new membership snapshot received as part of event :
> MASTER_CHANGE_EVENT
> ****JoinNotification received: GroupLeader = false,
> Signal.getMemberToken() = 347f676f-5300-4f40-a409-7ee2d7882c14,
> ServerName = 347f676f-5300-4f40-a409-7ee2d7882c14, Leader =
> 347f676f-5300-4f40-a409-7ee2d7882c14
> *2008. 6. 30 오후 1:23:06
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> 정보: GMS View Change Received for group TestGroup : Members in view
> for (before change analysis) are :
> 1: MemberId: 517a0864-1a11-4474-b40a-aff1ce79c12e, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEAE124E2EAD144600A358C3E2A889E9F203
> 2: MemberId: 347f676f-5300-4f40-a409-7ee2d7882c14, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC059DA8E0EC34A5696DB9C670EAB932503
>
> 2008. 6. 30 오후 1:23:06
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> 정보: Analyzing new membership snapshot received as part of event :
> MASTER_CHANGE_EVENT
> ****JoinNotification received: GroupLeader = false,
> Signal.getMemberToken() = 517a0864-1a11-4474-b40a-aff1ce79c12e,
> ServerName = 347f676f-5300-4f40-a409-7ee2d7882c14, Leader =
> 517a0864-1a11-4474-b40a-aff1ce79c12e
> *2008. 6. 30 오후 1:23:06
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
> 정보: GMS View Change Received for group TestGroup : Members in view
> for (before change analysis) are :
> 1: MemberId: 517a0864-1a11-4474-b40a-aff1ce79c12e, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEAE124E2EAD144600A358C3E2A889E9F203
> 2: MemberId: 347f676f-5300-4f40-a409-7ee2d7882c14, MemberType: CORE,
> Address:
> urn:jxta:uuid-0836778E36C54F728D5B934A965395CEC059DA8E0EC34A5696DB9C670EAB932503
>
> 2008. 6. 30 오후 1:23:06
> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
> 정보: Analyzing new membership snapshot received as part of event :
> ADD_EVENT
> ****JoinNotification received: GroupLeader = false,
> Signal.getMemberToken() = 347f676f-5300-4f40-a409-7ee2d7882c14,
> ServerName = 347f676f-5300-4f40-a409-7ee2d7882c14, Leader =
> 517a0864-1a11-4474-b40a-aff1ce79c12e*
> -------------------------------------------------------------------
>
> The problem is the following.
>
> Sever1 and Server2 will receive own join notifications by current
> changes which Shreedhar edited.
> But after they receive own join notifications from first
> MASTER_CHANGE_EVENT , Server1 will send Server2 ADD_EVENT because
> Server1 is the master and is aware of Server2 newly.
> Then Server2 will receive ADD_EVENT from Server1 which is the master
> and notify duplicate join notification.
>
> So we can know that though any member receives ADD_EVENT from master,
> the member doesn't have to notify join notification without checking
> view history's changes. The member should always notify join
> notification by view histroy's changes like MASTER_CHANGE_EVENT.
>
> Or Shreedhar's current patch should be applied to only master.
>
> I expect that maybe Mike will obtain similar result. :-)
>
> Thanks.
>
> --
> Bongjae Chang
>
>
>
> ----- Original Message -----
> *From:* Shreedhar Ganapathy <mailto:Shreedhar.Ganapathy_at_Sun.COM>
> *To:* users_at_shoal.dev.java.net <mailto:users_at_shoal.dev.java.net>
> *Sent:* Saturday, June 28, 2008 6:42 AM
> *Subject:* Re: [Shoal-Users] Still not sure it's working
>
> Hi Mike
> I have checked in a fix to ViewWindow where some wierd logic
> earlier was excluding the first view change from being notified.
> Could you try with the latest sources?
>
> Thanks
> Shreedhar
>
> Shreedhar Ganapathy wrote:
>> Thanks for bringing this up. We may have a bug in our internal
>> test harness which should be reporting these issues and hence we
>> have not seen it yet. Let me check.
>> Could you meantime file an issue in the tracker?
>> I know Bongjae Chang had made some observations on this a while
>> ago and his patch is awaiting review and integration. That might
>> have a fix.
>>
>> Thanks
>> Shreedhar
>>
>> Mike Wannamaker wrote:
>>>
>>> I believe it’s a bug then. In both accounts. My SERVER-1 only
>>> gets a SERVER-1 message and SERVER-2 only gets a SERVER-1 message.
>>>
>>> So when I say it only gets a SERVER-1 message I mean that the method
>>>
>>> public void processNotification(Signal p_Signal)
>>>
>>> is only being called with that message.
>>>
>>> So if I start just SERVER-1, I see the GMS View Changed message,
>>> with just the SERVER-1 in it, but my processNotification(…) is
>>> not called. Not until I start SERVER-2 does it get called.
>>>
>>> On SERVER-2, I see the original GMS View Changed with just
>>> SERVER-2, and then GMS View Changed with SERVER-1,SERVER-2, but
>>> only get one processNotification(…) call.
>>>
>>> I will investigate further next week, but if you could have a
>>> look that would be great. Is no one else seeing this?
>>>
>>> This is my processNotification() method
>>>
>>> public void processNotification(Signal p_Signal)
>>>
>>> {
>>>
>>> try
>>>
>>> {
>>>
>>> p_Signal.acquire();
>>>
>>> SignalLogger log = new
>>> SignalLogger(p_Signal);
>>>
>>> log.logIt();
>>>
>>> if(p_Signal instanceof
>>> MessageSignal)
>>>
>>> {
>>>
>>> MessageSignal
>>> msgSig = (MessageSignal)p_Signal;
>>>
>>> String sMember =
>>> msgSig.getMemberToken();
>>>
>>> Object o =
>>> ObjectUtil.toObject(msgSig.getMessage());
>>>
>>> if(o instanceof
>>> SMessage)
>>>
>>> {
>>>
>>>
>>> SMessage smsg = (SMessage)o;
>>>
>>>
>>> InetAddress sender = m_hmMembers.get(sMember).address;
>>>
>>>
>>> smsg.setSender(sender);
>>>
>>>
>>> SMessageLogger.log.systemInfo(getClass(), "FireMessage: " + smsg);
>>>
>>>
>>> fireMessageReceived(smsg);
>>>
>>>
>>> //fireMessageReceived(smsg);
>>>
>>> }
>>>
>>> else
>>>
>>> {
>>>
>>>
>>> SMessageLogger.log.systemInfo(getClass(), "Message is NOT
>>> SMessage??");
>>>
>>> }
>>>
>>> }
>>>
>>> else if(p_Signal instanceof
>>> JoinNotificationSignal)
>>>
>>> {
>>>
>>>
>>> JoinNotificationSignal joinSig = (JoinNotificationSignal)p_Signal;
>>>
>>>
>>> processClusterNotification();
>>>
>>> }
>>>
>>> else if(p_Signal instanceof
>>> JoinedAndReadyNotificationSignal)
>>>
>>> {
>>>
>>>
>>> JoinedAndReadyNotificationSignal joinSig =
>>> (JoinedAndReadyNotificationSignal)p_Signal;
>>>
>>>
>>> processClusterNotification();
>>>
>>> }
>>>
>>> else if(p_Signal instanceof
>>> FailureSuspectedSignal)
>>>
>>> {
>>>
>>>
>>> FailureSuspectedSignal suspectSig =
>>> (FailureSuspectedSignal)p_Signal;
>>>
>>>
>>> processClusterNotification();
>>>
>>> }
>>>
>>> else if(p_Signal instanceof
>>> FailureRecoverySignal)
>>>
>>> {
>>>
>>>
>>> FailureRecoverySignal failureSig = (FailureRecoverySignal)p_Signal;
>>>
>>>
>>> processClusterNotification();
>>>
>>> }
>>>
>>> else if(p_Signal instanceof
>>> FailureNotificationSignal)
>>>
>>> {
>>>
>>>
>>> FailureNotificationSignal failureSig =
>>> (FailureNotificationSignal)p_Signal;
>>>
>>>
>>> processClusterNotification();
>>>
>>> }
>>>
>>> else if(p_Signal instanceof
>>> PlannedShutdownSignal)
>>>
>>> {
>>>
>>>
>>> PlannedShutdownSignal shutdownSig = (PlannedShutdownSignal)p_Signal;
>>>
>>>
>>> processClusterNotification();
>>>
>>> }
>>>
>>> else
>>>
>>> {
>>>
>>>
>>> SMessageLogger.log.debug(getClass(), "Received Notification of
>>> type : " + p_Signal.getClass().getName() + " Server: " +
>>> p_Signal.getMemberToken());
>>>
>>> }
>>>
>>> }
>>>
>>> catch(SignalAcquireException e)
>>>
>>> {
>>>
>>>
>>> SMessageLogger.log.fatal(getClass(), "Exception occured while
>>> acquiring signal", e);
>>>
>>> }
>>>
>>> finally
>>>
>>> {
>>>
>>> try
>>>
>>> {
>>>
>>> p_Signal.release();
>>>
>>> }
>>>
>>> catch(SignalReleaseException e)
>>>
>>> {
>>>
>>>
>>> SMessageLogger.log.warn(getClass(), "Exception occured while
>>> releasing signal" , e);
>>>
>>> }
>>>
>>> }
>>>
>>> }
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *From:* Shreedhar.Ganapathy_at_Sun.COM
>>> [mailto:Shreedhar.Ganapathy_at_Sun.COM]
>>> *Sent:* June 27, 2008 11:21 AM
>>> *To:* users_at_shoal.dev.java.net
>>> *Subject:* Re: [Shoal-Users] Still not sure it's working
>>>
>>> Hi Mike
>>> The expected behavior is that as each server starts, its
>>> registered GMS client components will be notified of the
>>> server's own joining the group and any subsequent joins of other
>>> members.
>>> So in essence, server-1 GMS clients should see a
>>> JoinNotificationSignal for server-1, and another for server-2
>>> and in server-2, GMS clients should see a JoinNotificationSignal
>>> for server-2 and another for server-1.
>>> The order here does not matter but correctness is important and
>>> if not its a bug to be fixed.
>>>
>>> In the log below, Server-1 seems to be getting its own
>>> JoinNotificationSignal which is correct. Does it ever get the
>>> JoinNotificationSignal for server-2?
>>> On server-2, I am seeing correct behavior.
>>>
>>> (Ignore the log statements that show the view contents, as that
>>> is an event coming from the provider implementation - GMS
>>> notification signals are the ones that GMS clients should look
>>> in for correctness).
>>>
>>> Let me know.
>>> Thanks
>>> Shreedhar
>>>
>>> Mike Wannamaker wrote:
>>>
>>> Hi Guys,
>>>
>>> I’m still not sure it’s working as it’s supposed to? But maybe
>>> it is?
>>>
>>> Start SERVER-1
>>>
>>> Start SERVER-2
>>>
>>> On SERVER-1 I get a JoinMessage but it is from SERVER-1?
>>>
>>> On SERVER-2 I get a Join Message from SERVER-1,
>>> which is what I would expect?
>>>
>>> Is this correct? This depends on when the two servers are
>>> started. If I wait for a period between startups I get SERVER-2
>>> startup message on SERVER-1 and SERVER-1 startup message on
>>> SERVER-2. But if I start them both at the same time I get the
>>> above behaviour?
>>>
>>> Starting both at virtually the same time I get …
>>>
>>> SEVER-1 Output:
>>>
>>> 27-Jun-2008 12:40:29 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>>>
>>> INFO: GMS View Change Received for group RCS_CLUSTER : Members
>>> in view for (before change analysis) are :
>>>
>>> 1: MemberId: SERVER-1, MemberType: CORE, Address:
>>> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603
>>>
>>> 27-Jun-2008 12:40:29 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>>>
>>> INFO: Analyzing new membership snapshot received as part of
>>> event : MASTER_CHANGE_EVENT
>>>
>>> 27-Jun-2008 12:40:38 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>>>
>>> INFO: GMS View Change Received for group RCS_CLUSTER : Members
>>> in view for (before change analysis) are :
>>>
>>> 1: MemberId: SERVER-2, MemberType: CORE, Address:
>>> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
>>>
>>> 2: MemberId: SERVER-1, MemberType: CORE, Address:
>>> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603
>>>
>>> 27-Jun-2008 12:40:38 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>>>
>>> INFO: Analyzing new membership snapshot received as part of
>>> event : ADD_EVENT
>>>
>>> 27-Jun-2008 12:40:38 AM DEBUG [pool-1-thread-1]
>>> com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - -
>>> SERVER-1 >> JoinNotificationSignalImpl @ 27/06/08 12:40 AM -
>>> [RCS_CLUSTER]:
>>> (Hashtable:[(String:server.name)<-->(String:SERVER-1),
>>> (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])
>>>
>>> Server-2 Output
>>>
>>> 27-Jun-2008 12:40:30 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>>>
>>> INFO: GMS View Change Received for group RCS_CLUSTER : Members
>>> in view for (before change analysis) are :
>>>
>>> 1: MemberId: SERVER-2, MemberType: CORE, Address:
>>> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
>>>
>>> 27-Jun-2008 12:40:30 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>>>
>>> INFO: Analyzing new membership snapshot received as part of
>>> event : MASTER_CHANGE_EVENT
>>>
>>> 27-Jun-2008 12:40:38 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
>>>
>>> INFO: GMS View Change Received for group RCS_CLUSTER : Members
>>> in view for (before change analysis) are :
>>>
>>> 1: MemberId: SERVER-2, MemberType: CORE, Address:
>>> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
>>>
>>> 2: MemberId: SERVER-1, MemberType: CORE, Address:
>>> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603
>>>
>>> 27-Jun-2008 12:40:38 AM
>>> com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
>>>
>>> INFO: Analyzing new membership snapshot received as part of
>>> event : ADD_EVENT
>>>
>>> 27-Jun-2008 12:40:44 AM DEBUG [pool-1-thread-1]
>>> com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - -
>>> SERVER-1 >> JoinNotificationSignalImpl @ 27/06/08 12:40 AM -
>>> [RCS_CLUSTER]:
>>> (Hashtable:[(String:server.name)<-->(String:SERVER-1),
>>> (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])
>>>