users@shoal.java.net

Re: [Shoal-Users] Re: send message after join

From: Jerry Raj <jerryr_at_sun.com>
Date: Tue, 23 Jun 2009 10:39:19 +0530

Joseph Fialli wrote:
> Jerry Raj wrote:
>> Joseph Fialli wrote:
>>
>>> Bongjae Chang wrote:
>>>
>>>> Hi,
>>>>
>>>> Joe wrote:
>>>>
>>>>
>>>>> It is not that join is not complete for the sender, the sender is
>>>>> definitely part of the group when join returns.
>>>>>
>>> The answer below assumes that Jerry is using the sendMessage that
>>> broadcasts a message to all instances of the
>>> gms group.
>>> Bongjae's comments reminded me that the send message to all instances in
>>> the cluster is no longer a broadcast in Shoal 1.5 as
>>> it was in Shoal 1.0. So to answer Jerry's original question, that is
>>> what has changed between Shoal 1.0 and Shoal 1.5 that he
>>> is noticing.
>>>
>>> GroupHandle.sendMessage(String targetComponent, byte[]) goes down a
>>> different code path in Shoal 1.5, one that is affected by the
>>> initialization that Bongjae describes below. Instead of relying solely
>>> on UDP broadcast, the message is sent to each instance one at a time
>>> (over TCP), based on the instances that appear in the ClusterViewManager
>>> view. It takes much longer for an instance to get into the cluster view
>>> than it does for a GMS client to "join" a multicast group (which is the
>>> question I originally answered for Jerry). I had overlooked the new
>>> broadcast mechanism added to Shoal 1.5 to increase the reliability of
>>> broadcasting.
>>>
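To make the difference concrete, here is a toy model of the two delivery styles described above. It is not Shoal code: the class and the recipients() helper are made up for illustration, and the only assumption is the one stated in the thread, namely that the newer path sends point-to-point to whoever is in the sender's cluster view, while the older path reaches whoever has joined the multicast group.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: none of these names are Shoal classes or methods. */
public class ViewBasedSendDemo {

    /** Returns the peers a send would reach, given the set of peers the sender can address. */
    static List<String> recipients(String sender, Set<String> knownPeers) {
        List<String> reached = new ArrayList<>();
        for (String peer : knownPeers) {
            if (!peer.equals(sender)) {
                reached.add(peer);
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        // Multicast group membership is effective as soon as a process joins.
        Set<String> multicastGroup = new LinkedHashSet<>();
        multicastGroup.add("node1"); // joined long ago
        multicastGroup.add("node2"); // joins now and sends immediately

        // node2's local cluster view lags behind: right after join it only knows itself.
        Set<String> node2View = new LinkedHashSet<>();
        node2View.add("node2");

        System.out.println("multicast reaches:  " + recipients("node2", multicastGroup));
        System.out.println("view-based reaches: " + recipients("node2", node2View));

        node2View.add("node1"); // discovery completes some time later
        System.out.println("after discovery:    " + recipients("node2", node2View));
    }
}
```

Run as-is, this prints [node1], then [], then [node1], which matches the symptom reported in this thread: a send issued immediately after join goes nowhere on the view-based path.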
>>> The following is a workaround to get the Shoal 1.0 behavior for sending
>>> messages (which is just a multicast broadcast).
>>>
>>> Use GroupHandle.sendMessage(String destinationName, String
>>> targetComponent, byte[] msg).
>>> Calling gh.sendMessage((String)null,
>>> "SubstituteYourTargetComponentName", msg)
>>> will result in a UDP broadcast rather than going down the code path
>>> that was introduced in Shoal 1.5 for broadcasting to the cluster.
>>>
>>> I have verified, by altering the MultiThreadedMessageSender test, that
>>> this workaround works. With the first sendMessage() mentioned in this
>>> email, the first 928 messages do not get sent between the two
>>> MultiThreadSendMessage tests. With the workaround, all messages get
>>> sent, even with the sleeps in MultiThreadSendMessage commented out.
>>>
>>> Please confirm that this addresses your issue. We will write up a shoal
>>> message sending test to
>>> capture this behavior and work on documenting this better.
>>>
>>
>>
>> Many thanks for everyone's help. This works perfectly. I simply
>> changed the call
>> from
>> gh.sendMessage(topic, msg);
>> to
>> gh.sendMessage((String)null, topic, msg);
>>
>> Now the message is reliably received by existing peers.
>>
>> Is this the recommended way, though? Because, if I understand correctly,
>> the reason to go from UDP multicast to individual TCP sends is to
>> increase reliability? Are we losing some reliability by going back to
>> UDP?
>>
>> I guess what we really need is either a callback or a method in gms to
>> figure out whether the clusterview formation is complete?
>>
> Jerry,
>
> Glad to at least point out the difference from Shoal 1.0 to 1.5 for your
> send message call.
>
> I have worked in the past on Java Message Service (JMS), and it is always
> challenging to coordinate loosely coupled processes so that one of them
> can broadcast a message and be sure all processes get the first message.
>
> You are correct that the switch from UDP to individual TCP sends was to
> increase reliability of message
> delivery. The workaround I provided was not planned, but just something
> I noted as a way to allow
> users to have a workaround to get the Shoal 1.0 behavior.
> I will put together a test case that shows how we propose users deal
> with the issue you are reporting.
> However, how does your system know when the clusterview formation is
> complete? I am pretty certain you
> mentioned in your last message that you don't know how many members are
> in the GMS group.

Right, members can come and go as they please. I simply need a deterministic way
that a new member can use to broadcast a piece of info to existing members, and
be sure that all existing members will receive it. I can live with some
uncertainty over members that joined at the same time or are in the process of
leaving, but those that joined a while ago and are not leaving should receive
the broadcast.
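
One application-level way to get that guarantee, independent of which send path is used, might be to resend the announcement until every member that was present at join time has acknowledged it. The following is only a sketch under that assumption: the Transport interface and its sendAndCollectAcks() method are hypothetical stand-ins and are not part of the Shoal API.

```java
import java.util.Set;
import java.util.TreeSet;

/** Sketch of "announce until acknowledged"; nothing here is a Shoal class. */
public class AnnounceUntilAckedDemo {

    /** Hypothetical transport: sends to the targets and returns those that acknowledged. */
    interface Transport {
        Set<String> sendAndCollectAcks(Set<String> targets);
    }

    /**
     * Resends to the members that have not yet acknowledged, up to maxAttempts times.
     * Returns the members still unacknowledged (empty means everyone got the message).
     */
    static Set<String> announce(Set<String> existingMembers, Transport transport, int maxAttempts) {
        Set<String> pending = new TreeSet<>(existingMembers);
        for (int attempt = 0; attempt < maxAttempts && !pending.isEmpty(); attempt++) {
            // Pass a copy so the transport cannot alias the set being modified.
            pending.removeAll(transport.sendAndCollectAcks(new TreeSet<>(pending)));
        }
        return pending;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lossy transport: the first attempt is lost entirely, later ones succeed.
        Transport lossy = targets -> (calls[0]++ == 0) ? new TreeSet<String>() : targets;

        Set<String> members = new TreeSet<>();
        members.add("node1");
        members.add("node3");

        System.out.println("unacknowledged: " + announce(members, lossy, 3));
        System.out.println("attempts used:  " + calls[0]);
    }
}
```

With the simulated transport above, the first attempt is dropped and the second succeeds, so the sketch prints an empty set and 2 attempts. Members that join or leave during the retries are outside its scope, which fits the uncertainty I said I can live with.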

-Jerry

>
> All solutions that I had for synchronizing broadcast messages in JMS
> included knowledge in the sender of how many other processes it was
> waiting for. The initial message sent out was not a content message, but
> a STARTUP message. When all the other processes had replied to the
> STARTUP message, the application knew that all the loosely coupled
> processes had their JMS subscriptions initialized and were ready to
> proceed.
>
>
> -Joe
>
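The STARTUP handshake Joe describes can be sketched roughly as follows. This is a minimal in-process simulation, not JMS or Shoal code: the peers are plain threads, the method name awaitAllReady() is invented for the example, and it assumes, as Joe's approach does, that the sender knows how many peers to expect.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch of a STARTUP/reply handshake; the "transport" is just in-process threads. */
public class StartupHandshakeDemo {

    /**
     * Simulates sending STARTUP and waiting for one reply per expected peer.
     * Only respondingPeers of them actually reply. Returns true if every
     * expected reply arrives before the timeout, false otherwise.
     */
    static boolean awaitAllReady(int expectedPeers, int respondingPeers, long timeoutMillis) {
        CountDownLatch replies = new CountDownLatch(expectedPeers);
        ExecutorService peers = Executors.newCachedThreadPool();
        try {
            for (int i = 0; i < respondingPeers; i++) {
                // Each simulated peer "receives" STARTUP and replies exactly once.
                peers.execute(replies::countDown);
            }
            return replies.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            peers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("all replied: " + awaitAllReady(3, 3, 2000));
        System.out.println("one missing: " + awaitAllReady(3, 2, 200));
    }
}
```

Only after awaitAllReady() returns true would the sender move on to content messages. The weak point for my use case is the expectedPeers argument, since I do not know that number in advance.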
>> Thanks
>> -Jerry
>>
>>
>>
>>> -Joe Fialli
>>>
>>>
>>>> I have a slightly different opinion.
>>>>
>>>> I think that discovery should also have finished before join counts as
>>>> complete. The "join" logic includes a lot of initialization, including
>>>> MasterNode's and ClusterViewManager's init.
>>>>
>>>> When gms.join() is called, MasterNode's init is executed in a
>>>> separate thread, and MasterNode's init starts ClusterViewManager.
>>>>
>>>> ClusterViewManager could be related to GroupHandle#sendMessage().
>>>>
>>>> What do you think, Joe?
>>>>
>>>> And Jerry,
>>>>
>>>>
>>>>
>>>>>> Do you see a view change with the two members in it ?
>>>>>>
>>>>> Yes, I do.
>>>>>
>>>> Though you could see a view change with the two members, if you send a
>>>> message at that time, I think that the message could be lost. I am not
>>>> sure.
>>>>
>>>> But if you set the Shoal logger to Level.FINE, you may be able to
>>>> find the cause of the problem.
>>>>
>>>> First, you may want to check whether join is complete or not.
>>>>
>>>> I have sometimes used the following trick for this:
>>>> GroupManagementService#reportJoinedAndReadyState() returns after
>>>> the discovery time has passed.
>>>>
>>>> So I am curious to know the following test's result.
>>>>
>>>> <snip>
>>>> gms.join();
>>>>
>>>> gms.reportJoinedAndReadyState();
>>>>
>>>> // Commented: Thread.sleep(10000);
>>>> GroupHandle gh = gms.getGroupHandle();
>>>> gh.sendMessage(blah);
>>>> </snip>
>>>>
>>>> Could you please test it again?
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> Bongjae Chang
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Jerry Raj" <jerryr_at_sun.com>
>>>> To: <users_at_shoal.dev.java.net>
>>>> Sent: Friday, June 19, 2009 7:46 PM
>>>> Subject: Re: [Shoal-Users] Re: send message after join
>>>>
>>>>
>>>>
>>>>
>>>>> Shreedhar Ganapathy wrote:
>>>>>
>>>>>>>> Do you see a view change with the two members in it ?
>>>>>>>>
>>>>>>> Yes, I do.
>>>>>>>
>>>>>>>> If your send() is around the same time as the time the view change
>>>>>>>> happens then you will have message loss.
>>>>>>>>
>>>>>>> Ah. This must be it. Since the join and send happen immediately one
>>>>>>> after the other, it's quite likely that the send happens while the
>>>>>>> existing peers are processing the view change.
>>>>>>>
>>>>>>>
>>>>>>>> Also, with Shoal 1.1 we have made a lot of synchronization
>>>>>>>> improvements
>>>>>>>> for correctness, which may have cost somewhat at the time of
>>>>>>>> startup of
>>>>>>>> members.
>>>>>>>>
>>>>>>>> Could you try out the code snippet that Joe provided as a way to
>>>>>>>> ensure
>>>>>>>> that message sending happens only when requisite memberships in
>>>>>>>> the
>>>>>>>> group are in place ?
>>>>>>>>
>>>>>>> I have no pre-existing knowledge of the number of expected peers, so
>>>>>>> waiting for "all" of them to join has no real meaning for me. The
>>>>>>> idea here is that peers can join and leave as they please. The
>>>>>>> message they send as soon as they join is used by existing peers (if
>>>>>>> any) to gain certain data about the new peer (its IP address, the
>>>>>>> port it's listening on, etc.). So the code from below will not
>>>>>>> really work for me, since I have no idea if any more peers will join
>>>>>>> or not.
>>>>>>>
>>>>>>> I can live with having a sleep between join and send for now, except
>>>>>>> it seems rather non-deterministic: are we always sure that 5 secs is
>>>>>>> enough? Under load, will this go up? A clear callback or signal that
>>>>>>> says "It is now safe to send messages" would be much better.
>>>>>>>
>>>>>> Timing is always a challenge when there are distributed systems
>>>>>> involved, and the larger the number of members, the more involved it
>>>>>> gets to ensure virtual synchrony, especially on asynchronous systems.
>>>>>> We could eventually look at an ack-based system for membership
>>>>>> lifecycle events' view change messages, but there are costs with that
>>>>>> as well when memberships are large.
>>>>>>
>>>>>> One suggestion that may make it a bit better is that you could use
>>>>>> the joined and ready construct for letting the group know that you
>>>>>> are ready to receive and send messages. That is, when a member has
>>>>>> connected to the group you get a join notification signal. After
>>>>>> this, when the member is ready to send messages, it can call the
>>>>>> reportJoinedAndReadyState() API from the GroupManagementService
>>>>>> reference:
>>>>>> http://fisheye5.cenqua.com/browse/~raw,r=1.16/shoal/gms/src/java/com/sun/enterprise/ee/cms/core/GroupManagementService.java
>>>>>>
>>>>>>
>>>>>>
>>>>>> Members who have registered for receiving the joined and ready
>>>>>> notification signal and have joined the group would get that
>>>>>> notification and then can send out messages to that member.
>>>>>>
>>>>>> Hope this helps you get a bit closer to deterministic knowledge of
>>>>>> when
>>>>>> a member (or set of members) is/are ready to receive messages.
>>>>>>
>>>>> This is the opposite of my problem. In my case, I have two nodes, and
>>>>> the
>>>>> following is the order of execution in chronological order:
>>>>>
>>>>> Node1:
>>>>> t0: join()
>>>>> t1: send() --> goes nowhere, as expected
>>>>>
>>>>> Node2:
>>>>> t2: join()
>>>>> t3: send() --> goes nowhere, not expected
>>>>>
>>>>> In this case Node1 can register for a joined and ready notification,
>>>>> but it will
>>>>> not help, since node1 is not going to send anything.
>>>>>
>>>>> -Jerry
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>> -Jerry
>>>>>>>
>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Shreedhar
>>>>>>>>
>>>>>>>>
>>>>>>>>> The same identical code worked fine with Shoal 1.0.
>>>>>>>>>
>>>>>>>>> I hope this clarifies the use-case.
>>>>>>>>> -Jerry
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> However, there is not enough information in your original post to
>>>>>>>>>> confirm or deny this.
>>>>>>>>>>
>>>>>>>>>> You could delay sending a message to the group until a certain
>>>>>>>>>> number of members have joined, or
>>>>>>>>>> you could wait for all members to have joined via
>>>>>>>>>> JoinNotification event.
>>>>>>>>>>
>>>>>>>>>> Pull API:
>>>>>>>>>>
>>>>>>>>>> List<String> members =
>>>>>>>>>> gms.getGroupHandle().getAllCurrentMembers();
>>>>>>>>>> <wait to send first message until all expected number of members
>>>>>>>>>> have
>>>>>>>>>> joined>
>>>>>>>>>>
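The pull-style wait described above could be sketched like this, with a bounded timeout in place of an open-ended sleep. It is only a sketch: waitForMembers() is a made-up helper, and the member list comes from a Supplier so the example runs without a cluster. With Shoal, the supplier would presumably wrap the gms.getGroupHandle().getAllCurrentMembers() call shown above. It also assumes the expected member count is known, which is not true in my case.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

/** Sketch of polling a membership view until it is large enough or a timeout expires. */
public class WaitForMembersDemo {

    /** Returns true once the view holds at least `expected` members, false on timeout. */
    static boolean waitForMembers(Supplier<List<String>> currentMembers, int expected,
                                  long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (currentMembers.get().size() < expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the view never reached the expected size
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for the cluster view: starts with only the local member.
        List<String> view = new CopyOnWriteArrayList<>();
        view.add("node2");

        // Simulated discovery: the existing member shows up in the view 100 ms later.
        Thread discovery = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            view.add("node1");
        });
        discovery.start();

        System.out.println("saw 2 members: " + waitForMembers(() -> view, 2, 2000, 10));
        System.out.println("saw 3 members: " + waitForMembers(() -> view, 3, 200, 10));
    }
}
```

The first call returns true once the simulated discovery adds node1; the second times out and returns false because a third member never appears.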
>>>>>>>>>> Event driven API:
>>>>>>>>>>
>>>>>>>>>> gms.addActionFactory( new JoinNotificationActionFactoryImpl(
>>>>>>>>>>         new JoinNotificationCallBack( serverName ) ) );
>>>>>>>>>> gms.join();
>>>>>>>>>> <wait till all expected instances have joined before sending
>>>>>>>>>> message;
>>>>>>>>>> use info calculated from JoinNotificationCallback>
>>>>>>>>>>
>>>>>>>>>> private class JoinNotificationCallBack implements CallBack {
>>>>>>>>>>
>>>>>>>>>>     private String serverName;
>>>>>>>>>>
>>>>>>>>>>     public JoinNotificationCallBack( String serverName ) {
>>>>>>>>>>         this.serverName = serverName;
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>     // called for every instance joining the gms group.
>>>>>>>>>>     public void processNotification( Signal notification ) {
>>>>>>>>>>         <record instance has joined>;
>>>>>>>>>>     }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> A non-coding way to check this out is to start all your receiving
>>>>>>>>>> clients first.
>>>>>>>>>> Wait 10 seconds (like your initial test).
>>>>>>>>>> Then start your sending gms client.
>>>>>>>>>> There is no need for a sleep between the join and send since all the
>>>>>>>>>> other
>>>>>>>>>> members
>>>>>>>>>> will have already joined. Hope this helps.
>>>>>>>>>>
>>>>>>>>>> -Joe
>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> -Jerry
>>>>>>>>>>>
>>>>>>>>>>> Jerry Raj wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>> I have code like this:
>>>>>>>>>>>> <snip>
>>>>>>>>>>>> gms.join();
>>>>>>>>>>>> // Commented: Thread.sleep(10000);
>>>>>>>>>>>> GroupHandle gh = gms.getGroupHandle();
>>>>>>>>>>>>
>>>>>>>>>>>> gh.sendMessage(blah);
>>>>>>>>>>>>
>>>>>>>>>>>> </snip>
>>>>>>>>>>>>
>>>>>>>>>>>> This used to work fine in Shoal 1.0. The node would join the group
>>>>>>>>>>>> and the message would be received by other members in the group.
>>>>>>>>>>>> But this does not happen with Shoal 1.1 unless I uncomment the
>>>>>>>>>>>> sleep(10000) between join() and send(). I expect this is because
>>>>>>>>>>>> the join operation has not completed successfully when send() is
>>>>>>>>>>>> called. Is there a way to be notified when join is complete? I
>>>>>>>>>>>> tried looking at JoinedAndReadyNotificationActionImpl but that
>>>>>>>>>>>> does not seem to work.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm using Shoal 1.1 from the download link on the front page of
>>>>>>>>>>>> the Shoal website.
>>>>>>>>>>>>
>>>>>>>>>>>> -Jerry
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> To unsubscribe, e-mail: users-unsubscribe_at_shoal.dev.java.net
>>>>>>>>>>> <mailto:users-unsubscribe_at_shoal.dev.java.net>
>>>>>>>>>>> <mailto:users-unsubscribe_at_shoal.dev.java.net>
>>>>>>>>>>> For additional commands, e-mail: users-help_at_shoal.dev.java.net
>>>>>>>>>>> <mailto:users-help_at_shoal.dev.java.net>
>>>>>>>>>>> <mailto:users-help_at_shoal.dev.java.net>