users@shoal.java.net

Re: [Shoal-Users] Re: send message after join

From: Jerry Raj <jerryr_at_sun.com>
Date: Mon, 22 Jun 2009 10:18:12 +0530

Joseph Fialli wrote:
> Bongjae Chang wrote:
>> Hi,
>>
>> Joe wrote:
>>
>>> It is not that join is not complete for the sender, the sender is
>>> definitely part of the group when join returns.
>>>
> The answer below assumes that Jerry is using the sendMessage that
> broadcasts a message to all instances of the
> gms group.
> Bongjae's comments reminded me that sending a message to all instances in
> the cluster is no longer a broadcast in Shoal 1.5, as it was in Shoal 1.0.
> So to answer Jerry's original question: that is what has changed between
> Shoal 1.0 and Shoal 1.5, and it is what he is noticing.
>
> GroupHandle.sendMessage(String targetComponent, byte[]) goes down a
> different code path in Shoal 1.5, one that is affected by the points
> Bongjae raises below. Instead of relying solely on a UDP broadcast, the
> message is sent to each instance one at a time (over TCP), based on the
> instances present in the ClusterViewManager's view. It takes much longer
> for an instance to appear in the cluster view than it does for a GMS
> client to "join" a multicast group (which is the question I originally
> answered for Jerry). I had overlooked the new broadcast mechanism added
> to Shoal 1.5 to increase the reliability of broadcasting.
>
> The following is a workaround to get the Shoal 1.0 behavior for sending
> messages (which is just a multicast broadcast).
>
> Use GroupHandle.sendMessage(String destinationName, String
> targetComponent, byte[] msg).
> Calling
>     gh.sendMessage((String)null, "SubstituteYourTargetComponentName", msg)
> will result in a UDP broadcast rather than going down the code path
> that was introduced in Shoal 1.5 for broadcasting to the cluster.
>
> I have verified this workaround by altering the MultiThreadedMessageSender
> test. With the first sendMessage() mentioned in this email, the first 928
> messages do not get sent between the two MultiThreadSendMessage tests.
> With the workaround, all messages get sent, even with the sleeps in
> MultiThreadSendMessage commented out.
>
> Please confirm that this addresses your issue. We will write up a shoal
> message sending test to
> capture this behavior and work on documenting this better.


Many thanks for everyone's help. This works perfectly. I simply changed the call
from
gh.sendMessage(topic, msg);
to
gh.sendMessage((String)null, topic, msg);

Now the message is reliably received by existing peers.
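
To make the difference between the two send paths concrete, here is a toy model in plain Java. It is only a sketch of the behavior Joe describes above, not Shoal code: ToyGroup, Peer, sendToView and broadcast are all made-up names, and the "view" is just a set that gets filled in some time after join.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these are illustrative types, not Shoal classes.
final class ToyGroup {

    /** A member that records every message delivered to it. */
    static final class Peer {
        final String name;
        final List<String> inbox = new ArrayList<>();

        Peer(String name) {
            this.name = name;
        }
    }

    /** Everyone already listening on the simulated multicast address. */
    final List<Peer> listeners = new ArrayList<>();

    /** The sender's local cluster view, populated some time after join. */
    final Set<Peer> senderView = new LinkedHashSet<>();

    /** Per-member path: one send per member currently in the sender's view. */
    int sendToView(String msg) {
        for (Peer p : senderView) {
            p.inbox.add(msg);
        }
        return senderView.size();
    }

    /** Broadcast path: a single send that reaches every listener. */
    int broadcast(String msg) {
        for (Peer p : listeners) {
            p.inbox.add(msg);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        ToyGroup group = new ToyGroup();
        Peer node1 = new Peer("node1");
        group.listeners.add(node1);          // node1 joined earlier

        // node2 has just joined; its view is still empty.
        System.out.println("via view:  " + group.sendToView("hello"));
        System.out.println("broadcast: " + group.broadcast("hello"));

        group.senderView.add(node1);         // the view catches up later
        System.out.println("via view after update: " + group.sendToView("hello again"));
    }
}
```

In this model the per-member send reaches nobody while the sender's view is empty, which matches what I was seeing at t3, while the broadcast reaches node1 regardless of the view.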

Is this the recommended way, though? Because, if I understand correctly, the
reason to go from UDP multicast to individual TCP sends is to increase
reliability? Are we losing some reliability by going back to UDP?

I guess what we really need is either a callback or a method in gms to figure
out whether the clusterview formation is complete?
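
As a rough idea of what such a check could look like on the application side, here is a minimal sketch. ViewGate and awaitAnyPeer are hypothetical names, not part of GMS. The member list is passed in as a Supplier; wiring it to gms.getGroupHandle().getAllCurrentMembers() would be one option, but whether that list reflects the same view that sendMessage() uses is an assumption I have not verified.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Shoal.
final class ViewGate {

    /**
     * Polls the supplied member list until it contains someone other than
     * selfName, or until the timeout expires.
     *
     * @return true if another member became visible, false on timeout or interrupt
     */
    static boolean awaitAnyPeer(Supplier<List<String>> members, String selfName,
                                long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            for (String member : members.get()) {
                if (!member.equals(selfName)) {
                    return true;
                }
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(20);            // short pause between polls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated view: starts with only ourselves, gains node1 a bit later.
        List<String> view = new CopyOnWriteArrayList<>();
        view.add("node2");
        Thread updater = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            view.add("node1");
        });
        updater.start();

        boolean ready = awaitAnyPeer(() -> view, "node2", 2, TimeUnit.SECONDS);
        System.out.println(ready ? "peer visible, sending" : "timed out, no peer visible");
    }
}
```

Since I do not know how many peers to expect, the gate only waits for "at least one other member", and the timeout covers the case where this node is the first one in the group. It is still polling, so a real callback from GMS would be cleaner.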

Thanks
-Jerry


>
> -Joe Fialli
>
>>
>> I have a little different opinion.
>>
>> I think that the discovery should also be finished for completing join.
>> "Join" logic includes a lot of initializations which include
>> MasterNode and ClusterViewManager's init.
>>
>> When gms.join() is called, MasterNode's init will be executed in
>> separate thread and MasterNode's init starts ClusterViewManager.
>>
>> ClusterViewManager could be related to GroupHandle#sendMessage().
>>
>> What do you think, Joe?
>>
>> And Jerry,
>>
>>
>>>> Do you see a view change with the two members in it ?
>>>>
>>> Yes, I do.
>>>
>>
>> Though you could see a view change with the two members, if you send a
>> message at that time, I think that the message could be lost. I am not
>> sure.
>>
>> But if you set the Shoal logger to Level.FINE, I think the cause of the
>> problem could be found.
>>
>> First, though, you should check whether join is complete or not.
>>
>> I have sometimes used the following trick for this:
>> GroupManagementService#reportJoinedAndReadyState() returns only
>> after the discovery time.
>>
>> So I am curious to know the following test's result.
>>
>> <snip>
>> gms.join();
>>
>> gms.reportJoinedAndReadyState();
>>
>> // Commented: Thread.sleep(10000);
>> GroupHandle gh = gms.getGroupHandle();
>> gh.sendMessage(blah);
>> </snip>
>>
>> Could you please test it again?
>>
>> Thanks.
>>
>> --
>> Bongjae Chang
>>
>>
>> ----- Original Message ----- From: "Jerry Raj" <jerryr_at_sun.com>
>> To: <users_at_shoal.dev.java.net>
>> Sent: Friday, June 19, 2009 7:46 PM
>> Subject: Re: [Shoal-Users] Re: send message after join
>>
>>
>>
>>> Shreedhar Ganapathy wrote:
>>>
>>>>>> Do you see a view change with the two members in it ?
>>>>>>
>>>>> Yes, I do.
>>>>>
>>>>>> If your send() is around the same time as the time the view change
>>>>>> happens then you will have message loss.
>>>>>>
>>>>> Ah. This must be it. Since the join and send happen immediately one
>>>>> after the other, it's quite likely that the send happens while the
>>>>> existing peers are processing the view change.
>>>>>
>>>>>
>>>>>> Also, with Shoal 1.1 we have made a lot of synchronization
>>>>>> improvements for correctness, which may have cost somewhat at the
>>>>>> time of startup of members.
>>>>>>
>>>>>> Could you try out the code snippet that Joe provided as a way to
>>>>>> ensure that message sending happens only when the requisite
>>>>>> memberships in the group are in place?
>>>>>>
>>>>> I have no pre-existing knowledge of the number of expected peers, so
>>>>> waiting for "all" of them to join has no real meaning for me. The idea
>>>>> here is that peers can join and leave as they please. The message they
>>>>> send as soon as they join is used by existing peers (if any) to gain
>>>>> certain data about the new peer (its IP address, the port it's
>>>>> listening on, etc.). So the code from below will not really work for
>>>>> me, since I have no idea if any more peers will join or not.
>>>>>
>>>>> I can live with having a sleep between join and send for now, except
>>>>> it seems rather non-deterministic: are we always sure that 5 secs is
>>>>> enough? Under load will this go up? A clear callback or signal that
>>>>> says "It is now safe to send messages" would be much better.
>>>>>
>>>> Timing is always a challenge when distributed systems are involved,
>>>> and the larger the number of members, the more involved it gets to ensure
>>>> virtual synchrony, especially on asynchronous systems. We could
>>>> eventually look at an ack based system for membership lifecycle events'
>>>> view change messages, but there are costs with that as well when
>>>> memberships are large.
>>>>
>>>> One suggestion that may make it a bit better is that you could use the
>>>> joined and ready construct for letting the group know that you are ready
>>>> to receive and send messages, i.e., when a member has connected to the
>>>> group you get a join notification signal. After this when the member is
>>>> ready to send messages, it can call the reportJoinedAndReadyState() API
>>>> from the GroupManagementService reference
>>>> http://fisheye5.cenqua.com/browse/~raw,r=1.16/shoal/gms/src/java/com/sun/enterprise/ee/cms/core/GroupManagementService.java
>>>>
>>>>
>>>> Members who have registered for receiving the joined and ready
>>>> notification signal and have joined the group would get that
>>>> notification and then can send out messages to that member.
>>>>
>>>> Hope this helps you get a bit closer to deterministic knowledge of when
>>>> a member (or set of members) is ready to receive messages.
>>>>
>>> This is the opposite of my problem. In my case, I have two nodes, and the
>>> following is the order of execution:
>>>
>>> Node1:
>>> t0: join()
>>> t1: send() --> goes nowhere, as expected
>>>
>>> Node2:
>>> t2: join()
>>> t3: send() --> goes nowhere, not expected
>>>
>>> In this case Node1 can register for a joined and ready notification, but
>>> it will not help, since Node1 is not going to send anything.
>>>
>>> -Jerry
>>>
>>>
>>>
>>>
>>>>> -Jerry
>>>>>
>>>>>
>>>>>> Thanks
>>>>>> Shreedhar
>>>>>>
>>>>>>
>>>>>>> The same identical code worked fine with Shoal 1.0.
>>>>>>>
>>>>>>> I hope this clarifies the use-case.
>>>>>>> -Jerry
>>>>>>>
>>>>>>>
>>>>>>>> However, there is not enough information in your original post to
>>>>>>>> confirm or deny this.
>>>>>>>>
>>>>>>>> You could delay sending a message to the group until a certain
>>>>>>>> number of members have joined, or you could wait for all members
>>>>>>>> to have joined via the JoinNotification event.
>>>>>>>>
>>>>>>>> Pull API:
>>>>>>>>
>>>>>>>> List<String> members = gms.getGroupHandle().getAllCurrentMembers();
>>>>>>>> <wait to send first message until all expected number of members
>>>>>>>> have
>>>>>>>> joined>
>>>>>>>>
>>>>>>>> Event driven API:
>>>>>>>>
>>>>>>>> gms.addActionFactory( new JoinNotificationActionFactoryImpl( new
>>>>>>>> JoinNotificationCallBack( serverName ) ) );
>>>>>>>> gms.join();
>>>>>>>> <wait till all expected instances have joined before sending
>>>>>>>> message;
>>>>>>>> use info calculated from JoinNotificationCallback>
>>>>>>>>
>>>>>>>> private class JoinNotificationCallBack implements CallBack {
>>>>>>>>
>>>>>>>> private String serverName;
>>>>>>>>
>>>>>>>> public JoinNotificationCallBack( String serverName ) {
>>>>>>>> this.serverName = serverName;
>>>>>>>> }
>>>>>>>>
>>>>>>>> // called for every instance joining the gms group.
>>>>>>>> public void processNotification( Signal notification ) {
>>>>>>>> <record instance has joined>;
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> A non-coding way to check this out is to start all your receiving
>>>>>>>> clients first.
>>>>>>>> Wait 10 seconds (like your initial test).
>>>>>>>> Then start your sending gms client.
>>>>>>>> There is no need for a sleep between the join and the send since all
>>>>>>>> the other members will have already joined. Hope this helps.
>>>>>>>>
>>>>>>>> -Joe
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> -Jerry
>>>>>>>>>
>>>>>>>>> Jerry Raj wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>> I have code like this:
>>>>>>>>>> <snip>
>>>>>>>>>> gms.join();
>>>>>>>>>> // Commented: Thread.sleep(10000);
>>>>>>>>>> GroupHandle gh = gms.getGroupHandle();
>>>>>>>>>>
>>>>>>>>>> gh.sendMessage(blah);
>>>>>>>>>>
>>>>>>>>>> </snip>
>>>>>>>>>>
>>>>>>>>>> This used to work fine in Shoal 1.0. The node would join the group
>>>>>>>>>> and the message would be received by other members in the group.
>>>>>>>>>> But this does not happen with Shoal 1.1 unless I uncomment the
>>>>>>>>>> sleep(10000) between join() and send(). I expect this is because
>>>>>>>>>> the join operation has not completed successfully when send() is
>>>>>>>>>> called. Is there a way to be notified when join is complete? I
>>>>>>>>>> tried looking at JoinedAndReadyNotificationActionImpl but that
>>>>>>>>>> does not seem to work.
>>>>>>>>>>
>>>>>>>>>> I'm using Shoal 1.1 from the download link on the front page of the
>>>>>>>>>> Shoal website.
>>>>>>>>>>
>>>>>>>>>> -Jerry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> To unsubscribe, e-mail: users-unsubscribe_at_shoal.dev.java.net
>>>>>>>>> <mailto:users-unsubscribe_at_shoal.dev.java.net>
>>>>>>>>> For additional commands, e-mail: users-help_at_shoal.dev.java.net
>>>>>>>>> <mailto:users-help_at_shoal.dev.java.net>
>>>>>>>>>