users@shoal.java.net

Re: sendMessage is blocked for few minutes after some node leave cluster

From: Joseph Fialli <joe.fialli_at_oracle.com>
Date: Fri, 17 Jun 2011 16:15:09 -0400

  Stankovic,

I have a recommendation that will work around the problem that you are
hitting.

Instead of using the following GroupHandle method to send to all instances:

GroupHandle.sendMessage(String componentName, final byte[] message)

Use the following invocation:

GroupHandle.sendMessage(String targetServerToken, String componentName,
final byte[] message).


Thus, in your existing code, there is a call that is similar to the following:

GroupHandle gf = ...;
gf.sendMessage("yourApplicationComponentName", yourAppMsg);

Change that to the following:

gf.sendMessage(null, "yourApplicationComponentName", yourAppMsg);


With this call, you will be using UDP multicast to broadcast the monitor
message to all members of the GMS group.
The simulated broadcast used by the two-argument method is mistakenly trying
to send to the instance that left, and your code is waiting for a timeout.
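
As a rough sketch of the change described above: the class below wraps the
three-argument call so that the monitor always passes null as the target
token. The Sender interface, the MonitorBroadcast class and the broadcast
method are hypothetical names introduced only for illustration; Sender
stands in for GroupHandle so the snippet compiles without the Shoal jar.

```java
import java.nio.charset.StandardCharsets;

public class MonitorBroadcast {

    /**
     * Hypothetical stand-in that mirrors the three-argument
     * GroupHandle.sendMessage signature quoted above.
     */
    public interface Sender {
        void sendMessage(String targetServerToken, String componentName, byte[] message)
                throws Exception;
    }

    // Component name taken from the example call in this message.
    public static final String COMPONENT = "yourApplicationComponentName";

    /**
     * Sends one monitor message to the whole group by passing null as the
     * target token. Returns false instead of propagating a send failure,
     * so that a failed send does not stop the caller's monitoring loop.
     */
    public static boolean broadcast(Sender sender, String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try {
            sender.sendMessage(null, COMPONENT, payload);
            return true;
        } catch (Exception e) {
            System.err.println("monitor message not sent: " + e);
            return false;
        }
    }
}
```

Assuming gf is your existing GroupHandle, a call such as
MonitorBroadcast.broadcast(gf::sendMessage, "heartbeat") should then go
through the three-argument overload.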

You are using an old version of Shoal GMS that was implemented on top of
JXTA and was used in
GlassFish 2.x. We recommend that you use the newer version, 1.5.29,
which is used in GlassFish 3.1 and works over Grizzly 1.9.
You can download this newer version from the following location:

http://shoal.java.net/downloadsindex.html

-Joe Fialli, Oracle Corporation

On 6/17/11 5:44 AM, Stankovic Miroslav wrote:
> Hi,
>
> here are some jstacks before and after node leaves cluster
>
> Best regards
>
> On Thu, Jun 16, 2011 at 9:25 PM, Shreedhar Ganapathy
> <shreedhar.ganapathy_at_oracle.com> wrote:
>
>
>
> On 6/16/11 7:42 AM, smiki975_at_gmail.com wrote:
>
> I have a cluster where nodes communicate using Shoal GMS.
> Everything works fine (multicast works) until some node leaves the
> cluster (I just kill the process).
> I have a monitoring system which needs to send one Shoal message
> every 5 seconds. When a node leaves the group and the view is about to
> change, the sendMessage method becomes blocked for 3-5 minutes.
> If I keep sending messages, each call to sendMessage gets blocked.
> After 3-5 minutes everything works again.
>
> My application is running under CentOS Linux, using a Jetty server.
>
> Can anybody tell me why this happens?
>
> There could be a number of reasons. Could you take a jstack before
> a node leaves and a couple of jstacks after the node leaves? Space
> them a few minutes apart.
>
>
> Is this regular behavior for Shoal, or do I need to do something to
> fix this long pause in sending messages?
>
> This is not expected behavior - but a number of possibilities
> could cause this including bugs. :-)
> Send us the info and we will be happy to take a look-see.
>
> Shreedhar
>
>