users@shoal.java.net

Re: sendMessage is blocked for few minutes after some node leave cluster

From: Shreedhar Ganapathy <shreedhar.ganapathy_at_oracle.com>
Date: Thu, 16 Jun 2011 12:25:08 -0700

On 6/16/11 7:42 AM, smiki975_at_gmail.com wrote:
> I have a cluster whose nodes communicate using Shoal GMS.
> Everything works fine (multicast works) until a node leaves the cluster (I
> just kill the process).
> I have a monitoring system that needs to send one Shoal message every
> 5 seconds. When a node leaves the group and the view is about to change,
> the sendMessage method blocks for 3-5 minutes. If I keep
> sending messages, each call to sendMessage blocks.
> After 3-5 minutes everything works again.
>
> My application runs on CentOS Linux, using the Jetty server.
>
> Can anybody tell me why this happens?
There could be a number of reasons. Could you take a jstack before a
node leaves and a couple of jstacks after the node leaves? Space them a
few minutes apart.
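
If running jstack by hand against the Jetty process is awkward, roughly
the same information can be collected from inside the JVM. The sketch
below is only an illustration and assumes a standard JVM with
java.lang.management available; the class name ThreadDumpSketch and its
capture() method are made up for this example and are not part of Shoal.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (not part of Shoal): builds a jstack-like text
// snapshot of all live threads in the current JVM.
public class ThreadDumpSketch {

    // Returns one text block per thread: name, state, the lock it is
    // waiting on (if any), and its stack frames.
    public static String capture() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only request lock details the JVM says it can provide;
        // passing true for an unsupported option would throw.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                sb.append("    waiting on ").append(info.getLockName());
                if (info.getLockOwnerName() != null) {
                    sb.append(" held by \"")
                      .append(info.getLockOwnerName()).append('"');
                }
                sb.append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print a single snapshot; in practice, call capture() once
        // before the node is killed and a few times afterwards.
        System.out.println(capture());
    }
}
```

Calling capture() once before the node is killed and a few more times
while sendMessage is stuck should show which thread is blocked and on
which lock.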

> Is this normal behavior for Shoal, or do I need to do something to fix
> this long pause in sending messages?
This is not expected behavior, but a number of possibilities could
cause this, including bugs. :-)
Send us the info and we will be happy to take a look.
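
It would also help to know how long each individual send stays blocked.
One possible way to measure that without stalling the monitoring loop is
to run the send on a worker thread and wait for it with a timeout. This
is a rough sketch, not a fix: TimedSendSketch and timedSend are
hypothetical names, and the Runnable stands in for whatever code
currently calls sendMessage.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Shoal): runs a possibly blocking send
// on a worker thread so the caller can time it and give up after a limit.
public class TimedSendSketch {

    // One worker thread: if a send stays blocked, later sends queue up
    // behind it and will time out as well.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Returns the elapsed milliseconds if the send finished within the
    // timeout, or -1 if it was still running when the timeout expired.
    public long timedSend(Runnable send, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        long start = System.nanoTime();
        Future<?> result = worker.submit(send);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (TimeoutException e) {
            // Requests an interrupt; a call blocked inside a library may
            // ignore it and keep the worker thread occupied.
            result.cancel(true);
            return -1;
        }
    }

    public void shutdown() {
        worker.shutdownNow();
    }
}
```

Logging the returned value for each 5-second tick, together with the
time the node was killed, would show exactly when the blocking starts
and stops.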

Shreedhar