users@shoal.java.net

RE: [Shoal-Users] Still not sure it's working

From: Mike Wannamaker <mwannama_at_opentext.com>
Date: Mon, 30 Jun 2008 14:27:34 -0400

Okay, I tested shutting down a member that is not the group leader. I do
see the suspect and failure notifications.

 

However, you might not like this; I also see something that is very
strange and disturbing.

 

I start SERVER-1 (GROUPLEADER), SERVER-2, and SERVER-3.

 

I shut down SERVER-3 and get the correct messages in SERVER-1 and mostly
in SERVER-2, but I also get a FailureSuspected signal for SERVER-1 in the
SERVER-2 window.

This might be okay if I then got a notification that the node was back,
but I don't, even though it is still running. I then started SERVER-3
again and I see SERVER-1 in the list, and it gets notifications as well.

 

I tried again, shutting down the newly started SERVER-3, and I get the
same results, so it seems fully reproducible.

Here is the output for SERVER-2:

 

30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group RCS_CLUSTER : Members in view for (before change analysis) are :
1: MemberId: SERVER-2, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
2: MemberId: SERVER-3, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD54C54AB0D7A640E493A5C6CE427A3CE203
3: MemberId: SERVER-1, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603

30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : IN_DOUBT_EVENT

30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow addInDoubtMemberSignals
INFO: gms.failureSuspectedEventReceived

30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.common.Router notifyFailureSuspectedAction
INFO: Sending FailureSuspectedSignals to registered Actions. Member:SERVER-3...

30-Jun-2008 02:16:57 PM DEBUG [pool-1-thread-4] com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - - SERVER-3 >> FailureSuspectedSignalImpl @ 30/06/08 2:00 PM - [RCS_CLUSTER-false]: (Hashtable:[(String:server.name)<-->(String:SERVER-3), (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])
MEMBERS: (ArrayList:[mwana0061/10.6.2.89, mwana0061/10.6.2.89, mwana0061/10.6.2.89])

30-Jun-2008 2:16:57 PM com.sun.enterprise.jxtamgmt.HealthMonitor isConnected
INFO: Checking for machine status for network interface : tcp://10.6.2.89:9701

30-Jun-2008 2:16:57 PM com.sun.enterprise.jxtamgmt.HealthMonitor isConnected
INFO: Checking for machine status for network interface : tcp://192.168.111.1:9701

30-Jun-2008 2:16:57 PM com.sun.enterprise.jxtamgmt.HealthMonitor isConnected
INFO: Checking for machine status for network interface : tcp://192.168.138.1:9701

30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group RCS_CLUSTER : Members in view for (before change analysis) are :
1: MemberId: SERVER-2, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
2: MemberId: SERVER-3, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD54C54AB0D7A640E493A5C6CE427A3CE203
3: MemberId: SERVER-1, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603

30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : IN_DOUBT_EVENT

30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow addInDoubtMemberSignals
INFO: gms.failureSuspectedEventReceived

30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.common.Router notifyFailureSuspectedAction
INFO: Sending FailureSuspectedSignals to registered Actions. Member:SERVER-1...

30-Jun-2008 02:17:27 PM DEBUG [pool-1-thread-4] com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - - SERVER-1 >> FailureSuspectedSignalImpl @ 30/06/08 1:59 PM - [RCS_CLUSTER-false]: (Hashtable:[(String:server.name)<-->(String:SERVER-1), (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])
MEMBERS: (ArrayList:[mwana0061/10.6.2.89, mwana0061/10.6.2.89, mwana0061/10.6.2.89])

30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
INFO: GMS View Change Received for group RCS_CLUSTER : Members in view for (before change analysis) are :
1: MemberId: SERVER-2, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
2: MemberId: SERVER-1, MemberType: CORE, Address: urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603

30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event : FAILURE_EVENT

30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow addFailureSignals
INFO: The following member has failed: SERVER-3

30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.common.Router notifyFailureNotificationAction
INFO: Sending FailureNotificationSignals to registered Actions. Member: SERVER-3...

30-Jun-2008 02:17:30 PM DEBUG [pool-1-thread-4] com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - - SERVER-3 >> FailureNotificationSignalImpl @ 30/06/08 2:00 PM - [RCS_CLUSTER-false]: (Hashtable:[(String:server.name)<-->(String:SERVER-3), (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])SERVER-3
MEMBERS: (ArrayList:[mwana0061/10.6.2.89, mwana0061/10.6.2.89])
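
For anyone who wants to check their own output for the same pattern, here
is a rough sketch of how the Router lines above could be scanned for
members that were suspected but never followed by a failure notification.
It does not use the Shoal API at all: the class and method names
(SuspectTracker, unresolvedSuspects) are made up for illustration, and it
assumes the only thing that clears a suspect is a later
FailureNotificationSignals line, since this output shows no message for a
member coming back.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Shoal: scans pasted log text only.
public class SuspectTracker {

    // Matches the two Router messages seen in the output above and
    // captures the signal kind and the member name.
    private static final Pattern SIGNAL = Pattern.compile(
        "Sending (FailureSuspected|FailureNotification)Signals"
        + " to registered Actions\\.\\s*Member:\\s*([A-Za-z0-9_-]+)");

    // Returns members that were suspected and never reported as failed.
    public static Set<String> unresolvedSuspects(List<String> logLines) {
        // Join the lines first so messages wrapped across lines still match.
        String joined = String.join(" ", logLines);
        Set<String> suspected = new LinkedHashSet<>();
        Matcher m = SIGNAL.matcher(joined);
        while (m.find()) {
            String kind = m.group(1);
            String member = m.group(2);
            if (kind.equals("FailureSuspected")) {
                suspected.add(member);
            } else {
                // Assumption: a failure notification resolves the suspicion.
                suspected.remove(member);
            }
        }
        return suspected;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "INFO: Sending FailureSuspectedSignals to registered Actions.",
            "Member:SERVER-3...",
            "INFO: Sending FailureSuspectedSignals to registered Actions.",
            "Member:SERVER-1...",
            "INFO: Sending FailureNotificationSignals to registered Actions. Member:",
            "SERVER-3...");
        System.out.println(unresolvedSuspects(log)); // prints [SERVER-1]
    }
}
```

Run against the three Router lines from the SERVER-2 output, this leaves
only SERVER-1, which is the member that is suspected but still running.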

 

________________________________

From: Shreedhar.Ganapathy_at_Sun.COM [mailto:Shreedhar.Ganapathy_at_Sun.COM]
Sent: June 30, 2008 2:07 PM
To: users_at_shoal.dev.java.net
Subject: Re: [Shoal-Users] Still not sure it's working

 

That's correct. Yes, I should not mix up the provider terminology with
the GMS terminology.
Thanks
Shreedhar

Mike Wannamaker wrote:

When you say a non-master, do you mean when a server that is not the
group leader is shut down?

 

________________________________

From: Shreedhar.Ganapathy_at_Sun.COM [mailto:Shreedhar.Ganapathy_at_Sun.COM]
Sent: June 30, 2008 1:47 PM
To: users_at_shoal.dev.java.net
Subject: Re: [Shoal-Users] Still not sure it's working

 

Hi Mike,
This is a recent known issue that occurs when a master failure occurs. I
don't see a Shoal issue on this yet, but our QE has filed an internal
issue on this behavior. I will post an issue in the Shoal tracker later
today with your details.

Can you confirm whether the behavior is okay when a non-master member fails?

Thanks
Shreedhar