Hi Mike
Yes, this is indeed a new problem. I hope these are not separate
snippets but one continuous log excerpt. What seems strange in the pasted
output is that there is no failure-suspected signal (in-doubt event) for
SERVER-3. Is that what you see? There is a suspect event for SERVER-1.
Some questions: Are all instances on the same machine? The interface
addresses don't all seem to be in the same subnet, or else the machine is
multihomed with interfaces on different networks (I see 10.6.2.89,
192.168.111.1 and 192.168.138.1).
Are all instances started concurrently?
Do you have any antivirus software or firewalls running on your
machine(s)? If so, can you disable them and see whether communication and
events work correctly?
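
On the subnet question, here is a minimal sketch for checking what the JVM
sees on that machine. It only uses the standard java.net.NetworkInterface
API; the class name InterfaceCheck and the /24 prefix length in main are
assumptions for illustration, not anything taken from Shoal or from your
configuration.

```java
import java.io.UncheckedIOException;
import java.net.InterfaceAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper class, not part of Shoal.
public class InterfaceCheck {

    /** Returns "name address/prefixLength" for each address on each interface that is up. */
    public static List<String> listAddresses() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces reported
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp()) {
                    continue;
                }
                for (InterfaceAddress ia : nic.getInterfaceAddresses()) {
                    result.add(nic.getName() + " " + ia.getAddress().getHostAddress()
                            + "/" + ia.getNetworkPrefixLength());
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** True if two dotted-quad IPv4 addresses share their first prefixLength bits. */
    public static boolean sameSubnet(String a, String b, int prefixLength) {
        // A shift by 32 is a no-op for int in Java, so handle prefix 0 separately.
        int mask = prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
        return (toInt(a) & mask) == (toInt(b) & mask);
    }

    private static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            value = (value << 8) | (Integer.parseInt(part) & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        for (String line : listAddresses()) {
            System.out.println(line);
        }
        // Addresses from the pasted log; the /24 prefix is an assumed value.
        System.out.println(sameSubnet("192.168.111.1", "192.168.138.1", 24));
        System.out.println(sameSubnet("10.6.2.89", "192.168.111.1", 24));
    }
}
```

Running it on the machine in question should show the actual prefix length
of each interface, which can then be used in place of the assumed /24.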
Thanks
Shreedhar
Mike Wannamaker wrote:
>
> Okay, I tested shutting down a non-groupleader. I do see suspect
> and failure notifications.
>
>
>
> However, you might not like this; I also see something that is very
> strange and disturbing.
>
>
>
> I start SERVER-1 (GROUPLEADER), SERVER-2, and SERVER-3.
>
>
>
> I shut down SERVER-3 and get the correct messages in SERVER-1 and mostly
> in SERVER-2, but I also get a FailureSuspect for SERVER-1 in the SERVER-2
> window.
>
> This might be okay if I got a notification that the node was back, but
> I don't, and it is still running. I then started SERVER-3 and I see
> SERVER-1 in the list, and it gets notifications as well.
>
>
>
> I tried again, shutting down the newly running SERVER-3, and I get the
> same results, so it seems fully reproducible.
>
> Here is the output for SERVER-2
>
>
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> getMemberTokens
>
> INFO: GMS View Change Received for group RCS_CLUSTER : Members in view
> for (before change analysis) are :
>
> 1: MemberId: SERVER-2, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
>
> 2: MemberId: SERVER-3, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD54C54AB0D7A640E493A5C6CE427A3CE203
>
> 3: MemberId: SERVER-1, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603
>
>
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> newViewObserved
>
> INFO: Analyzing new membership snapshot received as part of event :
> IN_DOUBT_EVENT
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> addInDoubtMemberSignals
>
> INFO: gms.failureSuspectedEventReceived
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.ee.cms.impl.common.Router
> notifyFailureSuspectedAction
>
> INFO: Sending FailureSuspectedSignals to registered Actions.
> Member:SERVER-3...
>
> 30-Jun-2008 02:16:57 PM DEBUG [pool-1-thread-4]
> com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - -
> SERVER-3 >> FailureSuspectedSignalImpl @ 30/06/08 2:00 PM -
> [RCS_CLUSTER-false]:
> (Hashtable:[(String:server.name)<-->(String:SERVER-3),
> (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])
>
> MEMBERS: (ArrayList:[mwana0061/10.6.2.89, mwana0061/10.6.2.89,
> mwana0061/10.6.2.89])
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.jxtamgmt.HealthMonitor
> isConnected
>
> INFO: Checking for machine status for network interface :
> tcp://10.6.2.89:9701
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.jxtamgmt.HealthMonitor
> isConnected
>
> INFO: Checking for machine status for network interface :
> tcp://192.168.111.1:9701
>
> 30-Jun-2008 2:16:57 PM com.sun.enterprise.jxtamgmt.HealthMonitor
> isConnected
>
> INFO: Checking for machine status for network interface :
> tcp://192.168.138.1:9701
>
> 30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> getMemberTokens
>
> INFO: GMS View Change Received for group RCS_CLUSTER : Members in view
> for (before change analysis) are :
>
> 1: MemberId: SERVER-2, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
>
> 2: MemberId: SERVER-3, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD54C54AB0D7A640E493A5C6CE427A3CE203
>
> 3: MemberId: SERVER-1, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603
>
>
>
> 30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> newViewObserved
>
> INFO: Analyzing new membership snapshot received as part of event :
> IN_DOUBT_EVENT
>
> 30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> addInDoubtMemberSignals
>
> INFO: gms.failureSuspectedEventReceived
>
> 30-Jun-2008 2:17:27 PM com.sun.enterprise.ee.cms.impl.common.Router
> notifyFailureSuspectedAction
>
> INFO: Sending FailureSuspectedSignals to registered Actions.
> Member:SERVER-1...
>
> 30-Jun-2008 02:17:27 PM DEBUG [pool-1-thread-4]
> com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - -
> SERVER-1 >> FailureSuspectedSignalImpl @ 30/06/08 1:59 PM -
> [RCS_CLUSTER-false]:
> (Hashtable:[(String:server.name)<-->(String:SERVER-1),
> (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])
>
> MEMBERS: (ArrayList:[mwana0061/10.6.2.89, mwana0061/10.6.2.89,
> mwana0061/10.6.2.89])
>
> 30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> getMemberTokens
>
> INFO: GMS View Change Received for group RCS_CLUSTER : Members in view
> for (before change analysis) are :
>
> 1: MemberId: SERVER-2, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FD0D4B867250FF460C9B539A161779845B03
>
> 2: MemberId: SERVER-1, MemberType: CORE, Address:
> urn:jxta:uuid-2F39FF376B6A43E3905DAFC81B7D02FDB946A28335F0413BBF73B77CCC8BFEC603
>
>
>
> 30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> newViewObserved
>
> INFO: Analyzing new membership snapshot received as part of event :
> FAILURE_EVENT
>
> 30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
> addFailureSignals
>
> INFO: The following member has failed: SERVER-3
>
> 30-Jun-2008 2:17:30 PM com.sun.enterprise.ee.cms.impl.common.Router
> notifyFailureNotificationAction
>
> INFO: Sending FailureNotificationSignals to registered Actions.
> Member: SERVER-3...
>
> 30-Jun-2008 02:17:30 PM DEBUG [pool-1-thread-4]
> com.opentext.ecm.services.smessage.impl.shoal.SignalLogger - -
> SERVER-3 >> FailureNotificationSignalImpl @ 30/06/08 2:00 PM -
> [RCS_CLUSTER-false]:
> (Hashtable:[(String:server.name)<-->(String:SERVER-3),
> (String:local.host)<-->(Inet4Address:mwana0061/10.6.2.89)])SERVER-3
>
> MEMBERS: (ArrayList:[mwana0061/10.6.2.89, mwana0061/10.6.2.89])
>
>
>
> ------------------------------------------------------------------------
>
> *From:* Shreedhar.Ganapathy_at_Sun.COM [mailto:Shreedhar.Ganapathy_at_Sun.COM]
> *Sent:* June 30, 2008 2:07 PM
> *To:* users_at_shoal.dev.java.net
> *Subject:* Re: [Shoal-Users] Still not sure it's working
>
>
>
> That's correct. Yes, I should not mix up the provider terminology with
> the GMS terminology.
> Thanks
> Shreedhar
>
> Mike Wannamaker wrote:
>
> When you say a non-master, do you mean when a server that is not the
> groupleader is shut down?
>
>
>
> ------------------------------------------------------------------------
>
> *From:* Shreedhar.Ganapathy_at_Sun.COM [mailto:Shreedhar.Ganapathy_at_Sun.COM]
> *Sent:* June 30, 2008 1:47 PM
> *To:* users_at_shoal.dev.java.net
> *Subject:* Re: [Shoal-Users] Still not sure it's working
>
>
>
> Hi Mike
> This is a recent known issue that occurs when the master fails. I
> don't see an issue for it in the Shoal tracker yet, but our QE has filed
> an internal issue on this behavior. I will post an issue in the Shoal
> tracker later today with your details.
>
> Can you confirm whether the behavior is okay when a non-master member fails?
>
> Thanks
> Shreedhar
>
>
>
>