Joseph,
Thanks for the reply - I appreciate your time and look forward to
further comments.
One thing to point out: it is quite possible that a peer has failed
during startup. I am running a cluster with anywhere up to 20 peers, so
there are plenty of opportunities for nodes to die on startup.
Cheers
Cameron
Joseph Fialli wrote:
> Cameron,
>
> Thanks for reporting this issue that you have encountered with Shoal.
>
> I have some very preliminary comments inlined below to help answer your
> questions.
> I will follow up with a more detailed analysis in another email.
>
> -Joe
>
>
> Cameron Rochester wrote:
>> Hi all,
>>
>> I have been using Shoal for some time now and have discovered an issue
>> that crops up every now and then.
>>
>> Basically, when doing a DSC update to all peers I am seeing the
>> following exception:
>>
>> WARNING: ClusterManager.send : sending of message
>> net.jxta.endpoint.Message_at_11882231(2){270} failed. Unable to create an
>> OutputPipe for
>> urn:jxta:uuid-59616261646162614A787461503250335FDDDB9470DA4390A3E692268159961303
>> route = null
>> java.io.IOException: Unable to create a messenger to
>> jxta://uuid-59616261646162614A787461503250335FDDDB9470DA4390A3E692268159961303/PipeService/urn:jxta:uuid-63B5938B46F147609C1C998286EA5F3B6E0638B5DF604AEEAC09A3FAE829FBE804
>>
>> at net.jxta.impl.pipe.BlockingWireOutputPipe.checkMessenger(BlockingWireOutputPipe.java:238)
>> at net.jxta.impl.pipe.BlockingWireOutputPipe.<init>(BlockingWireOutputPipe.java:154)
>> at net.jxta.impl.pipe.BlockingWireOutputPipe.<init>(BlockingWireOutputPipe.java:135)
>> at net.jxta.impl.pipe.PipeServiceImpl.createOutputPipe(PipeServiceImpl.java:503)
>> at net.jxta.impl.pipe.PipeServiceImpl.createOutputPipe(PipeServiceImpl.java:435)
>> at net.jxta.impl.pipe.PipeServiceInterface.createOutputPipe(PipeServiceInterface.java:170)
>> at com.sun.enterprise.jxtamgmt.ClusterManager.send(ClusterManager.java:505)
>> at com.sun.enterprise.ee.cms.impl.jxta.GroupCommunicationProviderImpl.sendMessage(GroupCommunicationProviderImpl.java:254)
>> at com.sun.enterprise.ee.cms.impl.jxta.DistributedStateCacheImpl.sendMessage(DistributedStateCacheImpl.java:500)
>> at com.sun.enterprise.ee.cms.impl.jxta.DistributedStateCacheImpl.addToRemoteCache(DistributedStateCacheImpl.java:234)
>>
>>
>> This led me to the HealthMonitor and ClusterViewManager, and I found
>> the following:
>>
>> 1) The HealthMonitor does not seem to get a list of advertisements
>> from the ClusterViewManager to monitor. As far as I can tell they are
>> built up via heartbeat messages.
> That is by design. The ClusterViewManager list is built by receiving a
> ClusterViewEvent from the master.
> This is a subtle point when first looking at the Shoal code, but a very
> important one that I believe you may be overlooking.
> The HealthMonitor's responsibilities on the master are quite different
> from its responsibilities on all other, non-master members of the cluster.
>
> Only the master's HealthMonitor makes decisions on which members are
> ALIVE, SUSPECT or DEAD.
> The HealthMonitors of all other, non-master members watch the MASTER to
> check that it is still ALIVE; if it is not, they take part in selecting
> a new master.
>
> There is always a delay between when an instance fails and when the
> failure is detected by the Shoal master.
> (With the default Shoal configuration, heartbeat-based failure detection
> of a member takes very roughly 7 to 9 seconds.)
> Failures like the one you report above can happen during that window.
> (The WARNING log level is too strong for those cases.)
> Start up all members of a GMS group and then kill one with a "kill -9".
> If you do the kill during startup, there is a good chance you will see
> the failure above, and it would be explained. However, since that is not
> the scenario you are reporting, more thought is needed to explore this
> issue.
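The detection window Joe describes works out as simple arithmetic over the heartbeat settings. The sketch below is illustrative only; the method and parameter names are assumptions, not Shoal's actual configuration keys:

```java
// Sketch of how a heartbeat-based failure-detection window is derived.
// Names here are illustrative, not Shoal's actual configuration keys.
public class FailureWindowSketch {

    // A member is typically marked SUSPECT after `maxMissedBeats`
    // consecutive heartbeats are missed, then confirmed DEAD after an
    // additional verification timeout.
    static long detectionWindowMillis(long heartbeatIntervalMillis,
                                      int maxMissedBeats,
                                      long verifyTimeoutMillis) {
        return heartbeatIntervalMillis * maxMissedBeats + verifyTimeoutMillis;
    }

    public static void main(String[] args) {
        // E.g. 2s heartbeats, 3 missed beats tolerated, 2s verification:
        // a worst-case detection window of about 8 seconds, in the same
        // ballpark as the "very roughly 7 to 9 seconds" quoted above.
        long window = detectionWindowMillis(2000, 3, 2000);
        System.out.println("Detection window: " + window + " ms");
    }
}
```

Any send attempted inside that window can still target the dead member, which is exactly the kind of failure reported above.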
>
>> 2) Occasionally the master node can hold onto a stale advertisement.
>> When a new client receives the list of advertisements from the master
>> at startup, I was seeing a node in the list (in STARTING state) that
>> didn't exist.
> As explained above, if a member fails during startup, you could observe
> the reported failure. However, since your scenario description does not
> describe such a failure, more investigation is needed.
>> 3) Once the master has a stale advertisement it never removes it (see
>> point 1)
> It is by design that an instance stays in the HealthMonitor in DEAD
> state. Only when the master updates the cluster view with a
> FAILURE_EVENT should an instance be removed from a non-master member's
> cluster view. (Of course, all non-master members monitor the master and
> will act independently to detect the master's failure, after which all
> members follow the same algorithm to agree on a new master for the GMS
> group.)
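The removal rule Joe describes can be sketched as follows. The event type and handler here are simplified stand-ins, not Shoal's actual ClusterViewEvent API:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch (not Shoal's actual API) of the rule above: a
// non-master member only removes a peer from its cluster view when the
// master announces a FAILURE_EVENT for that peer; DEAD entries otherwise
// remain until the master says so.
public class ViewEventSketch {

    enum ClusterViewEventType { ADD_EVENT, FAILURE_EVENT }

    static void onClusterViewEvent(ClusterViewEventType type,
                                   String peerId,
                                   Set<String> clusterView) {
        switch (type) {
            case ADD_EVENT:
                clusterView.add(peerId);
                break;
            case FAILURE_EVENT:
                // Only this master-originated event removes the peer.
                clusterView.remove(peerId);
                break;
        }
    }

    public static void main(String[] args) {
        Set<String> view = new HashSet<>(Arrays.asList("peer-a", "peer-b"));
        onClusterViewEvent(ClusterViewEventType.FAILURE_EVENT, "peer-b", view);
        System.out.println(view); // peer-b removed, peer-a remains
    }
}
```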
>> 4) This was then causing a problem (and long timeouts) when sending
>> the DSC update as it does a unicast to each advertisement, including
>> the failed one.
>>
>> I am not sure why the Master node has a stale reference, it only
>> happens occasionally, and is very hard to track down.
>>
>> To get around this I propose another fix. Basically, the HealthMonitor
>> will compare the list of PeerIDs in its cache to the list of peers
>> known by the ClusterViewManager. If there are peers in the view that
>> are not in the HealthMonitor cache, I simply add them to the cache so
>> the InDoubtPeerDetector will do its thing.
>>
>> The patch is attached. Could someone please review it and let me know
>> if it makes sense? The main thing I am unsure about is the sequence ID.
>> It doesn't seem to be used by the in-doubt detection, so I have just
>> set it to 0.
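The reconciliation the patch proposes can be sketched roughly like this. The type and method names below are simplified stand-ins, not Shoal's actual HealthMonitor or ClusterViewManager APIs:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Rough sketch of the proposed fix: any peer known to the cluster view
// but absent from the HealthMonitor's cache is seeded into the cache so
// the in-doubt detector will start watching it. Names are illustrative.
public class ViewCacheReconciler {

    static final long UNKNOWN_SEQUENCE_ID = 0L; // as in the proposal

    // healthCache: peer ID -> last known heartbeat sequence ID
    static void reconcile(Map<String, Long> healthCache,
                          Collection<String> clusterViewPeerIds) {
        for (String peerId : clusterViewPeerIds) {
            // Seed missing peers with a zero sequence ID; the in-doubt
            // detector only needs the entry to exist and then go stale.
            healthCache.putIfAbsent(peerId, UNKNOWN_SEQUENCE_ID);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> cache = new HashMap<>();
        cache.put("peer-a", 42L);
        reconcile(cache, Arrays.asList("peer-a", "peer-b"));
        System.out.println(cache); // peer-b now present with sequence ID 0
    }
}
```

A stale peer seeded this way would stop heartbeating (it never started), so the in-doubt detector would eventually flag it, which is the behavior the patch is after.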
> I will evaluate your proposal and get back to you. Thanks again for
> reporting this issue and your proposed solution.
>
>>
>> Thanks for looking
>> Cameron
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe_at_shoal.dev.java.net
>> For additional commands, e-mail: dev-help_at_shoal.dev.java.net
>