users@shoal.java.net

Fast failover

From: Gary Fry <Gary.Fry_at_betfair.com>
Date: Thu, 28 Feb 2008 14:56:30 -0000

Hi there.

I have been playing with Shoal this week and I'm finding it really
easy to pick up and run with.

I have a specific problem I would like to solve: lightning-fast
failover detection (under 100ms, and less would be better) for
reliable Leader Election. I've looked at the code and found that I can
pass in some properties via GMSFactory.startGMSModule. The properties
I've found that are relevant are:

(JxtaConfigConstants): FAILURE_DETECTION_TIMEOUT,
FAILURE_DETECTION_RETRIES and FAILURE_VERIFICATION_TIMEOUT.

I've tried setting these to low values, without much success. I have
noticed that the HealthMonitor skews the failure detection timeout by
adding 500ms (in the FailureVerifier private inner class of
HealthMonitor).
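
To show why that 500ms matters against a 100ms target, here is a rough
sketch of the arithmetic. It assumes a simplified model that is not
taken from the Shoal source: a peer is suspected after roughly
timeout x retries, and verification then waits its own timeout plus the
500ms skew. The class FailoverBudget, its methods, and the plain string
keys are hypothetical and only mirror the constant names above; nothing
here calls Shoal itself.

```java
import java.util.Properties;

// Hypothetical helper, not part of Shoal: estimates a floor on detection time.
class FailoverBudget {
    // Skew reported above as being added in HealthMonitor's FailureVerifier.
    static final long VERIFIER_SKEW_MS = 500;

    // Assumed model: (timeout * retries) to suspect a peer,
    // then the verification timeout plus the fixed skew.
    static long estimateMillis(long detectionTimeoutMs, int retries,
                               long verificationTimeoutMs) {
        return detectionTimeoutMs * retries
                + verificationTimeoutMs + VERIFIER_SKEW_MS;
    }

    // Collects the three settings under placeholder string keys that copy
    // the constant names from the email (the real key values may differ).
    static Properties lowLatencyProps(long detectionTimeoutMs, int retries,
                                      long verificationTimeoutMs) {
        Properties p = new Properties();
        p.setProperty("FAILURE_DETECTION_TIMEOUT",
                Long.toString(detectionTimeoutMs));
        p.setProperty("FAILURE_DETECTION_RETRIES",
                Integer.toString(retries));
        p.setProperty("FAILURE_VERIFICATION_TIMEOUT",
                Long.toString(verificationTimeoutMs));
        return p;
    }

    public static void main(String[] args) {
        Properties p = lowLatencyProps(20, 1, 20);
        long floor = estimateMillis(20, 1, 20);
        System.out.println(p + " -> at least " + floor + " ms");
    }
}
```

Under this model, even very small timeouts leave the estimate above
500ms, so the fixed skew alone would rule out a 100ms target.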

When running a test app, I can't seem to get failover to occur in
under about three seconds. Am I doing something wrong, or am I
simply trying to push Shoal/Jxta too far?

Thanks for your attention :)

Gary Fry

