Re: [Shoal-Dev] About HealthMonitor's cache

From: Bongjae Chang <carryel_at_korea.com>
Date: Wed, 8 Jul 2009 20:03:28 +0900

Hi Joe,

I understand your explanation and agree with you!

Thanks!
--
Bongjae Chang

----- Original Message -----
From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
To: <dev_at_shoal.dev.java.net>
Sent: Tuesday, July 07, 2009 1:21 AM
Subject: Re: [Shoal-Dev] About HealthMonitor's cache

> Bongjae Chang wrote:
>> Hi,
>> HealthMonitor keeps a cache of members' states.
>> But once a member's state has been stored, the entry is never removed.
>> Assume that A was a member of the group and that A has failed and not restarted.
>> Then the following FINE-level log message is printed continuously.
> Bongjae,
>
> I did not design the entry to stay in the cache, but I have been taking
> advantage of it recently.
> I will mention the two instances that I am aware of that benefit from it
> staying in the cache.
>
> My first impression is that it is preferable to leave the DEAD state
> entry in the HealthMonitor cache because of the method
> GroupHandle.getMemberState(). getMemberState() is a pull API provided
> to GMS clients to poll the state of a member. If there is no entry for
> a member in the cache, then GMS would need to try to contact the
> instance.
> If an application requests the state of a member that has recently
> died, it is best to remember that state.
> If the instance restarts, the state will get replaced in the cache.
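
A minimal sketch of the point above, assuming the cache is a simple map from member name to state. The class MemberStateCache, its State enum, and the UNKNOWN fallback are hypothetical names for illustration only. They are not Shoal's actual HealthMonitor or GroupHandle types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a health-state cache; not Shoal's real classes.
class MemberStateCache {
    enum State { STARTING, ALIVE, INDOUBT, DEAD, UNKNOWN }

    private final Map<String, State> states = new ConcurrentHashMap<>();

    // Record the latest known state. A DEAD entry is kept, not removed,
    // and is simply overwritten if the instance restarts later.
    void record(String member, State state) {
        states.put(member, state);
    }

    // Pull-style query answered from the cache alone. UNKNOWN marks the
    // case where no entry exists and the member would have to be contacted.
    State getMemberState(String member) {
        return states.getOrDefault(member, State.UNKNOWN);
    }
}
```

With this shape, a poll for a member that died recently returns DEAD straight from the cache, and a later record(member, STARTING) replaces that entry when the instance comes back.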
>
> The method HealthMonitor.cleanAllCaches() would be the place to clear
> the entry, but I would prefer not to.
> Retaining the state ensures that we do not report an instance as failed
> twice. When an instance dies, calling cleanAllCaches() already removes
> it from all the other caches that we want it removed from.
>
> I propose to fix the event log message and the processing code in
> processCacheUpdate to skip entries for
> DEAD instances (and other states that do not make sense to process in
> that method).
> However, even the WATCHDOG API benefits from the entry remaining in the
> HealthMonitor cache, since it
> provides a mapping from the instance name within a group to the JXTA
> entry id. The existence of a DEAD entry
> prevents the WATCHDOG mechanism from reporting an instance as failed twice.
> There is a possible
> race condition between GMS heartbeat failure detection reporting that an
> instance has failed and NA reporting that an
> instance has failed. The current implementation relies on the HealthMonitor
> cache entry as the central location to maintain
> the state of an instance and to prevent double reporting that an instance is DEAD.
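
A rough sketch of both ideas above: skipping DEAD entries during the periodic scan, and using the retained DEAD entry so that only the first failure report wins. FailureDeduplicator and its methods are hypothetical names and do not reflect the actual HealthMonitor code. The sketch assumes the cache is a ConcurrentHashMap, whose put() atomically returns the previous value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not the real HealthMonitor implementation.
class FailureDeduplicator {
    enum State { ALIVE, INDOUBT, DEAD }

    private final Map<String, State> cache = new ConcurrentHashMap<>();

    void update(String member, State state) {
        cache.put(member, state);
    }

    // Called by any failure detector (heartbeat based or external).
    // Returns true only for the caller that first moves the member to
    // DEAD, so a second detector racing on the same member gets false.
    boolean reportFailure(String member) {
        State previous = cache.put(member, State.DEAD);
        return previous != State.DEAD;
    }

    // Members a periodic scan should still look at: DEAD entries stay
    // in the cache but are skipped, so they produce no repeated log lines.
    List<String> membersToProcess() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, State> entry : cache.entrySet()) {
            if (entry.getValue() != State.DEAD) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

The design choice here is that the DEAD entry itself acts as the "already reported" flag, which is why removing it from the cache would reopen the double-reporting window.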
>
> -Joe
>
>> --
>> [#|2009-07-03T21:42:45.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>> [#|2009-07-03T21:42:48.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>> [#|2009-07-03T21:42:51.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>> [#|2009-07-03T21:42:54.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>> [#|2009-07-03T21:42:57.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>> [#|2009-07-03T21:43:00.945+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>> --
>> Is this expected, in order to monitor old members' states or to keep members' history?
>> Please advise.
>> Thanks.
>> --
>> Bongjae Chang