Re: [Shoal-Dev] About HealthMonitor's cache

From: Shreedhar Ganapathy <Shreedhar.Ganapathy_at_Sun.COM>
Date: Wed, 08 Jul 2009 23:17:39 -0700

Hi Bongjae

Bongjae Chang wrote:
> Thanks Joe,
>
> While merging your commit today into the SHOAL_1_1_ABSTRACTING_TRANSPORT branch, I found that some Java classes related to group leadership need to be corrected and committed again.
>
> 1. GroupLeadershipNotificationSignalImpl.java should be moved into the com.sun.enterprise.ee.cms.impl.common package. It does not belong in the com.sun.enterprise.ee.cms.impl.client package.
>
You are right. The Signal Impl classes are in common, and this one should go
there as well. Could you do the needful and update any callers
too?
> 2. GroupLeadershipNotificationActionImpl.java, GroupLeadershipNotificationTest.java, GroupLeadershipNotificationSignalImpl.java and GroupLeadershipNotificationActionFactory.java should have the license header.
>
Thanks for that as well.
> Shall I correct them if you don't mind?
>
Please go ahead.
> --
> Bongjae Chang
>
> ----- Original Message -----
> From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
> To: <dev_at_shoal.dev.java.net>
> Sent: Thursday, July 09, 2009 12:36 AM
> Subject: Re: [Shoal-Dev] About HealthMonitor's cache
>
>
>
>> Bongjae Chang wrote:
>>
>>> Hi Joe,
>>>
>>> I understand your point and agree with you!
>>>
>>>
>> Bongjae,
>>
>> My commit today in HealthMonitor.java addressed the issue that you raised:
>> the FINE processCacheUpdate log
>> message was coming out for DEAD entries. The message will now appear only for
>> entries that
>> processCacheUpdate actually operates on.
>>
>> -Joe
>>
>>> Thanks!
>>> --
>>> Bongjae Chang
>>>
>>>
>>> ----- Original Message -----
>>> From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
>>> To: <dev_at_shoal.dev.java.net>
>>> Sent: Tuesday, July 07, 2009 1:21 AM
>>> Subject: Re: [Shoal-Dev] About HealthMonitor's cache
>>>
>>>
>>>
>>>
>>>> Bongjae Chang wrote:
>>>>
>>>>
>>>>> Hi,
>>>>> HealthMonitor stores members' states in a cache.
>>>>> But once a member's state is stored, the value is never removed.
>>>>> Assume that A was a member of the group and A is still in the failed state.
>>>>> Then we see the following FINE-level log message continuously.
>>>>>
>>>>>
>>>> Bongjae,
>>>>
>>>> I did not design the entry to stay in the cache, but I have been taking
>>>> advantage of it recently.
>>>> I will mention the two instances that I am aware of that benefit from it
>>>> staying in the cache.
>>>>
>>>> My first impression is that it is preferable to leave the DEAD state
>>>> entry in the HealthMonitor cache because of
>>>> the method GroupHandle.getMemberState(). getMemberState()
>>>> is a pull API provided to GMS clients to poll the state of a member. If
>>>> there is no
>>>> entry for a member in the cache, then GMS would need to try to contact
>>>> the instance.
>>>> If an application requests the state of a member that has recently
>>>> died, it is best to remember that state.
>>>> If the instance restarts, the state will get replaced in the cache.
>>>>
>>>> The method HealthMonitor.cleanAllCaches() would be the place to clear
>>>> the entry, but I would prefer not to.
>>>> Retaining the state ensures that we do not report an instance failed
>>>> twice. The DEAD instance is cleared
>>>> from all other caches that we want it to be cleaned from when the
>>>> instance is dead by calling the method
>>>> cleanAllCaches().
>>>>
>>>> I propose to fix the event log message and processing code in
>>>> processCacheUpdate to skip entries for
>>>> DEAD instances (and other states that do not make sense to process in
>>>> that method.)
>>>> However, even the WATCHDOG API benefits from the entry remaining in the
>>>> health cache, since this
>>>> provides a mapping from the instance name within a group to the jxta entry
>>>> id. The existence of a DEAD entry
>>>> prevents the WATCHDOG mechanism from reporting an instance failed twice.
>>>> There is a possible
>>>> race condition between GMS heartbeat failure detection reporting that an
>>>> instance has failed and NA reporting that the
>>>> instance has failed. The current implementation relies on the HealthMonitor
>>>> cache entry as the central location to maintain
>>>> the state of an instance and to prevent double reporting that an instance is DEAD.
>>>>
>>>> -Joe
>>>>
>>>>
>>>>
>>>>> --
>>>>> [#|2009-07-03T21:42:45.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>> [#|2009-07-03T21:42:48.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>> [#|2009-07-03T21:42:51.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>> [#|2009-07-03T21:42:54.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>> [#|2009-07-03T21:42:57.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>> [#|2009-07-03T21:43:00.945+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>> --
>>>>> Is this expected, for example to monitor a former member's state or the members' history?
>>>>> Please advise me.
>>>>> Thanks.
>>>>> --
>>>>> Bongjae Chang
>>>>>
>>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe_at_shoal.dev.java.net
>>>> For additional commands, e-mail: dev-help_at_shoal.dev.java.net
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
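
For anyone reading this thread later, the behaviour discussed above (keeping the DEAD entry in the cache, answering getMemberState() from it, reporting a failure only once, and skipping DEAD entries in processCacheUpdate) can be illustrated with a minimal sketch. This is not the Shoal source: the class HealthCacheSketch, the State enum, and the methods recordAlive(), reportFailure() and failureReports() are hypothetical names made up for illustration, and the real HealthMonitor tracks more states and keys its entries differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the HealthMonitor cache; not the Shoal implementation.
class HealthCacheSketch {
    enum State { ALIVE, INDOUBT, DEAD }

    private final Map<String, State> cache = new ConcurrentHashMap<>();
    private final List<String> failureReports = new ArrayList<>();

    // A heartbeat or restart replaces any earlier entry, including a DEAD one.
    void recordAlive(String member) {
        cache.put(member, State.ALIVE);
    }

    // Called by any failure detector (e.g. heartbeat timeout or watchdog).
    // put() returns the previous value atomically, so only the first caller
    // that moves the member to DEAD reports the failure.
    boolean reportFailure(String member) {
        State previous = cache.put(member, State.DEAD);
        if (previous == State.DEAD) {
            return false; // already reported, do not notify twice
        }
        synchronized (failureReports) {
            failureReports.add(member);
        }
        return true;
    }

    // Pull-style lookup answered from the cache; null means "never seen".
    State getMemberState(String member) {
        return cache.get(member);
    }

    // Periodic scan: DEAD entries stay cached but are neither processed nor logged.
    List<String> processCacheUpdate() {
        List<String> processed = new ArrayList<>();
        for (Map.Entry<String, State> entry : cache.entrySet()) {
            if (entry.getValue() == State.DEAD) {
                continue;
            }
            processed.add(entry.getKey());
        }
        return processed;
    }

    List<String> failureReports() {
        synchronized (failureReports) {
            return new ArrayList<>(failureReports);
        }
    }
}
```

In this sketch the retained DEAD entry is what makes the second reportFailure() call a no-op, which is the same reason given above for not clearing the entry in cleanAllCaches().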