Re: [Shoal-Dev] About HealthMonitor's cache

From: Bongjae Chang <carryel_at_korea.com>
Date: Thu, 9 Jul 2009 15:57:03 +0900

Hi Shreedhar,

I fixed them.

GroupLeadershipNotificationSignalImpl.java already declared "package com.sun.enterprise.ee.cms.impl.common", so no callers were affected.

Thanks!
--
Bongjae Chang

----- Original Message -----
From: "Shreedhar Ganapathy" <Shreedhar.Ganapathy_at_Sun.COM>
To: <dev_at_shoal.dev.java.net>
Sent: Thursday, July 09, 2009 3:17 PM
Subject: Re: [Shoal-Dev] About HealthMonitor's cache

> Hi Bongjae
>
> Bongjae Chang wrote:
>> Thanks Joe,
>>
>> While merging your commit into the SHOAL_1_1_ABSTRACTING_TRANSPORT branch today, I found that some Java classes related to group leadership need to be corrected and committed again.
>>
>> 1. GroupLeadershipNotificationSignalImpl.java should be moved into the com.sun.enterprise.ee.cms.impl.common package. It does not belong in the com.sun.enterprise.ee.cms.impl.client package.
>>
> You are right. The Signal Impl classes are in common and this should go
> there as well. Could you do the needful and update any callers
> too?
>> 2. GroupLeadershipNotificationActionImpl.java, GroupLeadershipNotificationTest.java, GroupLeadershipNotificationSignalImpl.java and GroupLeadershipNotificationActionFactory.java should have the license header.
>>
> Thanks for that as well.
>> Shall I correct them if you don't mind?
>>
> Please go ahead.
>> --
>> Bongjae Chang
>>
>> ----- Original Message -----
>> From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
>> To: <dev_at_shoal.dev.java.net>
>> Sent: Thursday, July 09, 2009 12:36 AM
>> Subject: Re: [Shoal-Dev] About HealthMonitor's cache
>>
>>
>>
>>> Bongjae Chang wrote:
>>>
>>>> Hi Joe,
>>>>
>>>> I understood your words and agree with you!
>>>>
>>>>
>>> Bongjae,
>>>
>>> My commit today in HealthMonitor.java addressed the issue that you raised:
>>> the FINE processCacheUpdate log message was coming out for DEAD entries.
>>> It will now only appear for entries that processCacheUpdate actually
>>> operates on.
>>>
>>> -Joe
>>>
>>>> Thanks!
>>>> --
>>>> Bongjae Chang
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Joseph Fialli" <Joseph.Fialli_at_Sun.COM>
>>>> To: <dev_at_shoal.dev.java.net>
>>>> Sent: Tuesday, July 07, 2009 1:21 AM
>>>> Subject: Re: [Shoal-Dev] About HealthMonitor's cache
>>>>
>>>>
>>>>
>>>>
>>>>> Bongjae Chang wrote:
>>>>>
>>>>>
>>>>>> Hi,
>>>>>> HealthMonitor keeps a cache of members' states.
>>>>>> But once a member's state has been stored, the value is never removed.
>>>>>> Assume that A was a member of the group and that A is still in the failed state.
>>>>>> Then we see the following FINE-level log message continuously.
>>>>>>
>>>>>>
>>>>> Bongjae,
>>>>>
>>>>> I did not design the entry to stay in the cache, but I have been taking
>>>>> advantage of it recently.
>>>>> I will mention the two instances that I am aware of that benefit from it
>>>>> staying in the cache.
>>>>>
>>>>> My first impression is that it is preferable to leave the DEAD
>>>>> state entry in the HealthMonitor cache because of the method
>>>>> GroupHandle.getMemberState(). getMemberState() is a pull API that
>>>>> lets a GMS client poll the state of a member. If there is no
>>>>> entry for a member in the cache, GMS would need to try to contact
>>>>> the instance.
>>>>> If an application requests the state of a member that has recently
>>>>> died, it is best to remember that state.
>>>>> If the instance restarts, the state will get replaced in the cache.
>>>>>
>>>>> The method HealthMonitor.cleanAllCaches() would be the place to clear
>>>>> the entry, but I would prefer not to.
>>>>> Retaining the state ensures that we do not report an instance failed
>>>>> twice. When an instance is dead, calling cleanAllCaches() already
>>>>> clears the DEAD instance from all the other caches that we want it
>>>>> removed from.
>>>>>
>>>>> I propose to fix the event log message and processing code in
>>>>> processCacheUpdate to skip entries for DEAD instances (and other
>>>>> states that do not make sense to process in that method).
>>>>> However, even the WATCHDOG API benefits from the entry remaining in the
>>>>> health cache, since this provides a mapping from instance name within
>>>>> a group to the JXTA entry id. The existence of a dead entry
>>>>> prevents the WATCHDOG mechanism from reporting an instance failed twice.
>>>>> There is a possible race condition between GMS heartbeat failure
>>>>> detection reporting that an instance has failed and NA reporting that
>>>>> an instance has failed. The current implementation relies on the
>>>>> HealthMonitor cache entry as the central location to maintain the
>>>>> state of an instance and to prevent double reporting that an instance
>>>>> is DEAD.
>>>>>
>>>>> -Joe
>>>>>
>>>>>
>>>>>
>>>>>> --
>>>>>> [#|2009-07-03T21:42:45.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>>> [#|2009-07-03T21:42:48.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>>> [#|2009-07-03T21:42:51.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>>> [#|2009-07-03T21:42:54.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>>> [#|2009-07-03T21:42:57.930+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>>> [#|2009-07-03T21:43:00.945+0900|FINE|Shoal|ShoalLogger|_ThreadID=30;_ThreadName=InDoubtPeerDetector Thread for Group:test;ClassName=HealthMonitor$InDoubtPeerDetector;MethodName=processCacheUpdate;|processCacheUpdate : A 's state is dead|#]
>>>>>> --
>>>>>> Is this expected, for example to keep track of a former member's state or the members' history?
>>>>>> Please advise.
>>>>>> Thanks.
>>>>>> --
>>>>>> Bongjae Chang
>>>>>>
>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe_at_shoal.dev.java.net
>>>>> For additional commands, e-mail: dev-help_at_shoal.dev.java.net
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>
>
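
For readers following the quoted discussion, the idea Joe describes (keep the DEAD entry in the health cache, skip DEAD entries in the periodic processCacheUpdate pass, and use the cache entry as the single place that decides whether a failure has already been reported) might be sketched roughly as below. This is only an illustration under stated assumptions, not the Shoal implementation: the class HealthCacheSketch, its State enum, and the methods update() and reportFailure() are hypothetical names invented here, and the real HealthMonitor tracks more states and more data per entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are taken from the Shoal sources.
class HealthCacheSketch {
    enum State { ALIVE, INDOUBT, DEAD }

    // Member name -> last known state. DEAD entries are deliberately retained.
    private final Map<String, State> cache = new ConcurrentHashMap<>();

    /** Records the latest known state; a restarted member overwrites its DEAD entry. */
    void update(String member, State state) {
        cache.put(member, state);
    }

    /** Pull-style lookup: a retained DEAD entry answers without contacting the instance. */
    State getMemberState(String member) {
        return cache.get(member); // null if the member was never seen
    }

    /**
     * Marks a member DEAD. Returns true only for the first reporter, so two
     * detectors racing to report the same failure produce a single notification.
     */
    boolean reportFailure(String member) {
        // put() is atomic on ConcurrentHashMap and returns the previous value.
        State previous = cache.put(member, State.DEAD);
        return previous != State.DEAD;
    }

    /** Periodic pass: returns the members actually processed, skipping DEAD entries. */
    List<String> processCacheUpdate() {
        List<String> processed = new ArrayList<>();
        for (Map.Entry<String, State> entry : cache.entrySet()) {
            if (entry.getValue() == State.DEAD) {
                continue; // nothing to process and nothing to log for dead members
            }
            processed.add(entry.getKey());
        }
        return processed;
    }
}
```

In this sketch a second call to reportFailure("A") returns false while A's entry is still DEAD, which is the double-reporting guard discussed above. Once update("A", State.ALIVE) replaces the entry after a restart, a later failure is reported again.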