Glad you are looking into the code to understand its internal workings. Let
me help explain a bit more for context on the DSC.
The DistributedStateCacheImpl's behavior is that of a synchronized
shared cache with the following process:
1. Each member joins the group. The first member to time out of the
master discovery protocol in the service provider layer becomes the
group leader.
2. The application can use the DSC to add data for in-memory persistence
once it joins a group. Until other members are part of the group view,
the data is local to that member's DSC (see the usage sketch after this list).
3. As each member joins, cache synchronization occurs between that
member and the group leader: the leader initiates a syncDSC() call
(from the leader's ViewWindow,
com.sun.enterprise.ee.cms.impl.jxta.ViewWindow), providing the leader's
version of the cache to that member. In response, the member's DSC
sends back its own cache, which gets broadcast to the group. Every
write to the DSC after that point is shared group-wide.
4. To use the DSC effectively as a shared cache, client apps and the
recovery server selection's Failure Fencing feature (which uses the DSC)
require that each member has completed this first sync with the group
leader. If the first sync has not completed for whatever reason, given
the vagaries of the network, the fencing mechanism should not block
forever, so it resorts to a forcible sync via GroupHandleImpl's
forceDSCSync() method.
5. Failure Fencing only plays a part in the DSC sync when the first sync
operation that should have happened under ViewWindow did not happen or
did not complete properly. The failure fencing feature is associated with
the Recovery Selection feature and does not provide fencing for the DSC itself.
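To make steps 2 and 3 above concrete, here is a minimal usage sketch (not
an excerpt from the codebase; the component name, key, value and the way
the cache reference is obtained are my own illustrative assumptions):
---
// dsc is the DistributedStateCache obtained from the GMS runtime for the
// group this member joined; myToken is this member's identity token
void shareState(DistributedStateCache dsc, String myToken) throws GMSException {
    // write: once the first sync with the group leader has completed,
    // this entry is broadcast so every member's local cache sees it
    dsc.addToCache("MyComponent", myToken, "someKey", "someValue");
    // read: served from this member's local copy of the cache
    Map entries = dsc.getFromCache(myToken);
}
---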
See more below:
Bongjae Chang wrote:
> Hi Shreedhar Ganapathy
> Thanks for your reply.
> Now, I am simply reviewing the Shoal code. Up to now, I don't have
> special use cases for the DSC alone.
> But I hope to use the DSC for Web distributed sessions, Stateful
> Session Beans, a JNDI cache, etc. in the future. :-)
> Returning to the main point, I think Shoal uses the DSC for protective
> failure fencing operations.
> I think the following Shoal source code is the important part of the
> protective failure fencing operations.
> ---
> <In GroupHandleImpl.java>
> public void raiseFence(final String componentName,
>                        final String failedMemberToken) throws GMSException {
>     if (!isFenced(componentName, failedMemberToken)) {
>         ...
>         // update information
>         dsc.addToCache(componentName,
>                        getGMSContext().getServerIdentityToken(),
>                        failedMemberToken, setStateAndTime());
>         ...
>     }
> }
>
> public boolean isFenced(final String componentName,
>                         final String memberToken) {
>     ...
>     while (members.size() > 1 && !dsc.isFirstSyncDone()) {
>         logger.log(Level.FINE, "Waiting for DSC first Sync");
>         try {
>             Thread.sleep(SYNC_WAIT);
>             count++;
>             if (count > 4) {
>                 forceDSCSync((DistributedStateCacheImpl) dsc);
>             }
>         } catch (InterruptedException e) {
>             logger.log(Level.WARNING, e.getLocalizedMessage());
>         }
>     }
>     ...
>     // get information
>     entries = dsc.getFromCache(memberToken);
>     ...
> }
> ---
> I think dsc.isFirstSyncDone() becomes true when forceDSCSync() is called
> as in the code above and when a new group is joined. (Is that right?)
I hope the process explanation above helps answer this question.
forceDSCSync() is a repair operation that occurs if the first sync
operation, initiated from ViewWindow when a new member joined the group,
did not complete properly.
> If dsc.isFirstSyncDone() is false, the group leader (coordinator) will
> deliver the DSC to all members after all. Then there is no problem.
That is incorrect. If you refer to the process explained above, the
operations should be clearer.
> But if all members see true from dsc.isFirstSyncDone() and the
> raiseFence() method is called concurrently on each member,
> I suspect some problem can occur with the protective failure fencing
> operations.
> ex) When more than one member executes raiseFence() or isFenced()
> concurrently, the DSC will be updated with, or return, different
> information on each member.
> Then more than one member will be able to execute the recovery logic
> concurrently.
> So my question was whether the DSC guarantees data consistency and the
> same view for the same group under concurrent access.
I see what you mean. What you state above is that the raiseFence()
operation, being a public API, can be called by multiple members for the
same failed member. This is true if the app does not follow the intended
usage of the API.
The API was provided as an avenue wherein members that wanted to perform
self recovery could raise a protective fence in the group, so that other
members in the group would check for the existence of a fence and, if one
exists, not perform recovery operations.
Currently, there is no check in the implementation of that method to
ensure that only self recovery ops can be performed therein. Perhaps we
should enforce it; we need to look at the ramifications of that. For
instance, the FailureRecoverySignal's acquire() method calls this API as
well for raising a fence from within a selected member. For more on the
recovery selection process, see my blog entry:
http://blogs.sun.com/shreedhar/entry/glassfish_hidden_nugget_automatic_delegated
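To illustrate the intended self-recovery pattern discussed above, here is a
rough sketch (not an excerpt; the component name is a placeholder, and the
lowerFence() counterpart is my assumption about the GroupHandle API):
---
// gh is the GroupHandle obtained from the GMS instance this member joined with
void recoverWithFence(GroupHandle gh, String failedMember) throws GMSException {
    // note: the check and the raise are not atomic, which is exactly the
    // race you describe when multiple members run this concurrently
    if (!gh.isFenced("RecoveryComponent", failedMember)) {
        gh.raiseFence("RecoveryComponent", failedMember);
        try {
            // ... perform this member's recovery of the failed member's
            // resources here ...
        } finally {
            gh.lowerFence("RecoveryComponent", failedMember);
        }
    }
}
---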
> I want to see the error of my ways. Please point out my mistakes.
These are good review comments which we very much welcome :) Keep 'em
coming.
If you see a bug, please file it in the issue tracker.
Thanks very much
Shreedhar
> thanks!
> --
> Bongjae Chang
>
> ----- Original Message -----
> *From:* Shreedhar Ganapathy <mailto:Shreedhar.Ganapathy_at_Sun.COM>
> *To:* dev_at_shoal.dev.java.net <mailto:dev_at_shoal.dev.java.net>
> *Sent:* Sunday, February 10, 2008 1:45 AM
> *Subject:* Re: [Shoal-Dev] Shoal DSC(distributed state cache)'s
> consistency question
>
> Hi Bongjae Chang
>
> Thanks for posting your questions. Please see inline for my
> responses.
>
> Thanks
> Shreedhar
>
> Bongjae Chang wrote:
>> Hi all. I'm a beginner in the Shoal project.
>> Recently, I have reviewed Shoal's source code because I am
>> interested in a generic clustering framework.
> Very glad to know of this. Please let us know of your use cases
> and we can surely help.
>> I know the DSC uses point-to-point communication (JXTA) with all
>> instances for reliability.
> Up until now the DSC implementation has been using UDP broadcasts, which
> are inherently unreliable. We are moving to have any message that is
> intended to be sent synchronously use TCP-based unicast, for reliability
> and ordering, for all messaging in Shoal with the current service
> provider. That is in the works and not yet complete; it should be
> happening next week.
>> The following Shoal source code is the message sending part of
>> GroupCommunicationProviderImpl.java:
>> ---
>> public void sendMessage(...) {
>>     ...
>>     List<String> currentMembers =
>>             getGMSContext().getGroupHandle().getAllCurrentMembers();
>>     for (String currentMember : currentMembers) {
>>         ...
>>         clusterManager.send(id, message);
>>     }
>>     ...
>> }
>> ---
> The above code snippet is for applications that want to send
> messages to other members' application layer. For the corresponding
> DSC send-message snippet, look here:
> https://shoal.dev.java.net/source/browse/shoal/gms/src/java/com/sun/enterprise/ee/cms/impl/jxta/DistributedStateCacheImpl.java?rev=1.17&view=markup
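> As a rough illustration (not an excerpt; the component name and payload are
> placeholders), an application-layer send through the GroupHandle looks
> something like the following, whereas the DSC performs its own sends
> internally when entries are added:
> ---
> // gms is the joined GroupManagementService instance; the target component
> // name selects which registered message listener (MessageActionFactory)
> // receives the payload on the other members
> GroupHandle gh = gms.getGroupHandle();
> gh.sendMessage("SomeComponent", "hello group".getBytes());
> ---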
>> Regarding the above, I have a question about the DSC (distributed state
>> cache, default implementation).
>> Does the Shoal DSC guarantee data consistency for the same group?
> The answer is not a clear yes or no; it is "it depends", as total
> consistency requires total ordering. The service provider impl
> does not provide total ordering, as we have not seen a requirement
> for it. For the DSC, at this point, the lightweight nature of its
> intended usage does not require total ordering, and the consistency
> provided by the transport layer has been sufficient so far in our
> tests.
>
> As Shoal's adoption increases, we are beginning to see
> requirements that would help us prioritize our efforts to provide
> such aspects.
>> For example,
>> the current members are A, B and C.
>> When A sends a message (key="a", value="1") to the group,
>> B sends another message (same key: key="a", value="2") to the
>> group at the same time.
>> At this time, will the group members (A, B, C) see the same value for
>> key="a" in the DSC? (all "1" or all "2")
> There is definitely a small timing factor to be considered in any
> distributed system of this nature. Assuming the network is not a
> limiting factor, the data should be consistently available in the
> instances of DSC in each member. More below:
>> I think the value itself (whether it is "1" or not) doesn't matter, but
>> it's important whether the group members have the same view concurrently
>> or not.
> The members are expected to have an identical membership view
> consistently. As a result, messages are delivered to all members in
> the current view.
>> And does the DSC guarantee atomicity? (all sent or none)
> The current DSC implementation is a simple shared concurrent
> hashmap. A write from one member gets disseminated to all members
> so that reads are local. A message sent is not transactional in
> that there is no notion of a 2-phase commit equivalent.
>
> We have not gone to the level of guaranteeing each of the ACID
> properties as yet.
>
> For instance, in your example above, with the current
> implementation, when members A and B write data to the same key,
> the last member writing data at that moment would overwrite the
> previous value in the group. That said, I would even venture to
> say that there is no code in the DSC implementation that would
> ensure that a single write blocks all other writes for the same key.
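> To make that concrete with a purely local analogy (illustrative only, not
> Shoal code), two writers racing on the same key of a plain concurrent map
> simply leave the last write in place, which is the behavior to expect from
> the DSC today:
> ---
> java.util.concurrent.ConcurrentHashMap<String, String> cache =
>         new java.util.concurrent.ConcurrentHashMap<String, String>();
> cache.put("a", "1"); // member A's write
> cache.put("a", "2"); // member B's write arrives later and wins
> // a member applying the writes in this order now reads "2" for key "a";
> // nothing blocks or rejects the second write
> ---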
>
> Please let us know more details of what your requirements are and
> we will surely look into providing extensions for such a use case.
> We are currently in discussions about doing a full-fledged ShoalCache
> and would like to open this up to the community for contributions
> in the form of discussions, input, and code.
>
>
>> If the DSC guarantees consistency and atomicity, please explain the
>> mechanism briefly :-)
>> Or please point me to a web link with this information.
>> I need your help. Thanks!
> Would be glad to help. Keep sending us your questions, enhancement
> requests and any contributions you may have.
>> --
>> Bongjae Chang
>