RE: Configuration of Client vs. Shared Cache

From: Gordon Yorke <gordon.yorke_at_oracle.com>
Date: Wed, 23 Jan 2008 11:58:08 -0500

Hey Adam,
    1. Great
    2. Great
    3. Great
    4. Great
    5. I'll take a closer look...
    6. Great
    7. It will be.
    8. This Hashtable is used to store TopLink expressions for performance reasons. Should not be weak.
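
To illustrate point 8: the thread does not show how clonedExpressions is used, so the following is only a sketch under an assumption drawn from its name, namely that it maps each original expression to its clone while a statement is being built. The Node class and the cloneGraph method are hypothetical stand-ins, not TopLink code. Under that assumption the map lives only for the duration of one method call and every entry is needed until the method returns, so weak references would add bookkeeping without freeing anything.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hard-referenced identity map used while cloning a graph.
final class ExpressionCloneSketch {

    /** Hypothetical stand-in for an expression node. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Clones the graph below 'original'; each original is cloned exactly once. */
    static Node cloneGraph(Node original, Map<Node, Node> alreadyCloned) {
        Node existing = alreadyCloned.get(original);
        if (existing != null) {
            return existing; // reuse the clone made earlier for this instance
        }
        Node copy = new Node(original.name);
        // Register before recursing so that shared or cyclic references terminate.
        alreadyCloned.put(original, copy);
        for (Node child : original.children) {
            copy.children.add(cloneGraph(child, alreadyCloned));
        }
        return copy;
    }

    public static void main(String[] args) {
        Node shared = new Node("shared");
        Node root = new Node("root");
        root.children.add(shared);
        root.children.add(shared);

        // The map is local and discarded after the call, like clonedExpressions above.
        Node copy = cloneGraph(root, new IdentityHashMap<>());
        System.out.println(copy.children.get(0) == copy.children.get(1)); // prints "true"
    }
}
```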

Thanks for the submission; it looks great.

--Gordon

-----Original Message-----
From: Adam Bien [mailto:abien_at_adam-bien.com]
Sent: Tuesday, January 22, 2008 4:03 PM
To: Gordon Yorke
Cc: persistence_at_glassfish.dev.java.net
Subject: Re: Configuration of Client vs. Shared Cache

Hi Gordon,

I changed the code as you suggested.

  1. I renamed the property to: toplink.persistence.context.reference.mode
  2. In UnitOfWork the deletedObjects collection is untouched - it uses
     the "hard references"
  3. newAggregates now uses WeakKeyIdentityHashtable (with weak keys
     and hard values)
  4. The referenceMode reference was changed from private to protected
  5. IdentityHashtable is abstract now. HardIdentityHashtable,
     WeakKeyIdentityHashtable and WeakKeyAndValueIdentityHashtable
     inherit from it. However, because each Hashtable accesses its
     entries directly, I just replicated the code in the subclasses.
     Each Hashtable accesses the Entry either directly or
     through the WeakReference (then with accessors).
  6. newObjectsOriginalToClone is realized with weak keys and values.
  7. I just reused your implementation of the method buildIdentityMap
     in the IdentityMapManager in the hope it is enough :-)
  8. I'm not sure whether in the class InheritancePolicy always the
     Hashtable with the hard references has to be used:

public SQLSelectStatement buildClassIndicatorSelectStatement(ObjectLevelReadQuery query) {

     // ...

       // 2612538 - the default size of HardIdentityHashtable (32) is appropriate

       // Hard, or potentially Weak?

       HardIdentityHashtable clonedExpressions = new HardIdentityHashtable();

  9. Because IdentityHashtable is abstract now, I had to change some
     additional classes to instantiate the HardIdentityHashtable instead.
10. I extended the basic test with some identity assertions, it seems
     to work so far.
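
As a rough illustration of item 3 (weak keys, hard values, identity comparison), a minimal sketch on top of the JDK could look like the following. It is not the WeakKeyIdentityHashtable from the patch; the class name, the IdentityWeakKey wrapper and the method set are made up for the example, and it delegates to a HashMap rather than managing its own entries.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: weak keys compared by identity, values held strongly.
final class WeakKeyIdentityMapSketch<K, V> {

    /** Weak wrapper around a key; equality is reference identity of the referent. */
    private static final class IdentityWeakKey<K> extends WeakReference<K> {
        private final int hash;

        IdentityWeakKey(K referent, ReferenceQueue<K> queue) {
            super(referent, queue);
            // Cache the hash so a cleared key can still be located and removed.
            this.hash = System.identityHashCode(referent);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof IdentityWeakKey)) {
                return false;
            }
            Object mine = get();
            return mine != null && mine == ((IdentityWeakKey<?>) other).get();
        }
    }

    private final Map<IdentityWeakKey<K>, V> entries = new HashMap<>();
    private final ReferenceQueue<K> cleared = new ReferenceQueue<>();

    V put(K key, V value) {
        expungeClearedKeys();
        return entries.put(new IdentityWeakKey<>(key, cleared), value);
    }

    V get(K key) {
        expungeClearedKeys();
        // A temporary lookup key; it is not registered with the queue.
        return entries.get(new IdentityWeakKey<>(key, null));
    }

    int size() {
        expungeClearedKeys();
        return entries.size();
    }

    /** Drops entries whose key has been garbage collected. */
    private void expungeClearedKeys() {
        Object stale;
        while ((stale = cleared.poll()) != null) {
            entries.remove(stale);
        }
    }

    public static void main(String[] args) {
        WeakKeyIdentityMapSketch<Object, String> map = new WeakKeyIdentityMapSketch<>();
        String a = new String("order-1");
        String b = new String("order-1"); // equal to a, but a different instance
        map.put(a, "clone of a");
        map.put(b, "clone of b");
        System.out.println(map.get(a)); // prints "clone of a"
        System.out.println(map.get(b)); // prints "clone of b"
    }
}
```

An entry disappears only after its key is no longer strongly reachable and the garbage collector has cleared the reference, so the moment at which size() shrinks is not predictable.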

Please let me know whether the changes are OK; I would then like to
document them better.

regards,

adam

Gordon Yorke wrote:
> Hello Adam,
> Honestly, I took the easy route and changed only the two classes that took little effort. The UnitOfWork still needs to be updated based on my feedback, as does the WeakIdentityHashtable class.
> --Gordon
>
> -----Original Message-----
> From: Adam Bien [mailto:abien_at_adam-bien.com]
> Sent: Wednesday, January 16, 2008 2:08 PM
> To: gordon.yorke_at_oracle.com
> Cc: persistence_at_glassfish.dev.java.net
> Subject: Re: Configuration of Client vs. Shared Cache
>
>
> Hi Gordon,
>
> thank you very much for your detailed feedback. As I understand it, you
> have already changed my code. Do you expect any action from my side?
> I will submit the ETA to the V3 stream. What is the exact link?
>
> thank you,
>
> regards,
>
> adam
>
> Gordon Yorke wrote:
>
>> Hello Adam,
>> I apologize for not getting back to you sooner. Generally
>> everything looks good but I have a few comments.
>> The UnitOfWork (Persistence Context) is more than a first level
>> cache, and that should be reflected in the property name. Otherwise the
>> property may be used without an appreciation for the hard-to-detect
>> side effect of lost changes. The property should be moved to the
>> config package with the other properties as we will want customers to
>> be able to browse the javaDocs and see these properties as an option.
>> In the UnitOfWork the deletedObjects should not be touched as a
>> deleted object will probably be dereferenced by the client and may be
>> lost if garbage collected. The 'newAggregates' list should also
>> have weak references in the case the parent of an embeddable object is
>> released.
>> Private visibility is generally not used in TopLink code for class
>> attributes. Protected provides similar restrictions but does not
>> restrict extension through subclassing which has been quite useful in
>> the past.
>> With the IdentityHashtable it is the keys that should be weak in
>> most cases. With the newObjectsOriginalToClone both the key and the
>> value should be weak. Also, for performance reasons it would be
>> better to have the weak IdentityHashtable be a subclass. That way
>> the Hashtable code will remain micro-optimized. At some point in V3
>> there is a plan to migrate to the JDK collections and having a
>> separate type will make that easier.
>> The IdentityMapManager.buildNewIdentityMap() will need to be
>> updated as well. Otherwise the 'cache' in the UnitOfWork will not
>> have weak references.
>>
>> I have attached some changed classes with my feedback.
>>
>> There is still no ETA on when the entity-persistence module will be
>> available in the V3 stream. If you can submit these changes to the
>> EclipseLink bug I can get them integrated into the code base so they
>> do not get lost. ( https://bugs.eclipse.org/bugs/show_bug.cgi?id=214661 )
>> --Gordon
>>
>> Adam Bien wrote:
>>
>>> Patch for the problem below is submitted:
>>>
>>> https://glassfish.dev.java.net/issues/show_bug.cgi?id=3985
>>>
>>> Please review the enhancements. Thank you in advance!
>>>
>>> regards,
>>>
>>> adam
>>>
>>> Gordon Yorke wrote:
>>>
>>>> Hello Adam,
>>>> I can appreciate the usefulness of what you are requesting but
>>>> there is no means by which the provider can guarantee predictable
>>>> behaviour. The onus would be on the application developer to ensure
>>>> that the application forcefully keeps the required objects from being
>>>> garbage collected and that no dependency is placed on garbage
>>>> collection for making objects un-managed (the transaction would
>>>> still have to complete successfully even if the garbage collector
>>>> was disabled).
>>>> I could see this functionality being added as an advanced feature
>>>> for users with specific needs but we would need to be careful in
>>>> presenting this feature without due warnings.
>>>> If you would like to attempt to produce this functionality within
>>>> TopLink I would recommend that you update the UnitOfWork's
>>>> collections "cloneMapping", "clonesToOriginals",
>>>> "newObjectsCloneToOriginal", "newObjectsOriginalToClone" and
>>>> "deletedObjects" to have weak references used within the collections
>>>> when a weak identity map is set on the UnitOfWork. Perhaps have the
>>>> member variables initialized to Maps that use only weak references.
>>>> If you have any questions on the internals of the UnitOfWork and its
>>>> place within the TopLink architecture I would be happy to help.
>>>> --Gordon
>>>>
>>>> Adam Bien wrote:
>>>>
>>>>> Hi Gordon,
>>>>> Gordon Yorke wrote:
>>>>>
>>>>>> "even if the object is no longer referenced by the application" ->
>>>>>> why is that? If an object isn't referenced by the application any
>>>>>> more, it could simply disappear... Especially in an unmanaged
>>>>>> environment. This extension would be really valuable for rich
>>>>>> clients ("rias"). Actually, every rich client application will have
>>>>>> this problem. How hard would it be to provide a switch to choose
>>>>>> between hard and weak references?"
>>>>>>
>>>>>> Adam,
>>>>>> Having changed objects disappear from the PersistenceContext
>>>>>> based on weak references would be unpredictable. The application
>>>>>> would have no way of knowing what changes would be written out and
>>>>>> which ones would have been thrown away because of garbage
>>>>>> collection. In your case perhaps the application is structured in
>>>>>> such a way that only objects referenced by the application have
>>>>>> the potential to be changed but in many applications that is not
>>>>>> the case.
>>>>>>
>>>>> The approach with a weak L1 cache should actually work for every
>>>>> application-managed entity manager that is not shared between
>>>>> threads and transactions.
>>>>> This is the case in most RCP (Netbeans, Eclipse) applications,
>>>>> where the Entity Manager is executed inside the same JVM as the
>>>>> presentation tier.
>>>>>
>>>>>
>>>>>> Particularly in service based applications the changes could be
>>>>>> applied through the merge() api without a reference ever existing
>>>>>> from the application to the managed entity.
>>>>>>
>>>>>>
>>>>> Absolutely. In service based applications you are right. Merge
>>>>> would be the option then. In a real service based application
>>>>> I would even use DTOs :-). However, we are working with a rich
>>>>> domain model.
>>>>>
>>>>>> Having a GUI use the merge() api to have only the changed
>>>>>> objects registered within the PersistenceContext and processed for
>>>>>> changes is a much more efficient and scalable architecture.
>>>>>>
>>>>> Yes. In our case it would be a huge disadvantage. Until now we can
>>>>> rely on the transparent persistence - which is great. Being forced to
>>>>> detach the objects and invoke "merge" isn't an elegant solution.
>>>>>
>>>>>> The complexity of the object model should not hamper the use of
>>>>>> the merge() api as the GUI's update events could merge the root of
>>>>>> the tree of Entities being changed and TopLink, with correct
>>>>>> cascade merge settings, would handle the rest.
>>>>>>
>>>>>>
>>>>> We have many UIs, Use Cases, reports etc. So there are many root
>>>>> objects in place. This makes your suggested approach more difficult...
>>>>>
>>>>> regards -
>>>>>
>>>>> adam
>>>>>
>>>>>> --Gordon
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Adam Bien [mailto:abien_at_adam-bien.com]
>>>>>> Sent: Monday, November 19, 2007 5:08 PM
>>>>>> To: persistence_at_glassfish.dev.java.net
>>>>>> Subject: Re: Configuration of Client vs. Shared Cache
>>>>>>
>>>>>>
>>>>>> Hello Gordon,
>>>>>>
>>>>>> thank you for the fast answer.
>>>>>> Gordon Yorke wrote:
>>>>>>
>>>>>>
>>>>>>> Hello Adam,
>>>>>>> An application managed EntityManager always wraps an Extended
>>>>>>> Persistence Context. An Extended Persistence Context is not
>>>>>>> released until the EntityManager is closed unlike a transactional
>>>>>>> Container Managed Entity Manager which releases the Persistence
>>>>>>> Context at the end of each transaction.
>>>>>>>
>>>>>>> If you need to, you can simulate the transactional EntityManager by
>>>>>>> calling flush() and clear() before each commit(), but this would
>>>>>>> still leave you with the detached issue.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> This is our issue. Detached entities are a real problem in more
>>>>>> complex applications...
>>>>>>
>>>>>>
>>>>>>> The problem with holding the objects in a weak reference is the
>>>>>>> change detection requirement of the PersistenceContext. Any
>>>>>>> object touched by the PersistenceContext must be tracked by the
>>>>>>> PersistenceContext and any changes to that object must be written
>>>>>>> to the database on request, even if the object is no longer
>>>>>>> referenced by the application. With weak references unwritten
>>>>>>> changes could potentially be lost.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> "even if the object is no longer referenced by the application" ->
>>>>>> why is that? If an object isn't referenced by the application any
>>>>>> more, it could simply disappear... Especially in an unmanaged
>>>>>> environment. This extension would be really valuable for rich
>>>>>> clients ("rias"). Actually, every rich client application will have
>>>>>> this problem. How hard would it be to provide a switch to choose
>>>>>> between hard and weak references?
>>>>>>
>>>>>>
>>>>>>> There are multiple ways you could manage the volume of objects
>>>>>>> registered within a Persistence Context. On screen changes you
>>>>>>> could close and create a new EntityManager, but my recommended
>>>>>>> approach would be to detach the display objects from the managed
>>>>>>> objects. No need to create separate classes, but you could use
>>>>>>> the merge facility of the EntityManager to merge the objects from
>>>>>>> the GUI to the PersistenceContext for any object updates or
>>>>>>> specific save events from the GUI.
>>>>>>>
>>>>>>>
>>>>>> The problem here: the object graph is really complicated, with
>>>>>> recursive algorithms (about 700 entities...). Therefore we would
>>>>>> rely on "transparent persistence". Actually it works great with
>>>>>> TopLink on Smalltalk :-). We have the issue only in Java... However,
>>>>>> TopLink is the only OR framework that can handle this complexity.
>>>>>> The only issue is the "hard" references.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> This would limit the number of objects in your PersistenceContext
>>>>>>> and most likely provide a performance improvement over your
>>>>>>> current design, as fewer objects are being checked for changes. I
>>>>>>> would also recommend that, if you took this approach, you
>>>>>>> switch the shared cache to be a softcache weak identity map.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> We switched the shared cache already. The L1-cache is our problem...
>>>>>>
>>>>>> thank you for the detailed answer,
>>>>>>
>>>>>> adam
>>>>>>
>>>>>>
>>>>>>> --Gordon
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Adam Bien [mailto:abien_at_adam-bien.com]
>>>>>>> Sent: Monday, November 19, 2007 11:46 AM
>>>>>>> To: gordon.yorke_at_oracle.com
>>>>>>> Cc: persistence_at_glassfish.dev.java.net
>>>>>>> Subject: Re: Configuration of Client vs. Shared Cache
>>>>>>>
>>>>>>>
>>>>>>> Hi Gordon,
>>>>>>>
>>>>>>> thank you very much for the fast reply. I'm thinking of the
>>>>>>> Persistence Context as an Identity HashMap, whose primary goal is
>>>>>>> to keep the consistency of the entities in a transaction.
>>>>>>> However, in our application (it is a rich client, which connects
>>>>>>> directly to the database without an application server or
>>>>>>> middleware), it seems that objects, once loaded, are held by the
>>>>>>> EntityManager until the method clear is invoked or the Entity
>>>>>>> Manager is closed (we are using the current build - TopLink
>>>>>>> v2b58). The Entity Manager remains open between the transactions -
>>>>>>> it is not even cleared. We would like to work with the entities
>>>>>>> without loading them from the database every time.
>>>>>>>
>>>>>>> We are using data binding between the domain objects and the UI - so
>>>>>>> clearing the entity manager makes the entities detached (and makes
>>>>>>> the GC run) - the whole app has to be refreshed then. This in turn
>>>>>>> is not what the end users want :-).
>>>>>>>
>>>>>>> I'm wondering whether it would be possible to hold the objects
>>>>>>> using weak references in the L1 cache / transactional cache. Then
>>>>>>> all unneeded objects would be automatically garbage collected.
>>>>>>> Object identity would still be provided with this setting... Now
>>>>>>> the memory consumption increases - until Out Of Memory occurs...
>>>>>>>
>>>>>>> any thoughts?
>>>>>>>
>>>>>>> thank you in advance,
>>>>>>>
>>>>>>>
>>>>>>> regards,
>>>>>>>
>>>>>>> adam
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Gordon Yorke wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hello Adam,
>>>>>>>> You may be confusing the Persistence Context with a cache?
>>>>>>>> Sometimes the UnitOfWork, which is the implementation of the
>>>>>>>> Persistence Context in TopLink, is described as a cache to
>>>>>>>> explain some of its behaviours, but its behaviour is beyond that
>>>>>>>> of a cache. Because the UnitOfWork must track all objects that
>>>>>>>> were ever read through or registered by the UnitOfWork to fulfill
>>>>>>>> its behavioural requirements, the TopLink cache settings do not
>>>>>>>> apply.
>>>>>>>> --Gordon
>>>>>>>>
>>>>>>>>
>>>>>>>> Adam Bien wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> TopLink cache consists of two parts:
>>>>>>>>> 1. Client/L1
>>>>>>>>> 2. Shared/L2
>>>>>>>>>
>>>>>>>>> Is it possible to configure them independently? In the TopLink
>>>>>>>>> reference there are two sections:
>>>>>>>>>
>>>>>>>>> http://www.oracle.com/technology/products/ias/toplink/jpa/resources/toplink-jpa-extensions.html#BABGDJBC
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It seems like the @Cache annotation addresses the Client/L1 cache,
>>>>>>>>> and the properties (e.g.
>>>>>>>>> <property name="toplink.cache.type.Order" value="Full"/>)
>>>>>>>>> address the L2.
>>>>>>>>>
>>>>>>>>> Is it possible to configure both caches using the persistence.xml
>>>>>>>>> properties?
>>>>>>>>>
>>>>>>>>> I would like to configure Weak Caches for both levels; however, it
>>>>>>>>> seems like the L1 cache is still "Full" (or it uses "hard"
>>>>>>>>> references to entities).
>>>>>>>>>
>>>>>>>>> thank you in advance!,
>>>>>>>>>
>>>>>>>>> adam bien
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>> --
>>>>>>> Consultant, Author, Java Champion
>>>>>>>
>>>>>>> Homepage: www.adam-bien.com
>>>>>>> Weblog: blog.adam-bien.com
>>>>>>> eMail: abien_at_adam-bien.com
>>>>>>> Mobile: 0049(0)170 280 3144
>>>>>>>
>>>>>>> Books: Enterprise Architekturen (ISBN: 393504299X),
>>>>>>> Java EE 5 Architekturen (ISBN: 3939084247),
>>>>>>> J2EE Patterns, J2EE Hotspots, Enterprise Frameworks and
>>>>>>> Struts
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
