persistence@glassfish.java.net

Re: Some questions about HardWeak/SoftWeak cache

From: Gordon Yorke <gordon.yorke_at_oracle.com>
Date: Wed, 21 May 2008 08:58:27 -0400

Hello,
   I have provided some answers in-line.
--Gordon

jasonw41 wrote:
> I know the cache topic has been discussed a lot here, but after reading the
> related messages and Wonseok's blog (thanks a lot), I still have some
> questions about how HardWeak/SoftWeak caches work, so here is my story.
>
> I understand that when I call em.persist(data) the data will be created and
> held by the persistence context until the em closes. The question is: when
> does it go to the shared cache? Does TopLink put the data into the shared
> cache right after it is persisted?
>
TopLink merges the changes from the Persistence Context into the shared
cache once the transaction has committed successfully.
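
To make the ordering concrete, here is a minimal sketch in plain Java. It is a hypothetical model, not TopLink code: the class CommitMergeSketch and its two maps are invented stand-ins for a persistence context and the shared cache, and the only thing it shows is that a persisted object becomes visible in the shared cache at commit time rather than at persist time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not TopLink code: it shows only the ordering of events.
class CommitMergeSketch {
    // Stands in for the shared (session-level) cache.
    static final Map<Integer, String> sharedCache = new HashMap<>();

    // Stands in for one persistence context (one EntityManager).
    final Map<Integer, String> persistenceContext = new HashMap<>();

    void persist(int id, String data) {
        // The object is managed locally; the shared cache is untouched.
        persistenceContext.put(id, data);
    }

    void commit() {
        // Only after a successful commit are the changes merged.
        sharedCache.putAll(persistenceContext);
    }

    public static void main(String[] args) {
        CommitMergeSketch em = new CommitMergeSketch();
        em.persist(1, "data");
        System.out.println(sharedCache.containsKey(1)); // false
        em.commit();
        System.out.println(sharedCache.containsKey(1)); // true
    }
}
```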
> From the TopLink Essentials JPA Extensions Reference I know that HardWeak
> contains two caches: a LinkedList and a HashMap. I was wondering, when the
> LinkedList is full, how does TopLink use the two caches? Does it just add the
> newly cached data to the LinkedList and push the top one out into the HashMap?
> Or is there an algorithm that decides whether data goes to the LinkedList or
> the HashMap?
>
Actually, the LinkedList and the HashMap should not be thought of as
separate caches. The HardWeak cache maintains a fixed number of hard
references, guaranteeing that you will always have at least a configured
number of LRU entities in the cache. The HardWeak cache may also contain
any number of Entities that are referenced by the application, or that
have been referenced and have not yet been garbage collected.
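
The idea can be sketched in plain Java as one weak map plus a bounded list of hard references. This is an illustration only and not TopLink's actual implementation: the class HardWeakCacheSketch, its method names, and the choice of ArrayDeque are all assumptions made for the example.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not TopLink's actual implementation.
class HardWeakCacheSketch<K, V> {
    private final int hardSize;
    // Every cached object is reachable from here, but only weakly.
    private final Map<K, WeakReference<V>> weakMap = new HashMap<>();
    // The most recently used objects are additionally held strongly here.
    private final Deque<V> hardList = new ArrayDeque<>();

    HardWeakCacheSketch(int hardSize) {
        this.hardSize = hardSize;
    }

    void put(K key, V value) {
        weakMap.put(key, new WeakReference<>(value));
        touch(value);
    }

    V get(K key) {
        WeakReference<V> ref = weakMap.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            weakMap.remove(key); // never cached, or already collected
            return null;
        }
        touch(value);
        return value;
    }

    // Move the object to the front of the hard list and drop the oldest
    // hard reference once the list exceeds its configured size.
    // Note: remove() compares with equals(), which is good enough for a sketch.
    private void touch(V value) {
        hardList.remove(value);
        hardList.addFirst(value);
        if (hardList.size() > hardSize) {
            hardList.removeLast();
        }
    }

    int hardReferenceCount() {
        return hardList.size();
    }

    public static void main(String[] args) {
        HardWeakCacheSketch<Integer, String> cache = new HardWeakCacheSketch<>(50);
        for (int i = 0; i < 100; i++) {
            cache.put(i, "entity-" + i);
        }
        System.out.println(cache.hardReferenceCount()); // 50
        System.out.println(cache.get(99) != null);      // true, still hard-referenced
    }
}
```

In this sketch the 50 most recently used objects can never be collected, while the other 50 stay in the map only for as long as something else references them or the garbage collector has not yet run.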
> Another question is about the HardWeak cache. The TopLink Reference says the
> LinkedList holds hard references to cached data. Does that mean those cached
> objects will never be gc-ed, even when the system is short of memory, until
> TopLink cleans them up after every 100th access?
>
Yes, TopLink will hold a configured number of Entities and will not
allow them to be garbage collected.
> The next question is how I can tell how many data have been cached in
> LinkedList or HashMap? I tried the following API:
> ((oracle.toplink.essentials.ejb.cmp3.EntityManager)em).getServerSession().getIdentityMapAccessor().printIdentityMaps()
> and it does give me the data list, but I am not sure if it is the proper way.
> By the way, the above API works for Java SE. Could anybody tell me how to use
> it in an EJB when the em is obtained by annotation?
>
This really depends on what you mean by "tell how many data". If you
want to see the contents printed out using the toString() method of the
Entities, then yes, this is the correct way. If you are simply looking
for a number, then getting the IdentityMap from the IdentityMapAccessor
for a particular class and calling size on it will give you an
approximate number of objects cached. I say approximate because with the
Weak caches any unreferenced object could be garbage collected at any
point.
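
The reason such a count can only be approximate is a general property of weak references and can be shown without TopLink. The sketch below is a hypothetical model: WeakCountSketch and its methods are invented for the example, and simulateCollection uses WeakReference.clear() as a deterministic stand-in for the garbage collector reclaiming an object.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a weak cache; not the TopLink IdentityMap.
class WeakCountSketch {
    final Map<Integer, WeakReference<Object>> map = new HashMap<>();

    void put(int id, Object entity) {
        map.put(id, new WeakReference<>(entity));
    }

    // Number of map entries, including ones whose object is already gone.
    int entryCount() {
        return map.size();
    }

    // Number of entries whose object is still reachable right now.
    int liveCount() {
        int live = 0;
        for (WeakReference<Object> ref : map.values()) {
            if (ref.get() != null) {
                live++;
            }
        }
        return live;
    }

    // Stand-in for the garbage collector reclaiming one object.
    void simulateCollection(int id) {
        WeakReference<Object> ref = map.get(id);
        if (ref != null) {
            ref.clear();
        }
    }

    public static void main(String[] args) {
        WeakCountSketch cache = new WeakCountSketch();
        // String literals stay strongly reachable, so only the
        // simulated collection below changes the live count.
        cache.put(1, "a");
        cache.put(2, "b");
        cache.put(3, "c");
        cache.simulateCollection(2);
        System.out.println(cache.entryCount()); // 3
        System.out.println(cache.liveCount());  // 2
    }
}
```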
> I tried another way to count the HardWeak cache; the sample is adapted
> from Wonseok's blog.
> - 0 - Set cache type HardWeak and size 50
> - 1 - In the first transaction, persist 100 data and close the em
> - 2 - Update all 100 data in the database by JDBC
> - 3 - Open the second transaction and call em.find to retrieve all 100 data.
> Count all unchanged data, which should be from the cache.
> But I am not sure if this is a valid test either, because of my first
> question: if TopLink puts newly cached data into the LinkedList and the old
> data goes to the HashMap, where it could easily get gc-ed, then I will not get
> the correct number.
>
This test will show you that 50 of the items will not have the updates
that were performed through JDBC.
> Cheers,
> Jason
>