users@glassfish.java.net

Re: FetchType.EAGER behaviour

From: <glassfish_at_javadesktop.org>
Date: Fri, 13 Jul 2007 04:54:57 PDT

The BIG problem is that the cache must get its objects from somewhere too, and loading them with one select per object is rather pointless. Remember, fetching many objects at once does NOT bypass the cache; instead, it improves the performance of loading the objects into it.

Just to clarify with a small example:

em.createQuery("SELECT level1 FROM Level1 AS level1").getResultList();
will result in the following TopLink output
[TopLink Fine]: 2007.07.13 01:14:04.782--ServerSession(24147539)--Connection(29931163)--Thread(Thread[http-8080-Processor24,5,main])--SELECT ID, NAME, LASTMODIFIEDDATE, LASTMODIFIEDBY_ID, CCREVISION_ID, REVISION, REVISIONDATE FROM LEVEL1
which is fine. Now we have every single Level1 object in the cache.

em.createQuery("SELECT level2 FROM Level2 AS level2").getResultList();
will result in the following TopLink output
[TopLink Fine]: 2007.07.13 01:14:07.586--ServerSession(24147539)--Connection(1692531)--Thread(Thread[http-8080-Processor24,5,main])--SELECT ID, NAME, LASTMODIFIEDDATE, LASTMODIFIEDBY_ID, LEVEL1_ID, REVISION, REVISIONDATE FROM LEVEL2
which is fine. Now we have every single Level2 object in the cache.

Level2 level2 = em.find(Level2.class, new Long(123));
will result in TopLink retrieving the Level2 object from the cache.

Level1 level1 = level2.getParent();
will result in TopLink retrieving the Level1 object from the cache.

Collection<Level2> level2s = level1.getChildren();
level2s.size();
will result in
[TopLink Fine]: 2007.07.13 01:14:31.468--ServerSession(24147539)--Connection(561973)--Thread(Thread[http-8080-Processor24,5,main])--SELECT ID, NAME, LASTMODIFIEDDATE, LASTMODIFIEDBY_ID, LEVEL1_ID, REVISION, REVISIONDATE FROM LEVEL2 WHERE (LEVEL1_ID = ?) bind => [123]
and this is the big problem. For each list of Level2 children another select is executed, and further down, at each deeper level, another n selects are executed.
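
To make the difference concrete, here is a minimal plain-Java sketch that only counts simulated selects. It does not use TopLink or any JPA API: the class ChildLookup, the Row type and both methods are hypothetical stand-ins, and the "table" is just an in-memory list. It compares one select per parent (the WHERE LEVEL1_ID = ? pattern in the log above) with a single select whose rows are grouped by parent in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChildLookup {
    // Hypothetical stand-in for one LEVEL2 row; only the two columns needed here.
    static final class Row {
        final long id;
        final long level1Id;
        Row(long id, long level1Id) { this.id = id; this.level1Id = level1Id; }
    }

    private final List<Row> level2Table = new ArrayList<>();
    int selects = 0; // number of simulated SELECT statements issued so far

    // Fills the fake table: parents are numbered 1..parents, each with the same number of children.
    ChildLookup(int parents, int childrenPerParent) {
        long id = 1;
        for (long p = 1; p <= parents; p++) {
            for (int c = 0; c < childrenPerParent; c++) {
                level2Table.add(new Row(id++, p));
            }
        }
    }

    // Simulates: SELECT ... FROM LEVEL2 WHERE (LEVEL1_ID = ?)
    List<Row> childrenOf(long level1Id) {
        selects++;
        List<Row> result = new ArrayList<>();
        for (Row r : level2Table) {
            if (r.level1Id == level1Id) result.add(r);
        }
        return result;
    }

    // Simulates one SELECT ... FROM LEVEL2, then groups the rows by parent in memory.
    Map<Long, List<Row>> allChildrenByParent() {
        selects++;
        Map<Long, List<Row>> byParent = new HashMap<>();
        for (Row r : level2Table) {
            byParent.computeIfAbsent(r.level1Id, k -> new ArrayList<>()).add(r);
        }
        return byParent;
    }

    public static void main(String[] args) {
        ChildLookup perParent = new ChildLookup(3, 4);
        for (long p = 1; p <= 3; p++) perParent.childrenOf(p);
        System.out.println("per-parent selects: " + perParent.selects); // 3

        ChildLookup bulk = new ChildLookup(3, 4);
        bulk.allChildrenByParent();
        System.out.println("bulk selects: " + bulk.selects); // 1
    }
}
```

Both approaches end up with the same children per parent; only the number of round trips differs, and that is the cost being complained about here.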

This results in 1+n selects per extra level, which for a three-level tree is 1+n(1+m).
On a binary tree (2 nodes per level) we get 1+2(1+2(1+2)) = 15 selects for a three-level tree, and we can see that for a k-level tree the count
grows roughly as n^k selects, which is far beyond acceptable.
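
As a rough check of that arithmetic, the sketch below sums the selects level by level. It assumes a uniform fan-out of n children per node and one extra select per node on each of the given levels; the class and method names (SelectCount, selects) are made up for illustration.

```java
public class SelectCount {
    // Total selects when the initial query is followed by one select per node
    // on each of `levels` levels: 1 + n + n^2 + ... + n^levels.
    static long selects(long n, int levels) {
        long total = 1;     // the initial select
        long perLevel = 1;
        for (int i = 0; i < levels; i++) {
            perLevel *= n;  // n^(i+1) nodes, each issuing one select
            total += perLevel;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(selects(2, 3));  // 1+2(1+2(1+2)) = 15
        System.out.println(selects(30, 3)); // 1+30+900+27000 = 27931
    }
}
```

The last term dominates, which is why n^k is a reasonable shorthand for the total.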

In this specific case, with 3 levels and ~30 nodes per level, we get 30^3 = 27000 selects, which take approx. 5 minutes to execute versus 3 seconds for a single select with joins.

In other words, unless anyone can show how to do this in a different (correct?) way, any application with a tree structure should avoid TopLink Essentials and use another JPA implementation that supports this, e.g. Hibernate.
[Message sent by forum member 'danielcroth' (danielcroth)]

http://forums.java.net/jive/thread.jspa?messageID=226549