Well, I won't comment on how horrible the whole statement is, because I'm confident that it's an example to prove a point rather than an idiom you're practicing in code. For large result sets like that, you should be selecting as much as you can in a single network hit, through one query or through joins, rather than making a gazillion network hits. Common sense.
For example:
[code]
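// One query fetches every Loan, instead of one lookup per id.
// TypedQuery needs JPA 2.0+; on JPA 1.0, use Query and cast the raw List.
TypedQuery<Loan> q = em.createQuery("SELECT l FROM Loan l", Loan.class);
for (Loan l : q.getResultList()) {
    // ... work with each loan ...
}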
[/code]
will run in a heartbeat compared to what you posted.
But, basically, em.getReference is only a little different from em.find. Both look an entity up by its primary key; they differ in what they hand back. find returns the loaded entity, or null if there is no such row. getReference is allowed to return a lazily fetched proxy, and it throws EntityNotFoundException for a missing item, either right away or when you first touch the object's state, depending on the provider.
em.find() will check the first level cache of the persistence context, and if it cannot find the object there, it will check the second level cache, if one is enabled, which is maintained by the persistence provider. If it cannot find it there either, it will issue a call to the DB, as there is nothing else it can do. I mean, seriously, what else would you expect it to do?
If the object with that primary key is in either the level 1 or the level 2 cache, find will NOT hit the DB.
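To make that lookup order concrete, here is a toy model in plain Java. It is not JPA and not how any provider is actually implemented: the class TwoLevelLookup, its maps, and the dbHits counter are all names I made up for illustration. It only shows the order of the checks (level 1, then level 2, then the DB) and why a second lookup is cheap.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup order described above. Not a JPA implementation;
// every name here is hypothetical and exists only for illustration.
public class TwoLevelLookup {
    private final Map<Long, String> level1 = new HashMap<>(); // per persistence context
    private final Map<Long, String> level2;                   // shared across contexts
    private final Map<Long, String> database;                 // stands in for the DB
    int dbHits = 0;                                           // simulated DB round trips

    TwoLevelLookup(Map<Long, String> level2, Map<Long, String> database) {
        this.level2 = level2;
        this.database = database;
    }

    // Mirrors find(): returns null when no row exists for the id.
    String find(Long id) {
        String value = level1.get(id);
        if (value != null) {
            return value;                 // level 1 hit, no DB call
        }
        value = level2.get(id);
        if (value == null) {
            dbHits++;                     // both caches missed: go to the DB
            value = database.get(id);
            if (value == null) {
                return null;              // no such row
            }
            level2.put(id, value);        // warm the shared cache
        }
        level1.put(id, value);            // warm this context's cache
        return value;
    }

    public static void main(String[] args) {
        Map<Long, String> db = new HashMap<>();
        db.put(1L, "loan-1");
        Map<Long, String> shared = new HashMap<>();

        TwoLevelLookup first = new TwoLevelLookup(shared, db);
        first.find(1L);   // cold: goes to the DB
        first.find(1L);   // warm: served from level 1
        System.out.println("first context DB hits: " + first.dbHits);

        TwoLevelLookup second = new TwoLevelLookup(shared, db);
        second.find(1L);  // new context, but level 2 already holds the row
        System.out.println("second context DB hits: " + second.dbHits);
    }
}
```

In this sketch the first context pays for one DB hit and the second context pays for none, because the shared map already holds the row. A real second level cache also has to deal with eviction, invalidation, and configuration, none of which is modeled here.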
So, when you run your query, if all of those items fit in memory, the first run will suck (literally sucking the data into the cache). But the second time, it will scream, as everything is cached.
Anyway, bottom line, the EM is doing everything properly. It's a feature, not a bug. I consider the example you posted bad form for any data set of reasonable size.
[Message sent by forum member 'whartung' (whartung)]
http://forums.java.net/jive/thread.jspa?messageID=223614