users@javaee-spec.java.net

[javaee-spec users] Re: JPA 2.1: Enhance per-query and per-property control over fetch eagerness, fetch mode, fetch groups

From: Linda DeMichiel <linda.demichiel_at_oracle.com>
Date: Mon, 25 Jun 2012 20:18:44 -0700

Hi Craig,

On 6/25/2012 7:45 PM, Craig Ringer wrote:
> To the JPA spec team and the broader EE working group:
>

Well, I'm at the intersection of the two :-)

> I've been seeing increasing evidence on user-facing forums and mailing lists that control over fetching via JPA is a
> real challenge for developers. It's certainly been a huge one for me. I'm interested in whether this can be improved for
> JPA 2.1 and Java EE 7, as in my view the fetch issues are a big pain point.
>
> I'm writing to raise this with the JPA 2.1 spec team, as I don't see any enhancements regarding fetch strategies and
> modes in the latest draft and didn't spot discussion of it on the list. I'd like to strike up a discussion about what,
> if anything, can/should be done about this for Java EE 7.
>

You are right -- we haven't gotten to it yet. However, it is on the agenda for the JPA 2.1 JSR and
is one of the items flagged in the proposal itself. I have been planning to put it next in the queue
after we finish with the area of multitenancy and schema generation. (There has been somewhat of a
hiatus in the latter discussion since I have been out of the country on vacation for a couple of weeks,
but it should resume shortly.)

I hope you will continue to chime in on the JPA list, and I would invite anyone else who is reading this to please
feel free to contribute on users_at_jpa-spec.java.net as well.

thanks,

-Linda


>
> What do apps need to do?
>
> In JPA 2.0 there's solid control over lazy vs eager fetching on an entity/property/relationship basis using the usual
> @...ToOne / @...ToMany (fetch=FetchType.[LAZY|EAGER]) annotations and the orm.xml equivalents. This works well, but is
> too simple for many projects.
>
> From my admittedly rather limited experience, and from discussions I've seen, it seems common to have widely referenced
> entities that you don't want to eagerly load the relationships of most of the time, but *need* them loaded in some
> situations. Commonly this is because you'll be using them detached from an entity manager context and know you'll need
> access to normally lazily loaded properties. Sometimes it's a performance issue where you can't afford the expense of
> lots of little database hits as proxied lazily loaded properties are loaded.
>
>
> What's currently possible?
>
> Right now, my understanding - and I don't claim it's a great one - is that there is exactly one option to override
> normally lazy fetching with standard JPA: use a left join fetch, either in JPQL or via the Criteria API. That's OK much
> of the time.
>
>
> What's wrong with the current situation?
>
> Being limited to a "left join fetch" can also be really problematic:
>
> * There's no way to ask the provider to use a different fetching strategy, like a follow-up batched SELECT, or use
> subselect fetching.
>
> * A left join fetch is fine when you're eagerly fetching one or two lazily fetched entity relationships. It scales
> extremely poorly if you have several things to fetch and/or more than one level, e.g. "a.b.c".
>
> * Apps sometimes need to do extra JPQL / criteria queries and repeat work in order to load required entities into
> the persistence context without expensive multiple joins.
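To make the scaling point in the list above concrete, here is a rough row-count sketch in Java. It assumes the
fetched relationships are sibling collections of the same parent entity, so a single joined statement returns one
row per combination of child rows. The class and method names are hypothetical and exist only for illustration;
nothing here is JPA API.

```java
import java.util.List;

// Hypothetical helper, not part of JPA: estimates result-set sizes for two fetch approaches.
final class FetchRowCount {

    // Rows returned by ONE statement that left-joins every sibling collection:
    // each parent row is repeated once per combination of child rows.
    static long joinFetchRows(long parents, List<Integer> childrenPerParent) {
        long perParent = 1;
        for (int n : childrenPerParent) {
            // An outer join still yields one row when a collection is empty.
            perParent *= Math.max(n, 1);
        }
        return parents * perParent;
    }

    // Rows returned in total when each collection is loaded by its own follow-up SELECT.
    static long separateSelectRows(long parents, List<Integer> childrenPerParent) {
        long total = parents; // the initial query for the parents themselves
        for (int n : childrenPerParent) {
            total += parents * n;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> sizes = List.of(10, 10, 10); // three collections, ten children each
        System.out.println("single joined query:   " + joinFetchRows(100, sizes));
        System.out.println("one SELECT per collection: " + separateSelectRows(100, sizes));
    }
}
```

With 100 parents and three collections of ten children each, the joined form multiplies the collection sizes while
the per-collection form only adds them, which is why a batched or subselect strategy can be much cheaper.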
>
> The key problem in my view is that the JPA API doesn't give the user any way to ask for normally-lazy relationships to
> be eagerly fetched without also forcing them to be fetched /in a single SQL query/. That can be really sub-optimal, and
> it conflates joins (a matter of query logic) with fetching (a matter of what's retrieved). You can't say "fetch x.y in
> whatever way is optimal".
>
> I've seen numerous recommendations, especially on the Vaadin lists and around Swing apps, to use EclipseLink and allow
> it to lazily load properties of detached entities using proxies. This is a /nasty/ thing for people to be relying on, as
> (a) each load is a query, so it's the ultimate in n+1 or worse with nested properties; and (b) those later loads are
> generally in new transactions, breaking the DB's consistency guarantees in ways optimistic locking often can't help
> with. That people are having to rely on this is IMO of concern.
>
> It doesn't help that the Root<T>.fetch(...) API is difficult to use correctly and has been acknowledged to be poorly
> specified. It's easy to land up doing a second unnecessary join, or to get a "query specified join fetching, but the
> owner of the fetched association was not present in the select list" error. This article used to talk about it:
>
> http://blogs.sun.com/ldemichiel/entry/jpa_next_thinking_about_the#comment-1291653518000
>
> but has since been devoured by the Oracle transition.
>
>
> What can be done via implementation-specific extensions?
>
> Some JPA implementations offer fetch controls via extensions, but there's nothing consistently available.
>
> EclipseLink gives quite good fetching control via JPA query hints, allowing default fetch modes to be overridden on a
> per-property basis and allowing the specification of alternative fetch strategies. It also supports lazy loading of
> properties in detached entities, which has several problems as mentioned above.
>
> Hibernate, as far as I've been able to determine, doesn't expose anything equivalent at the JPA level. It has
> setFetchMode(...) in its own Criteria API, but as far as I've been able to find out it doesn't expose that to JPA via
> hints or other mechanisms. I'm frequently told that Hibernate is best suited for short-transaction stateless
> applications because it doesn't lazy load on detached entities - presumably because it's too hard to specify what you
> want eagerly loaded.
>
> I'm not sufficiently familiar with other implementations to say what they offer.
>
>
> What's needed?
>
> In my view, the key thing that JPA needs to do is provide join mode and strategy controls at a per-query,
> per-relationship level without requiring a left join fetch. I'd be interested in what your thoughts are.
>
>
> Per-query, per-property overrides for eager vs lazy fetching
>
> Clients need to be able to specify to the ORM that a given property should be eagerly or lazily fetched in a particular
> query. An API that avoids the need for providers to have to parse free-form properties (and is thus more checkable)
> would be good, so adding something like:
>
> CriteriaQuery.setFetchMode(String propertyName, FetchType fetchType)
>
> would seem ideal to me, where "propertyName" can be a dot-path to sub-properties, or of course a metamodel object/path.
>
> Different fetch strategies are supported by different implementations, and I don't think the JPA spec can really specify
> a complete set of possible strategies, so the fetch mode type should probably be a simple EAGER | LAZY enum, handily
> already provided by javax.persistence.FetchType. The implementer should be free to choose the most appropriate fetch
> method, so long as properties marked EAGER are in fact attached to the persistence context when the query completes.
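A minimal sketch of how the proposed per-query override might behave, written as plain Java so it compiles without a
JPA jar. Everything here is hypothetical: FetchOverrides is an invented holder class, the local FetchType enum is a
stand-in for javax.persistence.FetchType, and the rule that an EAGER override on "a.b.c" also makes "a" and "a.b"
eager is an assumption of this sketch, not something the proposal states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for javax.persistence.FetchType so the sketch is self-contained.
enum FetchType { LAZY, EAGER }

// Hypothetical holder for per-query fetch overrides; not part of any JPA release.
final class FetchOverrides {
    private final Map<String, FetchType> byPath = new LinkedHashMap<>();

    // Mirrors the shape of the proposed CriteriaQuery.setFetchMode(String, FetchType).
    FetchOverrides setFetchMode(String propertyName, FetchType fetchType) {
        if (propertyName == null || propertyName.isEmpty() || fetchType == null) {
            throw new IllegalArgumentException("property path and fetch type are required");
        }
        byPath.put(propertyName, fetchType);
        return this;
    }

    // Effective fetch type for a dot-path: an explicit override wins; otherwise the path
    // is EAGER if some deeper EAGER override passes through it (assumption of this sketch);
    // otherwise the default from the entity mapping applies.
    FetchType effective(String propertyName, FetchType mappingDefault) {
        FetchType explicit = byPath.get(propertyName);
        if (explicit != null) {
            return explicit;
        }
        String prefix = propertyName + ".";
        for (Map.Entry<String, FetchType> e : byPath.entrySet()) {
            if (e.getValue() == FetchType.EAGER && e.getKey().startsWith(prefix)) {
                return FetchType.EAGER;
            }
        }
        return mappingDefault;
    }
}
```

For example, after setFetchMode("customer.orders.items", FetchType.EAGER), asking for "customer.orders" with a LAZY
mapping default yields EAGER, while an unrelated path such as "customer.address" keeps its mapping default.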
>
>
> Per-query, per-property control over fetch strategies
>
> IMO if explicit specification of fetch strategy is provided through the JPA API (which would be nice) it should be by
> string names for strategies, or at least allow them. There's no predicting what fetch strategies will be possible. For
> example, with PostgreSQL's new JSON data type support it's possible to do an eager fetch of a relationship using a join
> or subquery with query_to_json, using array_agg and array_to_json, or using record_to_json. The ORM no longer needs to
> de-duplicate a cross product. Standardizing this would be nuts, but a way to ask an ORM that's aware of it to use it
> makes sense. I'd like to see something like:
>
> CriteriaQuery.setFetchStrategy(String propertyName, String strategy)
>
> ... and maybe ...
>
> CriteriaQuery.setFetchStrategy(String propertyName, FetchStrategy strategy)
>
> ... with FetchStrategy being an enum { JOIN, SELECT, SUBSELECT, ANY }, as those are the widely recognised strategies
> plus one that lets the implementation choose (default for FetchType.EAGER).
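One way the string and enum variants could coexist is sketched below in plain Java. The FetchStrategy enum uses the
four constants named above; the FetchStrategyNames helper, its method names, and the choice to fall back to ANY for
a name the provider does not recognise are all assumptions made for illustration, not part of the proposal or of
any JPA implementation.

```java
import java.util.Locale;
import java.util.Optional;

// The enum as proposed in the message: three common strategies plus "provider decides".
enum FetchStrategy { JOIN, SELECT, SUBSELECT, ANY }

// Hypothetical helper showing how free-form strategy names could map onto the enum.
final class FetchStrategyNames {

    // Returns the standard constant for a name, or empty if the name is provider-specific.
    static Optional<FetchStrategy> standard(String name) {
        try {
            return Optional.of(FetchStrategy.valueOf(name.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // not one of JOIN, SELECT, SUBSELECT, ANY
        }
    }

    // Assumed fallback: a provider that does not understand the name chooses for itself.
    static FetchStrategy resolveOrAny(String name) {
        return standard(name).orElse(FetchStrategy.ANY);
    }
}
```

Under this sketch a provider-specific name (say, a hypothetical "pg-json-agg") is simply passed through as a string,
and a provider that does not know it treats the request as ANY rather than failing.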
>
>
> Fetch groups?
>
> It may also be worth thinking about another often-sought-after facility, fetch groups, but IMO control over fetch mode
> and strategy on a per-query, per-relationship level is much more important.
>
>
> BTW, I wrote a bit about this earlier here:
> http://blog.ringerc.id.au/2012/06/jpa2-is-very-inflexible-with-eagerlazy.html
>
> --
> Craig Ringer
>
> POST Newspapers 276 Onslow Rd, Shenton Park Ph: 08 9381 3088 Fax: 08 9388 2258 ABN: 50 008 917 717
> http://www.postnewspapers.com.au/