users@jpa-spec.java.net

[jpa-spec users] Re: JPA_SPEC-99: Add ability to stream the result of a query execution

From: Steve Ebersole <steve_at_hibernate.org>
Date: Fri, 05 May 2017 11:11:48 +0000

That is a good point about expectation. Users will absolutely assume that
the stream is doing "some magic" whereas it may or may not per-provider. I
think a note in the spec and/or javadoc should suffice. That or require
"scrollable results" behind the scenes :P. Do you think that is enough?

Relatedly, IIRC there was a Jira asking for addition of "scrollable
results". Is that on the table for this MR?

As to the method name, I do see both sides as well - I'd prefer the
consistent name (#stream), but I'd be fine with either. I happened to
choose #stream for Hibernate after getting community feedback, if that
helps.

On Fri, May 5, 2017, 1:59 AM Oliver Gierke <ogierke_at_pivotal.io> wrote:

> Hi all,
>
> I like Lukas suggestion as it falls back to standard list based access but
> allows providers to gradually add optimized, really streaming
> implementations (Hibernate already having an implementation in place). So I
> think it's a nice enablement feature, that prevents users from having to
> wait for quite some time to actually get that. Plus, it's very much in line
> with the overall Java 8 theme of the umbrella.
>
> In Spring Data we've already had support for Streams for quite a while.
> Starting with our own clunky version of a CloseableIterator over provider
> specific APIs, we're now looking into gradually removing that as we now
> upgrade our Hibernate baseline to a version that already includes the calls
> on native APIs. Getting such API on JPA types would allow us to remove *a
> lot* of custom code.
>
> Two concerns I'd like to raise though, effectively feedback we got from
> exposing those APIs:
>
> - When people call a method returning a Stream they seem to expect a lot
> of magic happening, i.e. a real stream of result elements. If the default
> implementation then just reads everything at once and basically calls
> List.stream(), some users might feel tricked. What I am trying to say is:
> the API promises a Stream. In the context of databases users then come with
> certain expectations and might be disappointed to hear they're not met by
> default. However, I totally see the appeal of being able to write code that
> uses streams and then gradually get better performance when a provider adds
> that feature.
>
> - It took us quite a bit of time and resources to educate users about the
> specialties that particular Stream instance needs to be handled with:
> required usage in try-with-resources, the need to keep the underlying
> EntityManager and DataSource connection open *while the stream is
> consumed*. Especially the latter is not a too obvious problem superficially
> and we ended up even adding some strict guards [0] into our execution code
> that interacts with Spring's TX infrastructure. I guess something like that
> is not an option in JavaEE.
>
> - As to stream() VS. getResultStream() — I am torn on this. I definitely
> see Steve's point here but one could also argue that JPA developers
> familiar with getResultList() would at least consider getResultStream()
> consistent within that API.
>
> Cheers,
> Ollie
>
> [0] https://jira.spring.io/browse/DATAJPA-1023
>
> > Am 05.05.2017 um 07:57 schrieb Guillermo González de Agüero <
> z06.guillermo_at_gmail.com>:
> >
> > Hi,
> >
> > Linda: in Hibernate it acts as a ScrollableResult that fetches data
> "on-demand". I imagine, in practice, every implementation will use
> something like that to provide performance benefits. This is a very
> valuable addition.
> >
> > Lukas: I think that JavaDoc could add an example of the possible things
> a vendor may do, like "persistence provider may choose to override this
> method to provide additional capabilities, *like cursor-like fetching*".
> That would make the purpose clearer.
> >
> >
> > Regards,
> >
> > Guillermo González de Agüero.
> >
> > On Thu, May 4, 2017 at 11:35 PM, Linda DeMichiel <
> linda.demichiel_at_oracle.com> wrote:
> > Hi,
> >
> > If the value is mostly in the provider-specific add-ons, then I
> > think it would be better to wait and standardize in a later
> > release.
> >
> > What did you have in mind here?
> >
> > -Linda
> >
> >
> > On 5/4/17, 1:01 PM, Lukas Jungmann wrote:
> > Hi,
> >
> > another change I'd like to propose is an addition of
> > getResultStream:Stream method to javax.persistence.Query/TypedQuery
> > interfaces with following javadoc and provided with default
> implementation:
> >
> > javax.persistence.Query:
> >
> > / /**//
> > // * Execute a SELECT query and return the query results//
> > // * as an untyped <code>java.util.Stream</code>.//
> > // * By default this method delegates to
> > <code>getResultList().stream()</code>,//
> > // * however persistence provider may choose to override this
> method//
> > // * to provide additional capabilities.//
> > // *//
> > // * @return a stream of the results//
> > // * @throws IllegalStateException if called for a Java//
> > // * Persistence query language UPDATE or DELETE statement//
> > // * @throws QueryTimeoutException if the query execution exceeds//
> > // * the query timeout value set and only the statement is//
> > // * rolled back//
> > // * @throws TransactionRequiredException if a lock mode other than//
> > // * <code>NONE</code> has been set and there is no
> > transaction//
> > // * or the persistence context has not been joined to the
> > transaction//
> > // * @throws PessimisticLockException if pessimistic locking//
> > // * fails and the transaction is rolled back//
> > // * @throws LockTimeoutException if pessimistic locking//
> > // * fails and only the statement is rolled back//
> > // * @throws PersistenceException if the query execution exceeds //
> > // * the query timeout value set and the transaction //
> > // * is rolled back //
> > // * @see Stream//
> > // * @see #getResultList()//
> > // * @since 2.2//
> > // *///
> > // default Stream getResultStream() {//
> > // return getResultList().stream();//
> > // }/
> >
> > similarly to javax.persistence.TypedQuery:
> >
> > / /**//
> > // * Execute a SELECT query and return the query results//
> > // * as a typed <code>java.util.Stream</code>.//
> > // * By default this method delegates to
> > <code>getResultList().stream()</code>,//
> > // * however persistence provider may choose to override this
> method//
> > // * to provide additional capabilities.//
> > // *//
> > // * @return a stream of the results//
> > // * @throws IllegalStateException if called for a Java//
> > // * Persistence query language UPDATE or DELETE statement//
> > // * @throws QueryTimeoutException if the query execution exceeds//
> > // * the query timeout value set and only the statement is//
> > // * rolled back//
> > // * @throws TransactionRequiredException if a lock mode other than//
> > // * <code>NONE</code> has been set and there is no
> > transaction//
> > // * or the persistence context has not been joined to the
> > transaction//
> > // * @throws PessimisticLockException if pessimistic locking//
> > // * fails and the transaction is rolled back//
> > // * @throws LockTimeoutException if pessimistic locking//
> > // * fails and only the statement is rolled back//
> > // * @throws PersistenceException if the query execution exceeds //
> > // * the query timeout value set and the transaction //
> > // * is rolled back //
> > // * @see Stream//
> > // * @see #getResultList()//
> > // * @since 2.2//
> > // *///
> > // default Stream<X> getResultStream() {//
> > // return getResultList().stream();//
> > // }/
> >
> > Do you think this is OK or does it give providers to big freedom?
> >
> > Thank you,
> > --lukas
> >
> >
>
>