users@jpa-spec.java.net

[jpa-spec users] Re: Full text search

From: Christian Beikov <christian.beikov_at_gmail.com>
Date: Tue, 29 May 2012 16:00:29 +0200

Hibernate Search supports batching within transactions, so the behavior
described in its documentation could generally be reused. What kind of
limitations are you thinking of? If the index for the full-text search is
managed by the DBMS, there shouldn't be any problems, should there?
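
To make the batching idea concrete, here is a minimal sketch of index
updates that are queued during a transaction, written as one batch on
commit, and discarded on rollback. This is not Hibernate Search code: the
class TransactionalIndexBuffer and its methods are hypothetical names
chosen for illustration, and the index writer is just a callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical buffer that collects index updates during one transaction. */
final class TransactionalIndexBuffer {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> indexWriter;

    /** indexWriter stands in for whatever actually writes to the index. */
    TransactionalIndexBuffer(Consumer<List<String>> indexWriter) {
        this.indexWriter = indexWriter;
    }

    /** Queue an update; the index is not touched yet. */
    void enqueue(String documentId) {
        pending.add(documentId);
    }

    /** Called after the database transaction commits: write a single batch. */
    void afterCommit() {
        if (!pending.isEmpty()) {
            indexWriter.accept(new ArrayList<>(pending));
            pending.clear();
        }
    }

    /** Called after a rollback: drop the queued work without writing. */
    void afterRollback() {
        pending.clear();
    }

    /** Number of updates currently waiting for commit. */
    int pendingCount() {
        return pending.size();
    }
}
```

In a real provider the two callbacks would presumably be driven by the
transaction itself (for example through a JTA synchronization), which is
an assumption on my side and not shown here.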

IMO JPA should provide the annotations for full-text indexing and extend
EntityManager, Query, etc. to offer a convenient way to search for data.
The implementation of such a search provider should be pluggable: you
could, for example, configure your JPA provider (in a standardized way)
to use the DBMS, Lucene, or any other full-text search engine as the
provider.
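
As a rough sketch of what I mean by "pluggable", assuming nothing about
how the spec would actually name things: the annotation FullTextIndexed,
the interface FullTextSearchProvider, and the toy in-memory provider
below are all hypothetical and exist only to show the shape of the idea.
A DBMS-backed or Lucene-backed provider would implement the same
interface.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker for fields that take part in full-text search. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface FullTextIndexed {}

/** Hypothetical SPI that a search engine integration would implement. */
interface FullTextSearchProvider {
    void index(Object entity);
    <T> List<T> search(Class<T> type, String term);
}

/** Toy provider: case-insensitive substring match over annotated fields. */
final class InMemorySearchProvider implements FullTextSearchProvider {
    private final List<Object> entities = new ArrayList<>();

    @Override
    public void index(Object entity) {
        entities.add(entity);
    }

    @Override
    public <T> List<T> search(Class<T> type, String term) {
        List<T> hits = new ArrayList<>();
        String needle = term.toLowerCase();
        for (Object entity : entities) {
            if (type.isInstance(entity) && matches(entity, needle)) {
                hits.add(type.cast(entity));
            }
        }
        return hits;
    }

    /** True if any @FullTextIndexed field contains the search term. */
    private static boolean matches(Object entity, String needle) {
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (!field.isAnnotationPresent(FullTextIndexed.class)) {
                continue;
            }
            try {
                field.setAccessible(true);
                Object value = field.get(entity);
                if (value != null && value.toString().toLowerCase().contains(needle)) {
                    return true;
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return false;
    }
}

/** Example entity: only the title is searchable. */
final class Article {
    @FullTextIndexed
    String title;
    String internalNote;

    Article(String title, String internalNote) {
        this.title = title;
        this.internalNote = internalNote;
    }
}
```

The point of the indirection is that application code would only depend
on the annotation and the provider interface, so swapping the engine
would be a configuration matter.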

Kind regards,
------------------------------------------------------------------------
*Christian Beikov*

On 29.05.2012 14:22, Christian Romberg wrote:
> Hi Christian,
>
> No, it's not that simple.
>
> Updating full-text indexes takes an enormous amount of time compared
> to everything else that happens in a database commit operation.
>
> Thus, for most users, transactional full-text index updates are hardly
> usable.
>
> Batching such updates would be the way to go, however this imposes
> quite some limitations.
>
> It's totally fine for any JPA vendor to provide special extensions for
> special use cases, or use cases with serious restrictions.
>
> However, and this is just my personal opinion, the scope of a
> specification should not include such extensions; they make a
> specification brittle rather than sound.
>
> Regards,
>
> Christian
>
> On Tue, May 29, 2012 at 2:10 PM, Christian Beikov
> <christian.beikov_at_gmail.com> wrote:
>
> Hello!
>
> I have just read part of the Hibernate Search documentation and
> found that the index is written transactionally if a transaction
> exists. In other words, the implementation would have to participate
> in a JDBC or JTA transaction.
>
> Here is a short excerpt from the documentation (also see
> http://docs.jboss.org/hibernate/search/3.2/reference/en/html_single/#d0e488):
>
> To be more efficient, Hibernate Search batches the write
> interactions with the Lucene index. There is currently two
> types of batching depending on the expected scope. Outside a
> transaction, the index update operation is executed right
> after the actual database operation. This scope is really a no
> scoping setup and no batching is performed. However, it is
> recommended - for both your database and Hibernate Search - to
> execute your operation in a transaction be it JDBC or JTA.
> When in a transaction, the index update operation is scheduled
> for the transaction commit phase and discarded in case of
> transaction rollback. The batching scope is the transaction.
> There are two immediate benefits:
>
> * Performance: Lucene indexing works better when operation
>   are executed in batch.
>
> * ACIDity: The work executed has the same scoping as the one
>   executed by the database transaction and is executed if
>   and only if the transaction is committed. This is not ACID
>   in the strict sense of it, but ACID behavior is rarely
>   useful for full text search indexes since they can be
>   rebuilt from the source at any time.
>
> You can think of those two scopes (no scope vs transactional)
> as the equivalent of the (infamous) autocommit vs
> transactional behavior. From a performance perspective, the
> /in transaction/ mode is recommended. The scoping choice is
> made transparently. Hibernate Search detects the presence of a
> transaction and adjust the scoping.
>
> Adapting this would fulfill your requirement, wouldn't it?
>
>
> Kind regards,
> ------------------------------------------------------------------------
> *Christian Beikov*
>
> On 29.05.2012 09:08, Christian Romberg wrote:
>> Hi Christian,
>>
>> Normal indexes in standard ACID databases (regardless whether an
>> RDBMS or an ODBMS like ours) are transactionally consistent.
>>
>> With full-text indexes this becomes a problem, and I think this
>> is the point that would need to be discussed before discussing any
>> integration in JPA.
>>
>> Regards,
>>
>> Christian
>>
>> On Sun, May 27, 2012 at 9:31 PM, Christian Beikov
>> <christian.beikov_at_gmail.com> wrote:
>>
>> Before adding another issue to JIRA I wanted to discuss the
>> following.
>> Mark Struberg has added the issue for indices,
>> http://java.net/jira/browse/JPA_SPEC-22
>> Building on this issue, I would like to see something like
>> full-text search support/integration in JPA.
>>
>> Hibernate makes it possible, via Hibernate Search, to
>> create full-text queries. The EntityManager interface would
>> probably have to be extended if a similar approach were
>> offered in JPA. I would really like to see a standardized
>> way of integrating full-text search in JPA via search
>> providers or something similar.
>>
>> What do you think?
>> --
>> Kind regards,
>> ------------------------------------------------------------------------
>> *Christian Beikov*
>>
>>
>>
>>
>> --
>> Christian Romberg
>> Chief Engineer | Versant GmbH
>> (T) +49 40 60990-0
>> (F) +49 40 60990-113
>> (E) cromberg_at_versant.com
>> www.versant.com | www.db4o.com
>>
>> --
>> Versant GmbH is incorporated in Germany. Company registration number:
>> HRB 54723, Amtsgericht Hamburg. Registered Office: Halenreie 42, 22359
>> Hamburg, Germany. Managing Directors: Bernhard Wöbker, Volker John
>>
>> CONFIDENTIALITY NOTICE: This e-mail message, including any
>> attachments, is for the sole use of the intended recipient(s) and may
>> contain confidential or proprietary information. Any unauthorized
>> review, use, disclosure or distribution is prohibited. If you are not
>> the intended recipient, immediately contact the sender by reply e-mail
>> and destroy all copies of the original message.
>>
>>
>>