users@jpa-spec.java.net

[jpa-spec users] Re: Full text search

From: Christian Romberg <cromberg_at_versant.com>
Date: Tue, 29 May 2012 14:22:59 +0200

Hi Christian,

No, it's not that simple.

Updating full-text indexes takes an enormous amount of time compared to
everything else that happens in a database commit operation.

Thus, for most users, transactional full-text index updates are hardly
usable.

Batching such updates would be the way to go; however, this imposes quite
a few limitations.
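
To make the trade-off concrete, here is a minimal sketch of
transaction-scoped batching along the lines of the Hibernate Search excerpt
quoted below: index work is queued during the transaction, written in one
batch after commit, and discarded on rollback. All names (FullTextIndex,
TransactionScopedIndexQueue, afterCommit, afterRollback) are hypothetical
and are not part of JPA or Hibernate Search; how the callbacks get wired to
a real JDBC or JTA transaction is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical index backend; stands in for a Lucene-style index writer. */
interface FullTextIndex {
    void applyBatch(List<String> documentIds);
}

/** Collects index work for one transaction (illustration only). */
final class TransactionScopedIndexQueue {
    private final FullTextIndex index;
    private final List<String> pending = new ArrayList<>();

    TransactionScopedIndexQueue(FullTextIndex index) {
        this.index = index;
    }

    /** Called whenever an indexed entity changes inside the transaction. */
    void enqueue(String documentId) {
        pending.add(documentId);
    }

    /** Called once the database commit has succeeded: write one batch. */
    void afterCommit() {
        if (!pending.isEmpty()) {
            // Hand over a copy so the backend never sees the cleared list.
            index.applyBatch(new ArrayList<>(pending));
            pending.clear();
        }
    }

    /** Called on rollback: the queued work never reaches the index. */
    void afterRollback() {
        pending.clear();
    }

    /** Number of index operations still waiting for a commit. */
    int pendingCount() {
        return pending.size();
    }
}
```

Note the limitation this sketch makes visible: the index is only written
after the database commit, so it lags behind the data, and if applyBatch
fails at that point the database is already committed and the index stays
stale until it is repaired or rebuilt.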

It's totally fine for any JPA vendor to provide special extensions for
special use cases, or use cases with serious restrictions.

However, and this is just my personal opinion, such extensions should not
be within the scope of any specification; they make a specification
brittle rather than sound.

Regards,

Christian

On Tue, May 29, 2012 at 2:10 PM, Christian Beikov <
christian.beikov_at_gmail.com> wrote:

> Hello!
>
> I have just read parts of the Hibernate Search documentation and found
> out that the index is written transactionally if a transaction exists.
> In other words, the implementation would have to participate in a JDBC or
> JTA transaction.
>
> Here is a short excerpt from the documentation (also see
> http://docs.jboss.org/hibernate/search/3.2/reference/en/html_single/#d0e488
> ):
>
> To be more efficient, Hibernate Search batches the write interactions with
> the Lucene index. There is currently two types of batching depending on the
> expected scope. Outside a transaction, the index update operation is
> executed right after the actual database operation. This scope is really a
> no scoping setup and no batching is performed. However, it is recommended -
> for both your database and Hibernate Search - to execute your operation in
> a transaction be it JDBC or JTA. When in a transaction, the index update
> operation is scheduled for the transaction commit phase and discarded in
> case of transaction rollback. The batching scope is the transaction. There
> are two immediate benefits:
>
> - Performance: Lucene indexing works better when operation are executed
>   in batch.
>
> - ACIDity: The work executed has the same scoping as the one executed by
>   the database transaction and is executed if and only if the transaction
>   is committed. This is not ACID in the strict sense of it, but ACID
>   behavior is rarely useful for full text search indexes since they can
>   be rebuilt from the source at any time.
>
> You can think of those two scopes (no scope vs transactional) as the
> equivalent of the (infamous) autocommit vs transactional behavior. From a
> performance perspective, the *in transaction* mode is recommended. The
> scoping choice is made transparently. Hibernate Search detects the presence
> of a transaction and adjust the scoping.
>
> Adapting this would fulfill your requirement, wouldn't it?
>
>
> Kind regards,
> ------------------------------
> *Christian Beikov*
>
> On 29.05.2012 09:08, Christian Romberg wrote:
>
> Hi Christian,
>
> Normal indexes in standard ACID databases (regardless whether an RDBMS or
> an ODBMS like ours) are transactionally consistent.
>
> With full-text indexes this becomes a problem, and I think this is the
> point that would need to be discussed before discussing any
> integration in JPA.
>
> Regards,
>
> Christian
>
> On Sun, May 27, 2012 at 9:31 PM, Christian Beikov <
> christian.beikov_at_gmail.com> wrote:
>
>> Before adding another issue to JIRA I wanted to discuss the following.
>> Mark Struberg has added the issue for indices,
>> http://java.net/jira/browse/JPA_SPEC-22
>> Depending on this issue I would like to see something like full text
>> search support/integration in JPA.
>>
>> Hibernate makes it possible, via Hibernate Search, to create full-text
>> queries. The entity manager interface would probably have to be extended
>> if a similar approach were offered in JPA. I would really like to see a
>> standardized way of integrating full-text search in JPA, for example via
>> search providers.
>>
>> What do you think?
>> --
>> Kind regards,
>> ------------------------------
>> *Christian Beikov*
>>
>
>
>
> --
> Christian Romberg
> Chief Engineer | Versant GmbH
> (T) +49 40 60990-0
> (F) +49 40 60990-113
> (E) cromberg_at_versant.com
> www.versant.com | www.db4o.com
>
> --
> Versant GmbH is incorporated in Germany. Company registration number:
> HRB 54723, Amtsgericht Hamburg. Registered Office: Halenreie 42, 22359
> Hamburg, Germany. Managing Directors: Bernhard Wöbker, Volker John
>
> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments,
> is for the sole use of the intended recipient(s) and may contain
> confidential or proprietary information. Any unauthorized review, use,
> disclosure or distribution is prohibited. If you are not the intended
> recipient, immediately contact the sender by reply e-mail and destroy all
> copies of the original message.
>
>
>
>


-- 
Christian Romberg
Chief Engineer | Versant GmbH
(T) +49 40 60990-0
(F) +49 40 60990-113
(E) cromberg_at_versant.com
www.versant.com | www.db4o.com