users@jpa-spec.java.net

[jpa-spec users] Re: [jsr338-experts] Re: JPA schema generation

From: Linda DeMichiel <linda.demichiel_at_oracle.com>
Date: Thu, 31 May 2012 12:34:57 -0700

Hi Mark,

More comments below....

thanks again and LieGrue,

-Linda



On 5/30/2012 3:41 PM, Mark Struberg wrote:
> Hi Linda!
>
>
> I finally found some time (and free brain) for answering the 2nd part of the mail (the index story).
>
>
> Answers inline.
>
> txs and LieGrue,
> strub
>
>> ________________________________
>> From: Linda DeMichiel<linda.demichiel_at_oracle.com>
>> To: struberg_at_yahoo.de
>> Sent: Monday, May 21, 2012 10:01 PM
>> Subject: Re: [jpa-spec users] [jsr338-experts] Re: JPA schema generation
>>
>>>
>>> Regarding the Indexes. I'd rather have a typesafe @Index nnotation
>>> which allows for own typesafe index groups.
>>> Just a very first idea:
>>>
>>> public @interface Index {
>>> int value() default Integer.MAX_VALUE; // for ordering the index if
>>> no group annotation is used
>>> boolean unique() default false;
>>> // add asc, desc etc and other stuff as you like
>>> }
>>>
>>
>> Not sure I'm following. I was proposing that if the index included multiple
>> columns that the @Indexes annotation would need to be applied to the class.
>
>
> [strub]
>
> I'm not exactly a fan of any @Indexes annotation on the class itself, because we have such a thingy already (@UniqueContraints). The reason why I don't really like it that it's a.) not easy to maintain, b.) not typesafe (they are just random strings) c.) hard to refactor and d.) hard to detect unused indices.
>

I guess we have different views on this. I see indexes as strictly
a schema-generation item, used largely for performance. Some
databases force you to specify unique indexes for uniqueness constraints,
whereas others recommend that you specify uniqueness constraints (which
they then turn into unique indexes), so I guess we just can't win on that one :-)

I find it much more straightforward to just define a multi-column index
in one place--that takes care of the column ordering without the
need for an integer ordering field. Further, such an index might also
involve columns that map to attributes of embeddables, which would
require additional metadata with the syntax you've proposed, and gets
messy. My expectation is that in production applications, schemas
will be defined very carefully -- for example, we don't define anything
with regard to the ordering of columns in tables, and yet that too
can affect performance -- and indexes will probably be defined at a late stage
at which point column names are certainly well-known.

An open issue in my earlier proposal is whether we should add a
"boolean unique() default false" element to the index annotation.
This would be redundant with what we have already with unique
constraints, and we'd have to define what happens if both were
specified and not in sync, but we should consider it.


>
> We all learned a few tricks while doing this annotation stuff all the years (especially while working on CDI) and meta-annoations would allow us to do all this in a type-safe way. I've wrote up my ideas in a http://java.net/jira/browse/JPA_SPEC-22 including a sample of how this could probably look like.
>
> The basic idea is to create an inner-class Annotation for your multi-column Index and annotate your columns with it.
>
> I'm not sure if the sample in the issue is self-explaining. And it certainly takes some time to get familiar with this kind of style. Any feedback is highly appreciated!
>
> I'm not saying this is THE way, but rather like to showcase other options we have and trigger some brainstorming. So keep your ideas rolling, folks ;)
>
>
>
>>> @Entity
>>> public class MyUser {
>>>
>>> // @Index meta-information to show that this is an index group
>>> private @Index(unique=true)@interface LoginId {
>>> int value() default Integer.MAX_VALUE; // mandatory for @Index
>>> groups: for ordering inside the grp
>>> }
>>> private @Index_at_interface CarLookup {
>>> // if no int value() is specified in an index, the order is
>>> undefined
>>> }
>>>
>>> @LoginId(1);
>>> private String tenant;
>>>
>>> @LoginId(2);
>>> private String userLogin;
>>>
>>> @CarLookup
>>> private String carVendor;
>>>
>>> @CarLookup
>>> private String carType;
>>> }
>>>
>>>
>>> The
>>> @Index an groups should also be respected when sorting the query
>>> criterias. For older DBs you still sometimes need unnecessary
>>> nativeQueries just to get the order right to match the data (first
>>> query
>>> criteria should let you get rid of most data). Not sure if this is
>>> still true nowadays with the latest DBs but I think even Oracle10 had
>>> problems with queries which didnt fit the order of the index.
>>>
>>
>> Please explain further. A good query optimizer should consider the indexes.
>> (But that doesn't *guarantee* that the result would be in that order)
>
>
> [strub]
>
> The order in which you write the columns in your CREATE INDEX statement did matter a lot in the past. The first one should be the one which has the best 'spread'. This gets better with more and more DB systems switching to dynamic statistics and cost based optimizers. But there are still use cases where the column 'ordinal' inside the Index does matter a lot. Thus I'd still like to have a way to explicitly state the number which denotes this ordinal. In some DBs that was so bad that the index was ignored at all if the order of the columns in the query statement didn't fit the order of the columns in the index itself (leading to a full table space scan). Not sure if this is a problem nowadays anymore...
>

Yes, I'm very aware of this issue (and some of the old history here).
BTW, in the database systems I am familiar with, the order of the columns
is definitely significant. The syntax I proposed earlier covers this as well,
although of course it too is not the only way.

>
>
>>> Maybe
>>> for logicalKeys we could do something similar to EmbeddedId for
>>> @NaturalId. Automatically generating a unique index for those fields
>>> and
>>> making them usable in em.find(MyEntity, mylogicalKey) ...
>>>
>>
>> Beyond a certain point I start worrying about too many indexes being
>> added - and that being counterproductive.
>
> [strub]
>
> I agree that we need to take this into consideration. But finally I don't share this sentiment. A logical or 'natural' key is always a unique Index. Otherwise it would not be a natural KEY.
>
> (Btw, I think there is another thread already where Steve pointed out that this might even be used to do an em.find(naturalkey) kind of thingy.)
>
>
> Let's face it: currently you need to create most of your Indices via an externally maintained create_index.sql file. And this file is for sure much harder to maintain than having this information right in the entity, isn't?
>
> So yes, we need to take care about people adding too many indices - but overall we would improve the situation imo, wdyt?
>
>
>
>>> More about it in JPA_SPEC-22
>>>
>>>
>>>
>>> txs and LieGrue,
>>> strub
>>>
>>>
>>> PS:
>>> hello everybody and thanks for the hard work on JPA! I'm some random
>>> old jerk who started writing DB frameworks some decades ago on the
>>> mainframe (in C on OS/360), and use/write db stuff in Java since late
>>> 90s or so.
>>
>> OS/360? Wow.
>
> [strub]
> That was in the early 90s. This specific stock exchange program switched to OS/390 later and still running z/OS nowadays afaik.
> And dumps and ABENDs basically still look the same on a Host, even today ;)
>
>
>>
>>> PPS: plz excuse me if I miss something obvious...
>>
>> I don't think you have. Thanks again for taking the trouble to write this up.
>>
>> -Linda
>>