users@jpa-spec.java.net

[jpa-spec users] Re: [jsr338-experts] Re: JPA schema generation

From: Mark Struberg <struberg_at_yahoo.de>
Date: Thu, 31 May 2012 22:09:01 +0100 (BST)

Hi Linda!

I do agree about the argument with the @Embedded field which might be in the index. Thinking a bit about it, the same would apply to table inheritances.


What I do not like is having Column names in any annotation! This is a really unpleasant thing in the @UniqueContraint annotation already! The reason I don't like this is that the name of the column can depend on lots of other things, like e.g. the database dictionary used. I know from my old DB/2 days that we only had 8 characters per table. Some Databases make no distinction between upper/lowercase, others do, etc. All this leads to getting different names when generating the schema for different databases. If you don't fixate the name via @Column then your @UniqueContraint column name strings are basically almost ever broken.


IF we use Strings, then we should denote the entity field names and not the effective column names. This can then easily get mapped to the database column name in the JPA provider. For embedded fields this could be done in an ExpressionLanguage like fashion (separated via dots).



Regarding the unique index vs unique constraint argument: A unique Index always implies a unique constraint, but not the other way around, isn't? In reality afaik most JPA providers already generate a unique index even if you only use @UniqueConstraint. Btw, I don't see adding a @Index(unique=true) making things worse. People will find out pretty soon if they defined too many indices.

What's more important is that there must not be 2 indices with the same name. This also must be checked for InheritanceType.SINGLE_TABLE and JOINED.

LieGrue,
strub



----- Original Message -----
> From: Linda DeMichiel <linda.demichiel_at_oracle.com>
> To: users_at_jpa-spec.java.net; Mark Struberg <struberg_at_yahoo.de>
> Cc:
> Sent: Thursday, May 31, 2012 9:34 PM
> Subject: Re: [jpa-spec users] Re: [jsr338-experts] Re: JPA schema generation
>
> Hi Mark,
>
> More comments below....
>
> thanks again and LieGrue,
>
> -Linda
>
>
>
> On 5/30/2012 3:41 PM, Mark Struberg wrote:
>> Hi Linda!
>>
>>
>> I finally found some time (and free brain) for answering the 2nd part of
> the mail (the index story).
>>
>>
>> Answers inline.
>>
>> txs and LieGrue,
>> strub
>>
>>> ________________________________
>>> From: Linda DeMichiel<linda.demichiel_at_oracle.com>
>>> To: struberg_at_yahoo.de
>>> Sent: Monday, May 21, 2012 10:01 PM
>>> Subject: Re: [jpa-spec users] [jsr338-experts] Re: JPA schema
> generation
>>>
>>>>
>>>> Regarding the Indexes. I'd rather have a typesafe @Index
> nnotation
>>>> which allows for own typesafe index groups.
>>>> Just a very first idea:
>>>>
>>>> public @interface Index {
>>>>       int value() default Integer.MAX_VALUE; // for ordering the
> index if
>>>> no group annotation is used
>>>>       boolean unique() default false;
>>>>       // add asc, desc etc and other stuff as you like
>>>> }
>>>>
>>>
>>> Not sure I'm following.  I was proposing that if the index included
> multiple
>>> columns that the @Indexes annotation would need to be applied to the
> class.
>>
>>
>> [strub]
>>
>> I'm not exactly a fan of any @Indexes annotation on the class itself,
> because we have such a thingy already (@UniqueContraints). The reason why I
> don't really like it that it's a.) not easy to maintain, b.) not
> typesafe (they are just random strings) c.) hard to refactor and d.) hard to
> detect unused indices.
>>
>
> I guess we have different views on this.  I see indexes as strictly
> a schema-generation item, used largely for performance.  Some
> databases force you to specify unique indexes for uniqueness constraints,
> whereas others recommend that you specify uniqueness constraints (which
> they then turn into unique indexes), so I guess we just can't win on that
> one :-)
>
> I find it much more straightforward to just define a multi-column index
> in one place--that takes care of the column ordering without the
> need for an integer ordering field.  Further, such an index might also
> involve columns that map to attributes of embeddables, which would
> require additional metadata with the syntax you've proposed, and gets
> messy.  My expectation is that in production applications, schemas
> will be defined very carefully -- for example, we don't define anything
> with regard to the ordering of columns in tables, and yet that too
> can affect performance -- and indexes will probably be defined at a late stage
> at which point column names are certainly well-known.
>
> An open issue in my earlier proposal is whether we should add a
> "boolean unique() default false" element to the index annotation.
> This would be redundant with what we have already with unique
> constraints, and we'd have to define what happens if both were
> specified and not in sync, but we should consider it.
>
>
>>
>> We all learned a few tricks while doing this annotation stuff all the years
> (especially while working on CDI) and meta-annoations would allow us to do all
> this in a type-safe way. I've wrote up my ideas in a
> http://java.net/jira/browse/JPA_SPEC-22 including a sample of how this could
> probably look like.
>>
>> The basic idea is to create an inner-class Annotation for your multi-column
> Index and annotate your columns with it.
>>
>> I'm not sure if the sample in the issue is self-explaining. And it
> certainly takes some time to get familiar with this kind of style. Any feedback
> is highly appreciated!
>>
>> I'm not saying this is THE way, but rather like to showcase other
> options we have and trigger some brainstorming. So keep your ideas rolling,
> folks ;)
>>
>>
>>
>>>> @Entity
>>>> public class MyUser {
>>>>
>>>>       // @Index meta-information to show that this is an index group
>>>>       private @Index(unique=true)@interface LoginId {
>>>>         int value() default Integer.MAX_VALUE; // mandatory for
> @Index
>>>> groups: for ordering inside the grp
>>>>       }
>>>>       private @Index_at_interface CarLookup {
>>>>         // if no int value() is specified in an index, the order is
>>>> undefined
>>>>       }
>>>>
>>>>       @LoginId(1);
>>>>       private String tenant;
>>>>
>>>>       @LoginId(2);
>>>>       private String userLogin;
>>>>
>>>>       @CarLookup
>>>>       private String carVendor;
>>>>
>>>>       @CarLookup
>>>>       private String carType;
>>>> }
>>>>
>>>>
>>>> The
>>>> @Index an groups should also be respected when sorting the query
>>>> criterias. For older DBs you still sometimes need unnecessary
>>>> nativeQueries just to get the order right to match the data (first
>>>> query
>>>> criteria should let you get rid of most data). Not sure if this is
>>>> still true nowadays with the latest DBs but I think even Oracle10
> had
>>>> problems with queries which didnt fit the order of the index.
>>>>
>>>
>>> Please explain further.  A good query optimizer should consider the
> indexes.
>>> (But that doesn't *guarantee* that the result would be in that
> order)
>>
>>
>> [strub]
>>
>> The order in which you write the columns in your CREATE INDEX statement did
> matter a lot in the past. The first one should be the one which has the best
> 'spread'. This gets better with more and more DB systems switching to
> dynamic statistics and cost based optimizers. But there are still use cases
> where the column 'ordinal' inside the Index does matter a lot. Thus
> I'd still like to have a way to explicitly state the number which denotes
> this ordinal. In some DBs that was so bad that the index was ignored at all if
> the order of the columns in the query statement didn't fit the order of the
> columns in the index itself (leading to a full table space scan). Not sure if
> this is a problem nowadays anymore...
>>
>
> Yes, I'm very aware of this issue (and some of the old history here).
> BTW, in the database systems I am familiar with, the order of the columns
> is definitely significant.  The syntax I proposed earlier covers this as well,
> although of course it too is not the only way.
>
>>
>>
>>>> Maybe
>>>> for logicalKeys we could do something similar to EmbeddedId for
>>>> @NaturalId. Automatically generating a unique index for those
> fields
>>>> and
>>>> making them usable in em.find(MyEntity, mylogicalKey) ...
>>>>
>>>
>>> Beyond a certain point I start worrying about too many indexes being
>>> added - and that being counterproductive.
>>
>> [strub]
>>
>> I agree that we need to take this into consideration. But finally I
> don't share this sentiment. A logical or 'natural' key is always a
> unique Index. Otherwise it would not be a natural KEY.
>>
>> (Btw, I think there is another thread already where Steve pointed out that
> this might even be used to do an em.find(naturalkey) kind of thingy.)
>>
>>
>> Let's face it: currently you need to create most of your Indices via an
> externally maintained create_index.sql file. And this file is for sure much
> harder to maintain than having this information right in the entity, isn't?
>>
>> So yes, we need to take care about people adding too many indices - but
> overall we would improve the situation imo, wdyt?
>>
>>
>>
>>>>     More about it in JPA_SPEC-22
>>>>
>>>>
>>>>
>>>> txs and LieGrue,
>>>> strub
>>>>
>>>>
>>>> PS:
>>>> hello everybody and thanks for the hard work on JPA! I'm some
> random
>>>> old jerk who started writing DB frameworks some decades ago on the
>>>> mainframe (in C on OS/360), and use/write db stuff in Java since
> late
>>>> 90s or so.
>>>
>>> OS/360?  Wow.
>>
>> [strub]
>> That was in the early 90s. This specific stock exchange program switched to
> OS/390 later and still running z/OS nowadays afaik.
>> And dumps and ABENDs basically still look the same on a Host, even today ;)
>>
>>
>>>
>>>> PPS: plz excuse me if I miss something obvious...
>>>
>>> I don't think you have.  Thanks again for taking the trouble to
> write this up.
>>>
>>> -Linda
>>>
>