[jpa-spec users] [jsr338-experts] Re: JPA schema generation

From: Evan Ireland <eireland_at_sybase.com>
Date: Thu, 10 May 2012 08:39:16 +1200

Linda,

My five cents worth (no 2 cents here, due to inflation!)

1(a) is typically of interest to DBAs, who don't want to let any DML be run
in "their" database if they haven't seen/checked the DML.

1(b) is of great use to developers, but also is valuable when you have a
deployment of an application that the DBA doesn't know about the schema for
(e.g. they weren't involved in defining it, as it came from a third party),
so the DBA isn't so likely to be able to usefully "vet" it anyway.

4 (index meta-data in particular) is absolutely essential.

If we only do one thing (no doubt we will do more than one thing), then
index meta-data (e.g. via annotations) in my mind is the most critical.

> -----Original Message-----
> From: Linda DeMichiel [mailto:linda.demichiel_at_oracle.com]
> Sent: Thursday, 10 May 2012 6:37 a.m.
> To: jsr338-experts_at_jpa-spec.java.net
> Subject: [jsr338-experts] JPA schema generation
>
>
> In conjunction with support for use of JPA in PaaS environments, we
> will need to address the issue of schema generation. After we discuss
> this area, I'd like to return to a broader discussion of multitenancy
> support with this in view.
>
>
> Here's what I think we'll need to address:
>
> 1. How/when schema generation happens:
> a. at a deployment time phase (or predeployment time phase)
> b. at EMF creation time
>
> 2. Providing an option whereby the application can supply DDL for
> schema generation as an alternative to the persistence provider
> doing schema generation based on the ORM metadata
>
> 3. How, once schema generation has happened, tables can get
> loaded with
> data. If we enable the application to provide DDL scripts for
> schema generation, then supporting the use of SQL bulk
> insert scripts
> would go hand-in-hand with this, but perhaps there are also other
> approaches, e.g., bulk insert via JPQL(?).
>
> 4. The need for additional ORM metadata: e.g. metadata for index
> generation; for foreign key constraints; other(?).
>
> 5. More precise definition of existing metadata that can be used
> for schema generation: what it comprises, and how it can/must
> be used.
>
> 6. Additional persistence.xml level metadata and additional metadata
> the platform provider passes to the persistence provider
> to control
> or configure the operations outlined above.
>
>
> I am assuming for that Java EE 7 we need to support both the
> separate database and shared database (aka separate schema)
> approaches.
>
> I am also assuming that it is the platform provider that does the
> provisioning of either the database or the schema respectively (i.e.,
> in accordance with whether the separate database or shared database
> approach is taken). I.e., in the separate database approach, I'm
> assuming that it is the persistence provider that creates the schemas
> directly in the database or creates the DDL to generate the schemas
> needed for the persistence unit. [I'll use the term "schema
> generation" generically below to mean either table generation or
> schema+table generation with this assumption in view.]
>
>
> Here's a proposal to get the discussion started.
>
> SCHEMA GENERATION
>
> I think we should allow schema generation to be done either as a
> deployment (or pre-deployment) phase or at createEMF time.
>
> Deployment-time (or predeployment-time) schema generation could be
> done either by the platform provider calling into the
> PersistenceProvider implementation to generate the database schema or
> by the platform provider running SQL DDL scripts.
>
> SQL DDL scripts might be provided with the application (bundled as
> part of the persistence unit) or the platform provider might invoke
> the PersistenceProvider implementation to generate them from the
> persistence unit metadata. If scripts are provided or generated, it
> should be the platform provider's responsibility to run them against
> the database. If the application has provided DDL scripts, the
> application should also have the option to provide SQL scripts to do
> a bulk insert of data. It would be the responsibility of the
> platform provider to run these as well.
>
> An alternative to the use of scripts is for the platform provider to
> call into the PersistenceProvider to directly generate the tables in
> the database. In the case that a database has been provisioned rather
> than only a schema, the persistence provider would generate the
> schemas and tables.
>
> For the sake of argument, let's assume that we add a generateSchema()
> method to the PersistenceProvider interface, which the platform
> provider invokes during the deployment phase if SQL scripts have not
> been provided by the application.
>
> Information that needs to be passed in would include the following:
>
> (1) Information that specifies how schema generation is to proceed.
> This information could take the form of a SchemaGenerationInfo
> interface or it could be in the form of additional
> standard properties.
> (2) Connection to the database or schema. [Is this a Connection or a
> DataSource ?]
> (3) PersistenceUnitInfo
>
>
> The SchemaGenerationInfo should include the following:
>
> ddl-generation-mode: GENERATE_AND_EXECUTE | GENERATE_ONLY
> | EXECUTE_ONLY
>
> GENERATE_AND_EXECUTE => generate ddl scripts and
> generate schema (if
> applicable) and tables directly in the database
>
> GENERATE_ONLY => generate ddl scripts
>
> EXECUTE_ONLY => generate schema (if applicable) and
> tables directly in
> the database
>
> create-ddl-target: a Writer configured for the
> persistence provider
> for outputting of the DDL scripts for the creation of
> schema/tables.
> This should be null if EXECUTE_ONLY
> ddl-generation-mode is specified.
>
> drop-ddl-target: a Writer configured for the persistence provider
> for outputting of the DDL scripts for the dropping of
> schema/tables.
> The should be null if EXECUTE_ONLY ddl-generation-mode
> is specified.
>
> ddl-execution-mode: Used when executing directly against
> the database,
> rather than generating scripts. Options should include:
> CREATE_TABLES | DROP_TABLES | DROP_AND_CREATE_TABLES
> Do we also need to distinguish
> CREATE_SCHEMAS_AND_TABLES ? DROP_SCHEMA ?
> Or can these be implicit?
> Do we also need an ALTER_TABLES capability here?
>
> properties: These could include any vendor-specific properties
>
>
> ADDITIONAL METADATA (persistence.xml)
>
> If scripts are provided with the application (i.e., bundled as part
> of the persistence unit), we should add entries in the persistence.xml
> to identify them:
> create-ddl-script
> drop-ddl-script
> data-load-script(s) [Do we need to support more than
> one load script?
> If so, the ordering may need to be specified as well.]
>
> Open issue: do we need metadata in the persistence.xml for
> ddl-execution-mode as well? For example, this might be used in
> standalone scenarios (?)
>
>
> ORM METADATA (annotations and orm.xml)
>
> (1) Indexes
>
> I think we need to support indexes.
>
> Here's a strawman annotation:
>
> @Target({}) @Retention(RUNTIME)
> public @interface Index {
> String name() default "";
> String columnList();
> }
>
> The columnList syntax could follow that of the OrderBy annotation:
>
> columnList::= indexColumn [, indexColumn]*
> indexColumn::= columnName [ASC | DESC]
>
> If ASC or DESC is not specified, ASC is assumed
>
> We'd also need to add
> Index[] indexes() default {};
> to Table, Secondary Table, CollectionTable, JoinTable, and
> TableGenerator
>
>
> (2) Foreign Key Constraints
>
> I see two possible approaches:
>
> (a) Add a string-valued foreignKeyDefinition element to JoinColumn,
> JoinColumns, MapKeyJoinColumn, etc. to specify a SQL fragment
> (intended along the lines of columnDefinition) for defining a foreign
> key constraint and/or for overriding the persistence provider's
> default foreign key definition. It might also be helpful to allow a
> name for the foreign key constraint to be specified for the case where
> the provider is using its default foreign key generation strategy.
>
> or
>
> (b) Add a ForeignKey annotation to specify the foreign key
> constraint --
> for example,
>
> public @interface ForeignKey {
> String name() default "";
> ForeignKeyAction updateAction() default NO_ACTION;
> ForeignKeyAction deleteAction() default NO_ACTION;
> boolean deferred() default false;
> }
>
> public enum ForeignKeyAction {NO_ACTION, CASCADE, SET_NULL};
>
> I have some misgivings about approach (b), and, given a lack of
> database portability here, we might wind up needing a
> foreignKeyDefinition
> fragment anyway.
>
>
> (3) Other
>
> What about the ability to distinguish a CHAR rather than VARCHAR
> mapping for strings? Or should we just leave this as belonging in a
> SQL fragment (with the understanding that we would be intentionally
> discouraging the use of CHAR strings)?
>
> (4) Anything else I'm missing?
>
>
> APIs
>
> I assumed the addition of a
> PersistenceProvider.generateSchema() method
> above.
>
> If we also support schema generation at EMF creation time, do we want
> another createContainerEntityManagerFactory() method that takes a
> SchemaGenerationInfo as well as PersistenceUnitInfo argument ?
>
> If generation is done directly in the database, at some point the
> container (or application) may need to have the schema/tables dropped.
> I'm not sure of the best way to support this. A
> dropAndClose() method
> on the EMF??
>
>
>
>