[javaee-spec users] Re: DataSourceDefinition

From: Reza Rahman <reza.rahman_at_oracle.com>
Date: Wed, 27 Aug 2014 09:12:23 -0400

If memory serves correctly, password aliasing for these annotations was something that was deferred in EE 7. That's one of the principal ways other plain text configuration based software approaches the visible credentials issue.

I imagine that's a thread of thought any config JSR effort would likely pick up.

> On Aug 27, 2014, at 8:25 AM, arjan tijms <arjan.tijms_at_gmail.com> wrote:
>
> Hi,
>
>> On Wed, Aug 27, 2014 at 10:02 AM, Mark Struberg <struberg_at_yahoo.de> wrote:
>> Arjan, so you have the passwords of all your stages checked in to SCM and in your EAR/WAR in plaintext?
>
> The dev databases are available to everyone, so yes, the passwords for those are always in the SCM in plaintext. The SCM itself is of course protected and only the trusted team has access to it. When you separate the config from the EAR/WAR the passwords are -somewhere- as well, oftentimes in an SCM too and deployed to the servers via things like cf-engine or chef. It boils down to the same thing really.
>
> The live database is only accessible from a very limited number of whitelisted IPs in a secured zone, so even if the live DB password were to leak there's little that could be done with it.
>
> Yet, for situations where the code is accessible to more people than the trusted team, then yes indeed the live username and password are not checked into the code SCM but are provided separately. The approach that I outlined in the JDevelopment article doesn't exclude settings being loaded from alternative sources, and those sources could be the local file system just as well.
>
>
>> What are your ops team and the security guys saying about it?
>
> There's no separate ops team. There's one (small) multi-disciplinary team that is responsible for nearly everything, e.g. from initial design to coding, deployment and monitoring. Some team members are naturally more expert in certain fields than others, but ultimately the team is responsible and involved at every stage.
>
> There's thus no throwing things over the wall, so to speak, and no need to convince people to do things that are not directly of interest to them, e.g. asking an ops member to create a JMS queue when the ops member has no idea what the queue is for (since that is solely an internal application concern) and is thus not particularly excited about creating it (just a mundane task to do).
>
> Instead, in our process the person creating the queue is also the person who needs that queue and is thus motivated to create it. In a small team full of experienced people (which I'd like to believe we have at zeef) this makes the entire development process much smoother.
>
>
>> And how do you treat different databases?
>
> When you build an in-house application that you deploy on your own servers (like m4n.nl and zeef.com do) then there is no unknown number of different databases to support. There's one type of database (e.g. Postgres) and if that ever changes you change its driver in said configuration file. Don't forget though that in the wrapper data source the *actual* data source classname also comes from the configuration file, so although unneeded for our particular use case you can easily swap this out for anything (and since, as I explained, settings can come from anywhere, this can be done externally as well).
>
>
>> Or an application which runs at a customer and you don't have any credentials at all?
>
>
> The process I described works best for in-house development where you deploy to your own servers. Applications that you develop for a general public (e.g. products like JIRA) or for customers are less suited. But that's why I mentioned before that both approaches have their use. One approach is not inherently better than the other. It depends on the use case.
>
> Historically Java EE has focussed on the use case for highly separated roles, but failed to acknowledge not all teams work in that way. It's good that Java EE now increasingly acknowledges the lighter and more agile way of working, in addition to the more traditional way. As said, neither way of working is better. It's just a matter of Java EE being capable of scaling both up and down.
>
>
>> This solution just doesn't scale...
>
> In fact it actually appears to. The key is that you don't look at @DataSourceDefinition and related elements as the sole way to do things (which would indeed be crazy), but as an essential piece of a spectrum of possibilities.
>
> In terms of number of servers the approach also seems to scale. At m4n.nl we scaled from a single server that had everything (in 2002) to a hundred servers or so in 2011. And embedded data sources nicely scaled along (at first we used a proprietary solution for this, later the standardized Java EE version).
>
>
>> Old trick. I wrote something similar 4 years ago for CODI [1][2] (it's actually much older, a colleague and I first wrote this around 2006).
>> But we decided to ditch it and not move it over to DeltaSpike as it doesn't work on all containers when it comes to JTA. Even if you do a ConfigurableXaDataSource. The problem is that some containers evaluate the settings even before your app is booted (for doing JPA instrumentation, etc). Creates funny NPEs...
>
> Interesting. Any idea which container that might be? I tested the ConfigurableXaDataSource mainly with JBoss and GlassFish and at least there it worked.
>
> But one way or the other, the Configurable(Xa)DataSource is of course a workaround, a hack if you like, and the issue should be solved in a better way.
>
>
>> And forget about JNDI. It just stinks. A DataSource configured on the container pops up on a different location for almost every container. The JNDI location sometimes even changes between different versions of the same container.
>
> I've seen that indeed, so this is one additional advantage of @DataSourceDefinition and friends. When you define the thing to be in "java:app/ds/myds" it will actually end up in "java:app/ds/myds", and not in "vendorname:app/ds/myds" or "java:jdbc/app/ds/myds" or whatever.
>
>
>> Fully agree, but I think this should not be fixed as the whole approach is imo broken.
>> So let's review and then deprecate this annotation based config.
>
> I personally don't see any need to deprecate the annotation based config, but I do absolutely see the need for improvements. I actually had been working on preparing a JIRA issue for this a while back, but have not yet finished it. The gist of it is approximately the following:
>
> In addition to @DataSourceDefinition, have an additional programmatic way to provide a data source, roughly like how in the Servlet spec you can use a programmatic API to register Servlets during startup in addition to annotations and XML. We could use CDI events or perhaps a qualified producer for this.
>
> Using a producer would approximately look like this:
>
> @Produces @DataSourceDefinition
> XADataSource produceMyDataSource(DataSourceContainer container) {
>     container.setMinPool(20);
>     container.setProperty("vendorx.validation-statement", "select 1");
>     // ...
>     XADataSource myDataSource = new ....
>     // ...
>     return myDataSource;
> }
>
> The producer could load its settings from anywhere, e.g. using DeltaSpike config, or (if/when it becomes available) the Config JSR. "DataSourceContainer" is a new type to distinguish between data source settings and container settings. Something like this should be reflected in the annotation and XML variant as well. It should have both well defined standardised settings (like the existing min pool) and the ability to set vendor specific ones (SQL validation in this example).
>
> Furthermore @DataSourceDefinition should be capable of having placeholders in its attributes, and there should be a facility to override it externally (which are both things the Config JSR should be able to provide).
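
To make the placeholder idea in the quoted paragraph above concrete, here is a minimal sketch of how attribute values could be resolved against externally supplied settings. The `${...}` syntax, the class name `PlaceholderResolver`, and the key names are all assumptions for illustration; nothing in the current spec defines them.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: resolves ${key} placeholders against a map of settings. */
class PlaceholderResolver {

    // Assumed placeholder syntax: ${some.key}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String resolve(String template, Map<String, String> settings) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = settings.get(key);
            if (value == null) {
                // Fail loudly instead of silently deploying a half-configured data source.
                throw new IllegalArgumentException("no value for placeholder '" + key + "'");
            }
            // Copy the literal text before the placeholder, then the resolved value.
            out.append(template, last, matcher.start()).append(value);
            last = matcher.end();
        }
        // Copy whatever literal text follows the last placeholder.
        return out.append(template, last, template.length()).toString();
    }
}
```

With settings `db.host=localhost` and `db.name=app`, resolving `jdbc:postgresql://${db.host}/${db.name}` would yield `jdbc:postgresql://localhost/app`. Where the settings map comes from (a file, system properties, a future config API) is deliberately left open.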
>
> Kind regards,
> Arjan Tijms
>
>>
>>
>> LieGrue,
>> strub
>>
>> [1] https://github.com/apache/myfaces-extcdi/blob/trunk/jee-modules/jpa-module/api/src/main/java/org/apache/myfaces/extensions/cdi/jpa/api/datasource/DataSourceConfig.java
>>
>> [2] https://github.com/apache/myfaces-extcdi/blob/trunk/jee-modules/jpa-module/impl/src/main/java/org/apache/myfaces/extensions/cdi/jpa/impl/datasource/ConfigurableDataSource.java
>>
>> On Tuesday, 26 August 2014, 23:06, arjan tijms <arjan.tijms_at_gmail.com> wrote:
>> >
>> >
>> >Hi,
>> >
>> >
>> >On Tue, Aug 26, 2014 at 8:12 PM, Arun Gupta <arun.gupta_at_gmail.com> wrote:
>> >
>> >There is clear evidence that nobody is using @DataSourceDefinition in
>> >>production code. See the conversation at:
>> >>
>> >>https://twitter.com/arungupta/status/504039335688404992
>> >>
>> >>Seems like it's good only for demos.
>> >
>> >
>> >I hate to be on the disagreeing side lately ;) but I disagree.
>> >
>> >
>> >At zeef.com we definitely are using @DataSourceDefinition in production (albeit the xml variant of this in application.xml). In our development process development and configuration are done within the same team. Inside each deployable application we have a directory with sub-directories holding the config for every stage. The advantage is that everybody is able to see which config applies to which stage and can keep the config in sync with the actual code.
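
As a rough illustration of the per-stage directory layout described above, the following sketch loads the settings for one stage from a sub-directory named after it. The file name `datasource.properties`, the class name `StageConfig`, and the use of the file system (rather than the application archive) are assumptions made for the example, not details from the original setup.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Illustrative only: one sub-directory per stage, each holding its own settings file. */
class StageConfig {

    // Hypothetical file name; any agreed-upon name would do.
    static final String FILE_NAME = "datasource.properties";

    static Properties load(Path configRoot, String stage) throws IOException {
        // e.g. <configRoot>/dev/datasource.properties or <configRoot>/live/datasource.properties
        Path file = configRoot.resolve(stage).resolve(FILE_NAME);
        Properties settings = new Properties();
        try (Reader reader = Files.newBufferedReader(file)) {
            settings.load(reader);
        }
        return settings;
    }
}
```

Because every stage lives next to the code, a reviewer can compare the `dev` and `live` files in the same change that touches the code using them.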
>> >
>> >
>> >At m4n.nl where I worked before we had a similar setup, although before we introduced that we had the separate config that was advocated at the time as a best practice. This separate config didn't really work well for us; it was frequently out of sync with the code, configuration kept growing and old keys that no code was using anymore kept piling up (because the developers didn't see the configuration and the sysop didn't necessarily see the code). Worse, when there were live issues it wasn't clear which values the live code was actually using. Did a thread pool have more threads than there were connections, or the other way around?
>> >
>> >
>> >So the concept of defining a data source from within the app, which @DataSourceDefinition facilitates, is crucial for our process.
>> >
>> >
>> >Another important thing is that @DataSourceDefinition/the data-source element remains stable by virtue of the spec. Some vendors unfortunately often change the way their proprietary data source is configured, or even worse, remove a way altogether. In one version a data source is specified in XML format 1, half a version later it's in incompatible format 2, then it's a deployable artifact, then it's not a deployable artifact anymore, and then surprise it is again. In one version the data source, even though proprietary, can be embedded in an EAR, then in the next version it can't be embedded anymore. In one version it has to be defined in a separate XML file, then one version later it goes into one big configuration file (which of course has yet again a different XML format), etc.
>> >
>> >
>> >The biggest issue however with the current @DataSourceDefinition/data-source element is that it's not directly configurable. This was my main motivation for creating https://java.net/jira/browse/JAVAEE_SPEC-19
>> >
>> >
>> >In the meantime I solved the configuration problem a little by using a data source wrapper that reads configuration based on a parameter and uses that to configure the real data source. This data source wrapper is then registered using the data-source element. I outlined the approach here: http://jdevelopment.nl/switching-data-sources-datasourcedefinition
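
The configuration step of such a wrapper might look roughly like the sketch below: the class name of the real data source is read from the settings, the class is instantiated reflectively, and the remaining settings are applied through matching setters. This is not the code from the linked article; `ConfiguredDataSourceFactory`, `SketchDataSource`, and the `className` key are hypothetical names, and a real wrapper would additionally implement `javax.sql.DataSource` and delegate to the created instance.

```java
import java.lang.reflect.Method;
import java.util.Properties;

/** Hypothetical stand-in for a vendor data source class; not a real driver. */
class SketchDataSource {
    private String serverName;
    private int portNumber;

    public void setServerName(String serverName) { this.serverName = serverName; }
    public void setPortNumber(int portNumber) { this.portNumber = portNumber; }
    public String getServerName() { return serverName; }
    public int getPortNumber() { return portNumber; }
}

/** Illustrative only: builds the "real" data source named in the settings. */
class ConfiguredDataSourceFactory {

    static Object create(Properties settings) throws ReflectiveOperationException {
        // The actual class comes from configuration, so it can be swapped without code changes.
        String className = settings.getProperty("className");
        if (className == null) {
            throw new IllegalArgumentException("missing 'className' setting");
        }
        Object dataSource = Class.forName(className).getDeclaredConstructor().newInstance();

        // Every other setting is applied via a setter of the same name.
        for (String key : settings.stringPropertyNames()) {
            if (!key.equals("className")) {
                applySetter(dataSource, key, settings.getProperty(key));
            }
        }
        return dataSource;
    }

    private static void applySetter(Object target, String property, String value)
            throws ReflectiveOperationException {
        String setter = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method method : target.getClass().getMethods()) {
            if (!method.getName().equals(setter) || method.getParameterCount() != 1) {
                continue;
            }
            Class<?> type = method.getParameterTypes()[0];
            if (type == String.class) { method.invoke(target, value); return; }
            if (type == int.class) { method.invoke(target, Integer.parseInt(value)); return; }
        }
        throw new IllegalArgumentException("no usable setter for '" + property + "'");
    }
}
```

Only `String` and `int` setters are handled here to keep the sketch short; other property types would need their own conversions.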
>> >
>> >
>> >There is some more room for improvement in @DataSourceDefinition though. Specifically there are now vendor specific properties that are supposed to go to the data source (e.g. for the Postgres or MySQL driver), but there is no mechanism for setting vendor specific properties for the container (e.g. for JBoss or GlassFish). Things like transaction recovery or some advanced pooling settings are intended for the container, not the data source, but there is currently no good way to configure that other than by some naming convention.
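
One way to picture the naming-convention workaround mentioned above is a prefix that marks properties as container settings, with everything else passed on to the data source. The `container.` prefix and the class name `DataSourcePropertySplitter` are invented for this sketch; no spec or vendor is known to define this exact convention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: separates container settings from data source settings by key prefix. */
class DataSourcePropertySplitter {

    // Hypothetical prefix marking a property as intended for the container.
    static final String CONTAINER_PREFIX = "container.";

    /** Returns the container settings (pooling, recovery, ...) with the prefix stripped. */
    static Map<String, String> forContainer(Map<String, String> all) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : all.entrySet()) {
            if (entry.getKey().startsWith(CONTAINER_PREFIX)) {
                result.put(entry.getKey().substring(CONTAINER_PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    /** Returns the settings without the prefix, which are meant for the data source itself. */
    static Map<String, String> forDataSource(Map<String, String> all) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : all.entrySet()) {
            if (!entry.getKey().startsWith(CONTAINER_PREFIX)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

The weakness of any such convention is that it lives outside the spec, so each container would have to agree on the prefix; a dedicated attribute or element would avoid that.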
>> >
>> >
>> >Long story short (TL;DR):
>> >
>> >
>> >* @DataSourceDefinition is definitely used in production
>> >
>> >* Configuration is an issue, but can be solved today. (I hope that config JSR does this even better)
>> >* Room for general improvements
>> >
>> >
>> >Kind regards,
>> >Arjan
>> >
>> >I'd urge platform EG and other EGs
>> >>in Java EE 8 to strongly consider adding a similar annotation.
>> >>
>> >>Cheers
>> >>Arun
>> >>
>> >>--
>> >>http://blog.arungupta.me
>> >>http://twitter.com/arungupta
>> >>