Hello Expert group and users,
I'm forwarding a conversation I started with the JSR 317 group back in
April 2010 about a manual flush mode idea. Those of us who have been
following JPA have known this has been a requested feature since JPA 1.0.
I've scanned the archives of the expert mailing list and the user list and
didn't see anything that seemed to reflect this conversation. If
this has already been addressed, my apologies.
Forwarded conversation
Subject: Features for the next JPA
------------------------
From: *Jason Porter* <lightguard.jp_at_gmail.com>
Date: Wed, Apr 14, 2010 at 12:25
To: jsr-317-feedback_at_sun.com
FlushMode.MANUAL
Reasoning: Imagine I have a wizard like form that takes you through a
series of steps. You can cancel the wizard at any time. If you cancel and
don't complete the process either the records that were persisted need to
be removed, or with a manual flush option I can flush once at the end of
the wizard (keeping state in a conversation or a session) and just write to
the database once. That's the power of having a manual flush. I'm not done
with my unit of work, so don't write anything until I tell you to, when I'm
done with the unit of work.
It would be the same in a desktop / fat client application: why commit
things to the database and possibly have to clean them out if I don't need to?
Standardization of audit log / versioning API
Reasoning: Both Hibernate (via Envers) and Eclipselink have this ability.
It also comes up very frequently in enterprise applications, and we have to
look to non-standard solutions or roll our own. A standard API would be
very helpful.
--
Jason Porter
Software Engineer
Open Source Advocate
PGP key id: 926CCFF5
PGP key available at: keyserver.net, pgp.mit.edu
----------
From: *Evan Ireland* <eireland_at_sybase.com>
Date: Wed, Apr 14, 2010 at 14:41
To: Jason Porter <lightguard.jp_at_gmail.com>
Cc: jsr-317-feedback_at_sun.com
Jason,
Agreed on FlushMode.MANUAL.
I'm not convinced regarding auditing. Yes, it comes up in many enterprise
applications, but the requirements vary considerably from system to system.
What did you have in mind with regard to a versioning API?
----------
From: *Jason Porter* <lightguard.jp_at_gmail.com>
Date: Wed, Apr 14, 2010 at 15:16
To: Evan Ireland <eireland_at_sybase.com>
Cc: "<jsr-317-feedback_at_sun.com>" <jsr-317-feedback_at_sun.com>
I'm only familiar with Envers, so I'm not sure what other solutions do.
Envers creates a duplicate of your table(s) with additional columns such as
change date. Whenever a change happens to your table(s), an entry is put
into the other table. This also allows for a rudimentary audit log (if
other data such as the user that made the change could be added).
Another solution (though it kinda reinvents what some dbs already do) would
be to store the SQL that was run; that would allow a replay log to some
degree.
It also would allow for a restore of previous states if needed (not sure if
that would/should be part of a spec atm).
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Wed, Apr 14, 2010 at 17:39
To: Jason Porter <lightguard.jp_at_gmail.com>
Cc: jsr-317-feedback_at_sun.com
Hi Jason,
While I agree with the use case of a longer lived application transaction,
a FlushMode.MANUAL option is not the right solution for the problem,
particularly in the context of multiple resource JTA transactions. A
longer-lived pseudo-transactional context should be transactionally
consistent and have expected transactional semantics. A FlushMode.MANUAL
option (of the variety promoted by some people/products) on the persistence
context is not the solution to what you are describing and does not exhibit
the correct transaction behavior (in your example multiple JTA transactions
may have been committed in the interim since the wizard started, and
deciding to simply not persist the entities at any time can render a
potentially inconsistent result in those transactions).
The solution is to have a proper application transaction mechanism that
behaves as one might expect an application transaction to behave. This
should involve more than just the persistence layer, although if the
primary use case is that only a single persistence resource is needed/used
then we might be able to come up with a special transaction flag, or add a
wonky option to JTA to allow for that. Not convinced that would be very
nice, though.
----------
From: *Evan Ireland* <eireland_at_sybase.com>
Date: Wed, Apr 14, 2010 at 17:45
To: michael.keith_at_oracle.com, Jason Porter <lightguard.jp_at_gmail.com>
Cc: jsr-317-feedback_at_sun.com
Mike,
If you are working with versioned entities, then an application transaction
that spans multiple JTA transactions can still ensure repeatable read
semantics for any updated entities. So can other OCC schemes such as
"compare all columns" at update time.
If you use non-versioned entities, then I would agree that you may be asking
for trouble.
This technique of using version (timestamp) or all-columns verification for
updates has been very successful and well-proven in PowerBuilder for many
years now.
Perhaps you could elaborate on the kinds of inconsistency that you are
concerned about that could crop up for versioned entities?
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Wed, Apr 14, 2010 at 18:00
To: Evan Ireland <eireland_at_sybase.com>
Cc: Jason Porter <lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
The audit usecase was given, so let's take that as an example. Say the
audit is being written to an entirely different database using JDBC, or
even sent to a remote audit log system through JMS. At some intermediate
phase the wizard causes some entity changes to an extended persistence
context that is set to only flush MANUALly. In the same transaction the
audit entry indicating the change to that entity gets sent off through a
transactional JMS queue to the remote audit system. The JTA transaction
commits (successfully) and the JMS message gets sent off. Further
downstream of the wizard (after X more steps and Y more JTA transactions
successfully committing) Joe user decides he doesn't really want to finish
what he started so he hits the cancel button in the wizard, causing all of
the changes in the persistence context to be discarded and not be flushed.
The transactional no-no is that JTA transactions are being committed
assuming that the entity changes have also been committed, but
FlushMode.MANUAL is delinquently ignoring the commit and assuming its own
control over the persistent data, ignoring the real transaction scope in
favour of its own artificially created scope. Atomicity, consistency,
durability -- all gone.
----------
From: *Jason Porter* <lightguard.jp_at_gmail.com>
Date: Wed, Apr 14, 2010 at 19:05
To: "michael.keith_at_ORACLE.COM" <michael.keith_at_oracle.com>
Cc: Evan Ireland <eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <
jsr-317-feedback_at_sun.com>
I can understand your point Mike, though the idea (at least as I use it) of
a MANUAL flush mode is my entity/entities being used in the wizard would
never touch a persist until the full use case is complete. IOW there would
be a bunch of POJO entities around in memory ready to be persisted, but
wouldn't be persisted until told to.
If I'm dealing with persisted entities that will be updated, then yes there
is a risk of all the transaction goodness being thrown out the window, but
like Evan was saying versioned entities, or compare all columns should take
care of that. Might need to do a merge first, or similar operation to
guarantee the data hasn't been changed since the initial find.
The point I keep thinking about: are the spec and the implementation smarter
than I am about when I want/need to persist to the database for my
application? I would really question an answer of yes. Kind of like
framework x saying it knows better than I do when to send something to a
view layer.
----------
From: *Gavin King* <gavin.king_at_gmail.com>
Date: Wed, Apr 14, 2010 at 19:12
To: michael.keith_at_oracle.com
Cc: Evan Ireland <eireland_at_sybase.com>, Jason Porter <
lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
I don't see how a "proper application transaction mechanism" would
solve the problem you just described, unless you're planning on
re-engineering JTA and JDBC.
JDBC will commit any SQL you send it directly either immediately (in
autocommit mode), or when the JDBC connection is committed. So
assuming that you're not going to hold the JDBC transaction open
across a user interaction (which is the whole point of the usecase)
then any "proper application transaction mechanism" you can come up
with is going to suffer from the problem that inserts/updates/deletes
executed directly against the JDBC connection are going to be
immediately executed. That's JDBC. If you want some other behavior,
you're going to have to wrap JDBC in some other technology that
implements write-behind.
Of course, the solution to the problem is to simply *not* execute
inserts/updates/deletes directly against JDBC until the end of the
application transaction. You queue them until the end of the
application transaction. It turns out that this is very easy to do
with JDBC because JDBC is a very simple technology that doesn't do too
many things automagically. The problem is that today it is NOT easy to
do using JPA (due to automatic dirty checking and automatic flushing),
and that is why people want FlushMode.MANUAL.
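To make the queue-until-the-end idea above concrete, here is a minimal sketch in plain Java, with no JPA or JDBC dependency. The names DeferredWriteQueue, enqueue, cancel and flush are hypothetical and exist only for this illustration; the queued strings stand in for whatever statements the application would eventually send over a JDBC connection.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: holds write operations until the application transaction ends. */
class DeferredWriteQueue {
    private final List<String> pending = new ArrayList<>();

    /** Record a statement without executing it. */
    void enqueue(String sql) {
        pending.add(sql);
    }

    /** The user cancelled: drop everything. Nothing was ever sent to the database. */
    void cancel() {
        pending.clear();
    }

    /** End of the application transaction: return the queued statements in order and empty the queue. */
    List<String> flush() {
        List<String> toExecute = new ArrayList<>(pending);
        pending.clear();
        return toExecute;
    }

    /** Number of statements still waiting. */
    int size() {
        return pending.size();
    }
}

class WizardDemo {
    public static void main(String[] args) {
        DeferredWriteQueue queue = new DeferredWriteQueue();
        // Each wizard step only records its intent.
        queue.enqueue("INSERT INTO customer (id, name) VALUES (1, 'Joe')");
        queue.enqueue("UPDATE customer SET name = 'Joseph' WHERE id = 1");
        // Only at the final step would the statements actually be executed.
        for (String sql : queue.flush()) {
            System.out.println("would execute: " + sql);
        }
    }
}
```

With plain JDBC the application owns this queue, so cancelling the wizard is just a call to cancel(). The difficulty described above is that JPA's automatic flushing takes that decision away from the application.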
I don't have any problem with disguising FlushMode.MANUAL as a "proper
application transaction mechanism", but let's not deceive ourselves
that there's any deep underlying difference there. It's the same
thing, just characterized slightly differently from the user point of
view.
(That is, unless you want to re-engineer JTA/JDBC and build a JPA-ish
notion of write-behind into these technologies. Which I imagine is
pretty much not going to happen.)
--
Gavin King
gavin.king_at_gmail.com
http://in.relation.to/Bloggers/Gavin
http://hibernate.org
http://seamframework.org
----------
From: *Gavin King* <gavin.king_at_gmail.com>
Date: Wed, Apr 14, 2010 at 19:16
To: Jason Porter <lightguard.jp_at_gmail.com>
Cc: "michael.keith_at_ORACLE.COM" <michael.keith_at_oracle.com>, Evan Ireland <
eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <jsr-317-feedback_at_sun.com>
The thing is, a LOT of the time, JPA is smarter (or at least as smart)
as you. But there simply are *enough* cases where the JPA "magic"
(automatic dirty checking and automatic flushing) really does get in the
way of people who know what they are trying to do, and today JPA
offers no recourse for those people, other than "go off and work
directly against JDBC". FlushMode.MANUAL would go a *long* way to
giving people the extra control they need for the cases where JPA isn't
smarter than you.
----------
From: *Evan Ireland* <eireland_at_sybase.com>
Date: Wed, Apr 14, 2010 at 19:21
To: Gavin King <gavin.king_at_gmail.com>, michael.keith_at_oracle.com
Cc: Jason Porter <lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
I concur with this.
Assuming any JMS operations might need to be executed (in Mike's example)
one would expect they would also be deferred until the flush occurs. It
doesn't make sense to me that someone would defer an entity update until
flush, but not defer a JMS message send that is driven by the deferred
update.
----------
From: *Gavin King* <gavin.king_at_gmail.com>
Date: Wed, Apr 14, 2010 at 19:25
To: Evan Ireland <eireland_at_sybase.com>
Cc: michael.keith_at_oracle.com, Jason Porter <lightguard.jp_at_gmail.com>,
jsr-317-feedback_at_sun.com
Yeah, I wrote about JDBC, but I think the JMS case is even easier to
see. The user is deliberately delaying their flush. I don't see why
they would not also deliberately delay sending the message.
And the same reasoning applies that a "proper application transaction
mechanism" would not solve the problem unless you were able to build
some kind of "delayed send" mechanism into JMS.
----------
From: *Jason Porter* <lightguard.jp_at_gmail.com>
Date: Wed, Apr 14, 2010 at 19:31
To: Gavin King <gavin.king_at_gmail.com>
Cc: "michael.keith_at_ORACLE.COM" <michael.keith_at_oracle.com>, Evan Ireland <
eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <jsr-317-feedback_at_sun.com>
Sure, a lot of the automagic that happens in JPA (like you said, dirty
checking, cascading persist / update, etc.) is smarter than what I'd come up
with on my own, but like you also said, it doesn't fit all the time; there
certainly are cases where it is not. As a user, I
would like to be given the chair and enough rope to hang myself should I
desire. Of course I also understand we want to protect users from doing so
most of the time, but reverting back to JDBC really shouldn't be the final
answer, IMO.
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Wed, Apr 14, 2010 at 20:24
To: Jason Porter <lightguard.jp_at_gmail.com>
Cc: Evan Ireland <eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <
jsr-317-feedback_at_sun.com>
Although the optimistic locking problem is certainly accentuated it will
not cause problems in the database or in the transaction consistency. The
problem is that the persistence context is pretending to be participating
in the JTA transaction, but is not really doing so. It is enlisted, and
listening for synchronization, but unless someone happens to have manually
flushed the data contained in the PC that data will not be written out or
be committed as part of the tx that it was changed in. This violates the
rules of the tx resource contract and can cause tx inconsistency.
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Wed, Apr 14, 2010 at 20:27
To: Gavin King <gavin.king_at_gmail.com>
Cc: Evan Ireland <eireland_at_sybase.com>, Jason Porter <
lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
An application transaction would solve it because it would provide correct
application transaction semantics. This is not a new concept, and has been
talked about in the past. As I mentioned, it could involve some additional
features in JTA (e.g. a nested transaction feature) as well as some
additional support from the EJB layer, but I wouldn't call it
re-engineering. Well, I know that you don't have any problem with
disguising FlushMode.MANUAL as an application transaction since that is
what you have been doing :-). I claim that there is a very real difference,
though, and described why in my last email. It simply does not exhibit
application transaction semantics in conjunction with the rest of the
transactional resources on the server. They just aren't playing the same
transactional game.
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Wed, Apr 14, 2010 at 20:37
To: Gavin King <gavin.king_at_gmail.com>
Cc: Evan Ireland <eireland_at_sybase.com>, Jason Porter <
lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
That is essentially proposing that people should not even use the existing
JTA transaction in the EJB where they are making the changes to the
entities. They should all be aware that the JTA transaction is fake and
they should queue up all of their operations to additional transactional
resources (like JMS) until some magic application signal occurs, at which
time they must let it all out. That isn't an application transaction, it's
just a description of how someone is supposed to only partly use the
existing JTA transactions, but not really use them because if they do then
they will be hosed ;-).
Maybe it comes down to what one's view of an EntityManager/persistence
context is. If someone feels that it is an application structure then it
might make sense to control whether it gets sent to the DB. However, I
think it is more of an integrated framework with the system: if you don't
want to write something out then don't keep it in the transactional
persistence context (that is advertised as being integrated with JTA) when
the transaction commits.
----------
From: *Evan Ireland* <eireland_at_sybase.com>
Date: Wed, Apr 14, 2010 at 20:44
To: michael.keith_at_oracle.com, Jason Porter <lightguard.jp_at_gmail.com>
Cc: jsr-317-feedback_at_sun.com
Mike,
Not writing data does not cause tx inconsistency. A read-only transaction
for example doesn't write data, and that doesn't cause tx inconsistency.
If we were to allow FlushMode.MANUAL, we would also need to stipulate that
if you delete or update (via flush) a versioned object in one transaction
that was loaded in a previous transaction, that an appropriate version check
is done by the flushing transaction. That version check (if done properly)
will ensure repeatable read semantics, just as if the data had been loaded
in the current (flushing) transaction.
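The version check described above can be sketched as follows. This is an illustrative in-memory model, not JPA code: VersionedStore, Row and updateIfUnchanged are hypothetical names, and the map stands in for a table with a version column.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a table that carries a version column. */
class VersionedStore {
    /** A row value together with its version number. */
    static final class Row {
        final String value;
        final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Integer, Row> rows = new HashMap<>();

    /** New rows start at version 1. */
    void insert(int id, String value) {
        rows.put(id, new Row(value, 1));
    }

    Row read(int id) {
        return rows.get(id);
    }

    /**
     * Mimics "UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?".
     * Returns false when the row is missing or was changed since it was read.
     */
    boolean updateIfUnchanged(int id, String newValue, long expectedVersion) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        rows.put(id, new Row(newValue, current.version + 1));
        return true;
    }
}

class VersionCheckDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert(1, "draft");
        // Loaded in an earlier transaction; the wizard keeps the version it saw.
        long seenVersion = store.read(1).version;
        // Meanwhile another transaction updates the same row.
        store.updateIfUnchanged(1, "changed elsewhere", seenVersion);
        // The deferred flush carries the stale version, so the check rejects it.
        boolean flushed = store.updateIfUnchanged(1, "wizard result", seenVersion);
        System.out.println("deferred flush applied: " + flushed);
    }
}
```

A rejected update here plays the role of an optimistic lock failure: the flushing transaction learns that the data it loaded earlier is no longer current and can refuse to write.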
You might want to point out an actual anomaly that would be possible in this
case rather than claiming inconsistency without supporting evidence.
----------
From: *Emmanuel Bernard* <ebernard_at_redhat.com>
Date: Thu, Apr 15, 2010 at 01:54
To: Evan Ireland <eireland_at_sybase.com>
Cc: Gavin King <gavin.king_at_gmail.com>, michael.keith_at_oracle.com, Jason
Porter <lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
Note that if we add a post flush event listener concept in JPA, we can
indeed control when to release all these heterogeneous operations.
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Thu, Apr 15, 2010 at 06:36
To: Evan Ireland <eireland_at_sybase.com>
Cc: Jason Porter <lightguard.jp_at_gmail.com>, jsr-317-feedback_at_sun.com
Evan, I'm not sure why you keep bringing up entity versioning issues. There
are no problems with entity versioning, and even though keeping them cached
for longer before writing them out will increase the likelihood of getting
an optlock exception our version checking will still work. I am not talking
about data inconsistency, but transactional inconsistency or integrity due
to the atomicity property being violated. I gave evidence in my previous
email when I described a JMS message that got written and committed, but
the persistence context writes were suppressed so data modified in the same
transaction did not get committed.
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Thu, Apr 15, 2010 at 07:18
To: Jason Porter <lightguard.jp_at_gmail.com>
Cc: Evan Ireland <eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <
jsr-317-feedback_at_sun.com>
I just thought of another potential solution to this problem. Currently we
generally require that container-managed persistence contexts be
synchronized with the transaction. We allow regular transactional PCs, or
PCs of the extended variety, but if we added a new option for
non-synchronized PCs then all of this stuff could occur outside of the JTA
transactions and then when the time comes the joinTransaction call could be
used (there are already cases when it is required of C-M PCs) to register
the PC with the transaction. This might solve the problem for everybody so
that the objects stay managed, multiple JTA transactions may occur and
transactional integrity is not compromised in any way because the EM is
essentially non-transactional while it is non-synchronized.
One could achieve this by doing something like the following:
@PersistenceContext(type=NON_SYNCHRONIZED)
EntityManager em;
Anyway, my point is that I believe there are better ways to solve this
problem (that clearly needs to be solved) and I am happy to include it on
our list if other members of the group are as well.
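The intended behaviour of such a non-synchronized persistence context can be modelled with a small toy in plain Java. This is only a sketch of the semantics described above, not the JPA API: ToyPersistenceContext and ToyTransaction are hypothetical classes, and the only idea borrowed from JPA is that of joining a transaction explicitly.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a persistence context that is not synchronized by default. */
class ToyPersistenceContext {
    private final List<String> pendingChanges = new ArrayList<>();
    private boolean joined = false;

    /** Record a change to a managed object; nothing is written yet. */
    void change(String description) {
        pendingChanges.add(description);
    }

    /** Opt in to the current transaction, in the spirit of joinTransaction(). */
    void joinTransaction() {
        joined = true;
    }

    boolean isJoined() {
        return joined;
    }

    /** Called by the toy transaction at commit; returns what was written and resets the context. */
    List<String> writePending() {
        List<String> written = new ArrayList<>(pendingChanges);
        pendingChanges.clear();
        joined = false;
        return written;
    }

    int pendingCount() {
        return pendingChanges.size();
    }
}

/** Toy transaction: at commit it writes only a context that has joined it. */
class ToyTransaction {
    static List<String> commit(ToyPersistenceContext pc) {
        return pc.isJoined() ? pc.writePending() : new ArrayList<>();
    }
}

class NonSynchronizedDemo {
    public static void main(String[] args) {
        ToyPersistenceContext pc = new ToyPersistenceContext();
        pc.change("step 1: create order");
        // An intermediate transaction commits; the context has not joined, so nothing is written.
        System.out.println("written by intermediate commit: " + ToyTransaction.commit(pc).size());
        pc.change("step 2: add line item");
        // Final wizard step: join and commit, so all accumulated changes go out together.
        pc.joinTransaction();
        System.out.println("written by final commit: " + ToyTransaction.commit(pc).size());
    }
}
```

In this model the intermediate transactions never claim to include the context's changes, which is the difference from a flush mode that silently skips a commit the context is enlisted in.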
----------
From: *Jason Porter* <lightguard.jp_at_gmail.com>
Date: Thu, Apr 15, 2010 at 07:53
To: "michael.keith_at_oracle.com" <michael.keith_at_oracle.com>
Cc: Evan Ireland <eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <
jsr-317-feedback_at_sun.com>
That sounds like a reasonable solution.
----------
From: *Emmanuel Bernard* <ebernard_at_redhat.com>
Date: Thu, Apr 15, 2010 at 09:35
To: michael.keith_at_oracle.com
Cc: Jason Porter <lightguard.jp_at_gmail.com>, Evan Ireland <
eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <jsr-317-feedback_at_sun.com>
If in the NON_SYNCHRONIZED semantic we also imply that flush must not occur
then that could work. But from your wording it seems that only the Tx
registerSync would be affected which unfortunately is not enough as the PC
could decide to do early flushes and ruin the whole party.
----------
From: *Gordon Yorke* <gordon.yorke_at_oracle.com>
Date: Thu, Apr 15, 2010 at 09:50
To: Emmanuel Bernard <ebernard_at_redhat.com>
Cc: michael.keith_at_oracle.com, Jason Porter <lightguard.jp_at_gmail.com>, Evan
Ireland <eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <
jsr-317-feedback_at_sun.com>
This EntityManager would not be in a transaction so the current rules would
apply (TransactionRequiredException).
We would also want to follow the propagation rules of an Extended
Persistence Context.
--Gordon
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Fri, Apr 16, 2010 at 07:00
To: Emmanuel Bernard <ebernard_at_redhat.com>
Cc: Jason Porter <lightguard.jp_at_gmail.com>, Evan Ireland <
eireland_at_sybase.com>, "jsr-317-feedback_at_sun.com" <jsr-317-feedback_at_sun.com>
Yes, that was the point. A manual flush mode could make sense in this new
kind of persistence context.
----------
From: *Mike Keith* <michael.keith_at_oracle.com>
Date: Fri, Apr 16, 2010 at 07:16
To: Gordon Yorke <gordon.yorke_at_oracle.com>
Cc: Emmanuel Bernard <ebernard_at_redhat.com>, Jason Porter <
lightguard.jp_at_gmail.com>, Evan Ireland <eireland_at_sybase.com>, "
jsr-317-feedback_at_sun.com" <jsr-317-feedback_at_sun.com>
The TransactionRequiredException is conveniently defined to be thrown on a
persistence context type of TRANSACTION, so operations like persist() will
be fine on a NON_SYNCHRONIZED type. Agreed on the propagation.
We can discuss the details, though, when the group starts up in earnest.
--
Jason Porter
http://lightguard-jp.blogspot.com
http://twitter.com/lightguardjp
Software Engineer
Open Source Advocate
Author of Seam Catch - Next Generation Java Exception Handling
PGP key id: 926CCFF5
PGP key available at: keyserver.net, pgp.mit.edu