Implementation Improvements in Entity-Persistence

From: Markus KARG <markus.karg_at_gmx.net>
Date: Tue, 03 Oct 2006 12:37:51 +0200

I have found some potential implementation and code style improvements
in entity-persistence that the GF team might be interested in. Please
not that I don't want anyone to blame or be offensive. I'm a funny guy
and most of the following is sarcasm. But it's no joke, it's real ideas
for improvement.

(1) Do not have vendor specifics in DatabasePlatform.java

    /**
     * Some database require outer joins to be given in the where
clause, others require it in the from clause.
     * Informix requires it in the from clause with no ON expression.
     */
    public boolean isInformixOuterJoin() {
        return false;
    }

That code is found in DatabasePlatform.java. In fact, the design of
TopLink should be unaware of the INFORMIX product or its special
problems. It's no good design to make all platform implementations know
a method named "isInformatixOuterJoin".

(2) Don't ask for behaviour but delegate business!

I have seen that TopLink is doing a lot of heavy work to find out how
simple statements are written in the specific SQL dialect. For that, a
lot of "supports" and "getPrefix" stuff is called. Actually this makes
it quite hard to implement new platforms. I don't understand why TopLink
likes to know as much about the underlying platform instead of just
delegating business to the platform implementation itself (e. g. by
using SQL Templates internally instead of putting together SQL at
runtime by querying lots of methods).

It would be much, much easier (maybe faster executing, too?) to simplify
the interface DatabasePlatform by viewing it from a higher level of
view. Just name the methods by the business you want it to do, but do
not name it by the behaviour of the underlying implementation.

For example, do not have methods like "supports", "getPrefix", etc. that
are querying the behaviour of the underlying platform. Instead have
methods like those, issuing a clear command to the underlying platform:

String Database.createTable(TableDefinition table);
Integer Sequence.getNextValue();

(The actual assembly of SQL templates and its execution is delegated to
the platform implementation).

This is easier to write, easier to read, easier to test, easier to
understand, easier to maintain... So why using complexity when there is
no need for?

If you decide NOW to change the interface, its an effort of days to
overhaul the existing implementations but you will have new database
implemented by one day each. If you keep the current interface, it will
be several days each. The more implementations exist, the lower you
chance is to simplify or refactor the interface in future.

(3) Do not use Hastable if synchronization is not an issue

You're using Hashtable for several maps (e. g. SQL type names etc.).
Actually those types never will get changed at runtime, so
synchronization is no issue. So I do not understand why you still cope
with synchronized Hashtable when there is a Collection framework since
Java 1.3? Please remove Hashtable from your Java brain. ;-)

Since Java 1.3 the typical hash table will be created by

final Map keys = new HashMap();

instead of

Hashtable keys = new Hashtable();

If you like to know why, check the release notes of Java 1.3, or check
Collection API javadocs. ;-)

(4) Do you like overwriting results incidentially?

Is it intention that you leave all variables and method arguments as
writeable, while almost all of them are never overwritten? For example,
all arguments to parameters are not final, but are never overwritten.
This opens the possibility of incidentially overwriting the content.
There is final keyword been invented years ago. So why not using it to
prevent that potential failure? This kind of failure is one of the most
hard to find ones. So as you will mark anything as private, you should
mark any parameters as final, unless you actually want to overwrite it
(BTW, Fowler et al say, do not overwrite them but keep them final and
invent another variable with a new name for the overwritten value).

(5) Do you think, Single Line Blocks are too short?

Don't you like the shortness of single line blocks? Or are we all Java
beginners? ;-) Otherwise I cannot understand why you keep writing the
ending bracket in single-line blocks (e. g. FORs and IFs). This makes
the code harder to read and longer than needed. Java allows omitting
trailing brackets in single line blocks by intention, not by fault. ;-)

(6) Do you hate Shortcut Operators?

Do you love long code lines? Or are we all Java beginners? ;-) Otherwise
I cannot understand why you write this:

nBoundParameters = nBoundParameters + appendParameterInternal(call,
writer, new Integer(theObjects[i]));

instead of that:

nBoundParameters += appendParameterInternal(call, writer, new
Integer(theObjects[i]));

(7) Explicit Names

Do you like cryptic names? Or what is the reason why you are using
"nBoundParameters" instad of "boundParametersCount"?

One need to guess what the n means while it is clear what Count means.
Also one could think that "nBoundParameters" is a typo form of
"unboundParameters" (what would be the reverse). Clear names are best
for everyone. Cryptic or abbreviated names are best for the original
authors only.

(8) Did you know that some people are using IDEs that do not mark static
methods by another colour / font?

You are neither using the this prefix nor the class prefix. As a result,
people not using Eclipse IDE etc. cannot see whether
appendParameterInternal(...) is a static or non-static method. That's
why some IDEs ask you to add either the this prefix or the class prefix.
Don't you want those users without Eclipse to be able to read and
understand your code easily? Or is there another good reason not to add
the this or class prefix?

(9) DatabasePlatform implements DatabasePlatform

I don't know the historical reason for that, but from a design aspect
this is not very nice. It would be a much more clean design to have:

interface DatabasePlatform {...}

abstract class AbstractDatabasePlatform implements DatabasePlatform {...}

final class DefaultDatabasePlatform extends AbstractDatabasePlatform {...}

final class MaxDBPlatform extends AbstractDatabasePlatform {...}

The automatic driver detection will then use either MaxDB etc. if the
driver got detected, otherwise would use DefaultDatabasePlatform. The
names / design is so clear that everyone would understand that, why
actually till today I do not understand why there is the current design
and what the difference between those two DatabasePlatform classes is...

(10) Delegate driver detection to the platform

It's unclear why the mapping of regex and driver class is in a
properties file. If you want to learn GF about new platforms, you need
to compile entity-persistence anyways, so you can just turn that
properties file into a class "PlatformDetector" (which will run much
faster than interpreting a properties file).

But a much better idea is to not have "PlatformDetector" at all but
instead delegate the detection into the platform itself. When booting
GF, just create an instance of each implementation of DattabasePlatform
found (since GF needs several seconds to boot, and since a platform
doesn't have a large footprint, this will not hurt much). When a
connection factory is set up, provide the connection to each platform
("public boolean isSupported(Connection);"). The platform will tell you
whether the connection is supported or not. Then you know which platform
to use with that connection. The platform internally can use any
mechanism it likes to find out whether to support that connection. You
are not limited to regex and DatabaseProductName then.

There might be more ideas while I stepped further into DatabasePlatform.
Stay tuned for more "code blasphemy" soon. >;-)

Have Fun
Markus

application/x-pkcs7-signature attachment: S/MIME Cryptographic Signature