Hi bobxu,
As you probably noticed, your question is a bit too general to give a good answer. If your question is about choosing between JPA (an ORM) and plain JDBC, I think you first have to ask yourself whether your domain is well suited to an object-oriented approach. This could be the case when you have many entities with many complex relationships. Or, the other way around, when you are using JDBC simply to populate your objects. If this is the case, you are probably better off using an ORM for the following reasons:
1) The ORM handles all the plumbing code of persisting your object model to the database, which saves you quite some time;
2) All communication with the datastore is done in a consistent and probably thoroughly tested way;
3) The added layer of abstraction helps you create complex queries. You can think in terms of queries against your objects, and the ORM translates these for you, possibly into multiple SQL queries, to retrieve all the actual data.
However, since JPA is an added layer of abstraction, I think the biggest problem you will encounter is that you know how to optimize a certain query, but JPA doesn’t let you because it does not provide that feature yet. The maturity of the ORM in terms of the features you use is therefore very important. I personally use JDO for all my persistence. It has been around since 2000 and has grown very mature since then. To give you an example regarding performance which I encountered: in JDO1 you could only delete objects explicitly. This is of course a major performance hit, since you first have to retrieve the object before you can delete it. In JDO2, therefore, you can use a JDO query which deletes the objects without retrieving and instantiating them.
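To make the delete example a bit more concrete, here is a rough sketch at the plain JDBC level of the two strategies: loading every matching row first and deleting them one at a time (roughly what an explicit-delete-only API forces on you), versus a single bulk DELETE (roughly what a delete-by-query can boil down to). The table "orders" and its columns "id" and "status" are made up for illustration, and the SQL an actual ORM generates will differ per implementation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class DeleteStrategies {

    // Retrieve first, then delete row by row: 1 SELECT plus N DELETE statements.
    // "orders", "id" and "status" are hypothetical names.
    static int deleteOneByOne(Connection con, String status) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement select =
                 con.prepareStatement("SELECT id FROM orders WHERE status = ?")) {
            select.setString(1, status);
            try (ResultSet rs = select.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1)); // every match is pulled into memory
                }
            }
        }
        int deleted = 0;
        try (PreparedStatement delete =
                 con.prepareStatement("DELETE FROM orders WHERE id = ?")) {
            for (long id : ids) {
                delete.setLong(1, id);
                deleted += delete.executeUpdate();
            }
        }
        return deleted;
    }

    // Delete by query: a single statement, nothing is loaded or instantiated.
    static int deleteInBulk(Connection con, String status) throws SQLException {
        try (PreparedStatement delete =
                 con.prepareStatement("DELETE FROM orders WHERE status = ?")) {
            delete.setString(1, status);
            return delete.executeUpdate();
        }
    }
}
```

With millions of rows the first variant pays for a round trip per row plus the memory to hold what it loaded, which is exactly the kind of cost you want your ORM to let you avoid.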
Now in your case I think it is essential that you test how JPA (or any other ORM) handles millions of records, both performance- and memory-wise. Or better, whether you can tune it such that you find a good balance between performance and memory usage. To give an example: is it possible to configure whether size() of a collection (or query result) uses a count(*) to determine the size, or simply scrolls to the last row of the JDBC result set? The first is better suited for very large collections; the latter is probably faster for small queries.
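For what it is worth, this is roughly what those two size() strategies look like when written directly against JDBC. It is only a sketch: the table "orders" is a made-up name, and whether (and how) your ORM lets you pick between the two is something you will have to look up for the implementation you choose.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SizeStrategies {

    // Strategy 1: let the database count; only a single number is sent back.
    // "orders" is a hypothetical table name.
    static long sizeByCount(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM orders")) {
            rs.next();
            return rs.getLong(1);
        }
    }

    // Strategy 2: run the real query with a scrollable result set, jump to
    // the last row and read its row number.
    static int sizeByScrolling(Connection con) throws SQLException {
        try (Statement st = con.createStatement(
                 ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
             ResultSet rs = st.executeQuery("SELECT * FROM orders")) {
            // last() returns false when the result set is empty
            return rs.last() ? rs.getRow() : 0;
        }
    }
}
```

The scrolling variant only pays off when you are going to read those rows anyway. Depending on the driver, a scrollable result set may pull all rows to the client, which is what hurts with millions of records.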
Hope this helps in decision making,
Christiaan
[Message sent by forum member 'christiaan_se' (christiaan_se)]
http://forums.java.net/jive/thread.jspa?messageID=286141