users@glassfish.java.net

Re: Running Java Applications as a back-end batch processes

From: <glassfish_at_javadesktop.org>
Date: Wed, 16 Jul 2008 10:28:35 PDT

> Be careful with this solution. It's not as easy as is
> said. I had a lot of problems with it. If you are
> going to do batch processing with EJB timers, you
> have to be very careful. EJB timer allows you to call
> EJB stateless and stateful session beans and they
> do not allow you to access files. Most of the time
> batch processing is dealing with files.

I call shenanigans.

Yes, the Java EE spec treats file access from an EJB as unsupported: the EJB programming restrictions say a bean should not use java.io to touch the filesystem. However, I don't know of a single container that actually enforces this restriction. The motivation behind it is that EJB is a distributed component framework, and the intent is that EJBs can be deployed "anywhere", particularly in a clustered environment. In those cases, relying on the filesystem means relying on a "resource" that is not managed by the container, and if it's not managed by the container, it's not "portable". If you want to be pure to spec, write a JCA resource adapter to represent your file system.

But the reality is simply that most developers can actually work around the details of "knowing" where a component is deployed (perhaps it's not a clustered system, perhaps it's a configuration variable specifying a correct location, perhaps it's a network filesystem mapped to the path on every node).

Bottom line, files work fine in EJB. It's just not, per spec, "portable".
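
For the configuration-variable case, a minimal sketch might look like the following. It is plain Java rather than an actual bean so that it stands alone; in a real deployment the lookup would sit inside a business or timeout method, and the location might come from JNDI or a deployment descriptor instead. The property name batch.input.dir and the class BatchInputLocator are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class BatchInputLocator {
    // Hypothetical property name; each node would set it to a path
    // that is valid on that node (local disk, NFS mount, etc.).
    static final String DIR_PROPERTY = "batch.input.dir";

    // Resolve the input directory from configuration instead of hard-coding it.
    static Path resolveInputDir() {
        String configured = System.getProperty(DIR_PROPERTY);
        if (configured == null || configured.isEmpty()) {
            throw new IllegalStateException("System property " + DIR_PROPERTY + " is not set");
        }
        Path dir = Paths.get(configured);
        if (!Files.isDirectory(dir)) {
            throw new IllegalStateException(dir + " is not a directory on this node");
        }
        return dir;
    }

    // Read a small input file from the configured directory.
    static List<String> readInput(String fileName) throws IOException {
        return Files.readAllLines(resolveInputDir().resolve(fileName), StandardCharsets.UTF_8);
    }
}
```

The point is only that the code never "knows" where it is deployed; the node's configuration does.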

> Another issue, if you are going to move big amounts
> of data between files or different databases you will
> have a long transaction. All data in that transaction
> will be collected in memory and released only after
> completion. You have to know the amount of data and how
> much memory will be consumed during this process.

More shenanigans. Regarding databases specifically, I don't know of a single modern database whose transactions are memory bound. Typical databases have transactions that are log bound, assuming fixed log sizes are configured for the database. There's no reason for a database to hold pending transactions in memory. That would be insane: it has to write the data anyway.

Should you be concerned how much memory your processes take? Of course, but the database won't be your smoking gun and neither will the EJB transaction subsystem. It doesn't "remember" anything either.
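
If memory on the application side is the worry, the usual answer is to stream the input and hand it off in bounded chunks rather than read it all at once. A rough sketch, assuming line-oriented input; ChunkedFileProcessor, the chunk size, and the sink callback are hypothetical names, and what the sink does with each chunk (a JDBC batch insert, say) is left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ChunkedFileProcessor {
    // Streams the file line by line and passes at most chunkSize lines
    // to the sink at a time. Returns the number of chunks delivered.
    static int processInChunks(Path input, int chunkSize, Consumer<List<String>> sink)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        List<String> buffer = new ArrayList<>(chunkSize);
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() == chunkSize) {
                    sink.accept(new ArrayList<>(buffer)); // hand off a copy
                    buffer.clear();
                    chunks++;
                }
            }
        }
        if (!buffer.isEmpty()) {
            // Deliver the final, possibly shorter, chunk.
            sink.accept(new ArrayList<>(buffer));
            chunks++;
        }
        return chunks;
    }
}
```

Application memory then scales with the chunk size, not the file size, whatever the transaction length turns out to be.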

Most problems with long-running transactions in EJB systems revolve around network connection timeouts from clients. EJB Timers don't suffer from that problem since they're local to the container (unless they themselves are network clients, for example calling a long-running remote EJB session bean).

> Development of batch processing tasks on glassfish
> was problematic and I got a lot of headaches and
> sleepless nights, but in the end it works very well
> and I'm happy with how stable and reliable it is.

I'm glad he's happy with his Spring solution, but I'm very interested in more specifics about what his issues were.
[Message sent by forum member 'whartung' (whartung)]

http://forums.java.net/jive/thread.jspa?messageID=287100