jsr343-experts@jms-spec.java.net

[jsr343-experts] Re: Batch processing...

From: Clebert Suconic <clebert.suconic_at_gmail.com>
Date: Fri, 29 Jul 2011 10:34:46 -0500

Comments below:
> With no timeout then we will get something like:
>
> (no wait)
> onMessages(first 20 messages)
> (100ms elapses)
> onMessages(1 message)
> (100ms elapses)
> onMessages(1 message)
> (100ms elapses)
> onMessages(1 message)
> (100ms elapses)
> onMessages(1 message)
> etc etc
>
> You'll see that this could end up with no batching at all.
>


I agree the timeout is needed, but there are certain use cases where
the user doesn't want a timeout.

The user may want to process anything that's in the buffer as soon as
possible. But in that case the user can just configure timeout = 0.

The user may expect the message consumer to be busy processing
messages. So he configures the consumer to get whatever is in the
buffer, that is, whatever was produced while he was processing the
last batch.

It would be just a configuration option for the user.

Most users will use a timeout, but some will prefer having no timeout.
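
To make the two configurations concrete, here is a minimal sketch in plain Java. It is not part of JMS or of any proposed API: the class BatchBuffer and the method nextBatch are hypothetical names, and a BlockingQueue stands in for the consumer's local buffer. With timeout = 0 the call returns whatever is already buffered; with a positive timeout it waits up to that long for the batch to fill.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of JMS: the queue stands in for the consumer's buffer.
final class BatchBuffer {

    // Collects up to maxSize items. With timeoutMillis == 0 it returns only what is
    // already buffered; otherwise it waits up to timeoutMillis in total for more items.
    static <T> List<T> nextBatch(BlockingQueue<T> buffer, int maxSize, long timeoutMillis)
            throws InterruptedException {
        List<T> batch = new ArrayList<>();
        buffer.drainTo(batch, maxSize); // takes what is available, never blocks
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (batch.size() < maxSize) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // timeout reached (immediately so when timeoutMillis == 0)
            }
            T item = buffer.poll(remaining, TimeUnit.NANOSECONDS);
            if (item == null) {
                break; // nothing arrived before the deadline
            }
            batch.add(item);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
        for (int i = 0; i < 20; i++) {
            buffer.add("msg-" + i);
        }
        // timeout = 0: deliver whatever is buffered right now, without waiting
        System.out.println(nextBatch(buffer, 100, 0).size()); // prints 20
        // nothing buffered and timeout = 0: an empty batch comes back immediately
        System.out.println(nextBatch(buffer, 100, 0).size()); // prints 0
    }
}
```

Whether an empty batch should be delivered at all when timeout = 0 is a design choice this sketch leaves open.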




Like in this real use case:


I know a user who is processing SMS messages sent during a reality
show. The viewers send SMS messages to vote on who should leave the
show, that kind of thing.

At a certain point they need to aggregate the information and add it
to the database.


The DB (Oracle) would keep up well with inserting data, but it would
be impossible for the DB (and hardware) to keep up with a commit for
every message because of hardware syncs, etc.

So, they commit in batches of 5000 messages (both on DB and on JMS).


i.e., they do something like:

while (...)
{
    // BTW, if the user wanted no timeout, he could use receiveImmediate here
    Message msg = consumer.receive(500);
    if (msg == null)
    {
         // The timeout expired: commit what has been processed so far
         updateAggregation();
         commitDB();
         continue; // there is no message to process in this iteration
    }
    count++;
    aggregation += getSomeDataFromMessage(msg);
    insertSomeDataOnDB();
    if (count % 5000 == 0) // batch size as described above
    {
         updateAggregation();
         commitDB();
    }
}
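
For anyone who wants to run the pattern, here is a self-contained sketch under some loud assumptions: a BlockingQueue of integers stands in for the JMS consumer, counters stand in for the database and session commits, and the class name BatchCommitLoop and its fields are made up for illustration. It commits whenever a full batch has been processed and also flushes the partial batch when the receive times out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Stand-in sketch: no real JMS or database involved, only counters.
final class BatchCommitLoop {
    final int batchSize;
    int uncommitted = 0;  // messages processed since the last commit
    int commits = 0;      // number of commits performed so far
    long aggregation = 0; // running total taken from the messages

    BatchCommitLoop(int batchSize) {
        this.batchSize = batchSize;
    }

    // Processes messages until the queue stays empty for timeoutMillis.
    void run(BlockingQueue<Integer> consumer, long timeoutMillis) throws InterruptedException {
        while (true) {
            Integer msg = consumer.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (msg == null) {
                commit(); // timeout: flush the partial batch
                return;   // the sketch stops here; a real consumer would keep looping
            }
            aggregation += msg;
            uncommitted++;
            if (uncommitted == batchSize) {
                commit(); // full batch
            }
        }
    }

    // Placeholder for committing both the database and the messaging session.
    private void commit() {
        if (uncommitted == 0) {
            return; // nothing to commit
        }
        commits++;
        uncommitted = 0;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> consumer = new LinkedBlockingQueue<>();
        for (int i = 0; i < 12; i++) {
            consumer.add(1);
        }
        BatchCommitLoop loop = new BatchCommitLoop(5);
        loop.run(consumer, 10);
        // 12 messages with a batch size of 5: two full batches plus one flush on timeout
        System.out.println(loop.commits + " commits, total " + loop.aggregation);
    }
}
```

Tracking the number of uncommitted messages rather than a running count modulo the batch size keeps the timeout flush from committing an empty batch.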

They can't use MDBs for this either. My expectation is that the EJB
spec will pick this up as well.

(Another subject is that the MDB API sucks.. it should be a single
API.. or super class for JMS and EE IMO). But that's another story.