jsr343-experts@jms-spec.java.net

[jsr343-experts] Re: Batch processing...

From: emran <emrans_at_pramati.com>
Date: Thu, 28 Jul 2011 20:46:07 +0530

My comments inline:

----- Original Message -----
From: "Nigel Deakin" <nigel.deakin_at_oracle.com>
To: jsr343-experts_at_jms-spec.java.net
Sent: Thursday, July 28, 2011 7:42 PM
Subject: [jsr343-experts] Re: Batch processing...


> Comments below
>
> On 27/07/2011 16:26, Clebert Suconic wrote:
>> I will answer with what is IMHO:
>>
>>> 1a. If the destination contains, say, 20 messages, does it deliver 20
>>> messages or does it wait for a further 30 to arrive before calling
>>> onMessages()?
>>>
>>
>> The implementation would be able to make a callback to the server and
>> verify if the server's destination is empty, and flush the client buffer
>> if it is.
>>
>> We could add support for an optional timeout argument. Say, you would
>> flush if the client buffer didn't get a message in X milliseconds (or
>> seconds).
>
> This isn't about how long we wait for an individual message, it's about how
> long the JMS provider should wait for a full batch of messages to arrive on
> the destination. If the timeout is reached, it delivers whatever it has so
> far.
>

I see an issue here. If the JMS provider attaches the timeout to the
destination, I think we will not be able to handle cases where multiple
consumers specify different timeout values. IMHO, the JMS provider should keep
pushing messages to the "consumer objects" on the server side, and these
"consumer objects" should apply the timeout and then push the messages to
the client-side consumer. The example scenarios that you mentioned below
would still hold with this approach.
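
To make this concrete, here is a very rough sketch of the shape such a
per-consumer batch callback could take. The BatchMessageListener name, the
registration method and the way the batch size and timeout are passed are all
illustrative assumptions on my part (nothing like this exists in the current
API); the only point is that the batch size and timeout belong to the
individual consumer, not to the destination:

    import javax.jms.Message;

    // Illustrative sketch only -- none of these names are in the current API.
    public interface BatchMessageListener {
        // Called with at most batchSize messages; may be called with fewer
        // if the batch timeout expires before the batch is full.
        void onMessages(Message[] messages);
    }

    // Two consumers on the same queue could then use different settings,
    // e.g. (hypothetical registration method and parameters):
    //   consumerA.setBatchMessageListener(listenerA, 50, 50);   // 50 msgs, 50ms
    //   consumerB.setBatchMessageListener(listenerB, 50, 3000); // 50 msgs, 3s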

> Imagine a batch size of 50. Let's also imagine that when we create the
> consumer and register the listener there are 20 messages already on the
> queue, with further messages being added every 100ms
>
> With no timeout then we will get something like:
>
> (no wait)
> onMessages(first 20 messages)
> (100ms elapses)
> onMessages(1 message)
> (100ms elapses)
> onMessages(1 message)
> (100ms elapses)
> onMessages(1 message)
> (100ms elapses)
> onMessages(1 message)
> etc etc
>
> You'll see that this could end up with no batching at all.
>
> If we define a timeout of 50ms then we will get something like:
>
> (wait for 50ms)
> onMessages(first 20 messages)
> (wait for 50ms)
> (wait for 50ms)
> onMessages(1 message)
> (wait for 50ms)
> (wait for 50ms)
> onMessages(1 message)
> (wait for 50ms)
> (wait for 50ms)
> onMessages(1 message)
> etc etc
>
> If we define a timeout of 1s then we will get something like:
>
> (wait for 1 sec)
> onMessages(first 30 messages)
> (wait for 1 sec)
> onMessages(10 messages)
> (wait for 1 sec)
> onMessages(10 messages)
> (wait for 1 sec)
> onMessages(10 messages)
> etc etc
>
> If we define a timeout of 3s then we will get something like:
>
> (wait for 3 secs)
> onMessages(50 messages)
> (wait for 3 secs)
> onMessages(30 messages)
> (wait for 3 secs)
> onMessages(30 messages)
> (wait for 3 secs)
> onMessages(30 messages)
> etc etc
>
> If we define a timeout of 10s then we will get something like:
>
> (wait for 3 secs)
> onMessages(50 messages)
> (wait for 5 secs)
> onMessages(50 messages)
> (wait for 5 secs)
> onMessages(50 messages)
> (wait for 5 secs)
> onMessages(50 messages)
> etc etc
>
> Is this what you have in mind?
>
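
For what it's worth, the "deliver when the batch is full or when the timeout
expires, whichever comes first" rule in these scenarios can be emulated today
with the existing synchronous receive(long) call. The sketch below is only an
illustration of that rule (the class name and the batchSize/batchTimeoutMillis
parameters are my own), not the proposed API:

    import java.util.ArrayList;
    import java.util.List;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;

    public class BatchReceiveSketch {

        // Collect up to batchSize messages, but return whatever has arrived
        // once batchTimeoutMillis has elapsed (possibly an empty list).
        public static List<Message> receiveBatch(MessageConsumer consumer,
                                                 int batchSize,
                                                 long batchTimeoutMillis)
                throws JMSException {
            List<Message> batch = new ArrayList<Message>();
            long deadline = System.currentTimeMillis() + batchTimeoutMillis;
            while (batch.size() < batchSize) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // timeout reached: deliver what we have so far
                }
                Message m = consumer.receive(remaining); // standard JMS call
                if (m == null) {
                    break; // receive timed out with no further messages
                }
                batch.add(m);
            }
            return batch;
        }
    }
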
>>
>>> 1b. If the destination is initially empty, and messages are being added to
>>> the destination at 10 messages every second, how many messages are
>>> delivered in each batch?
>>>
>>
>> Same as 1a. If we add support for a timeout it would be up to the
>> user what to do: either flush the current buffer or wait for it to
>> complete with 50 messages.
>>
>>> 2. Do you care whether these messages are consecutive? (if there were more
>>> than one consumer on the same queue then they might not be)
>>>
>>
>> I think we should keep the same semantics that all the providers (that I
>> know of) currently have, i.e. messages are in order but not required to be
>> consecutive (if there is more than one consumer on the queue). We just need
>> to keep the same semantics as we had with a single call to onMessage.
>
> OK. I would agree.
>
>>
>>
>>> 3. I note you say that auto-ack would ack them all in a single batch. Were
>>> you expecting client-ack and local transactions to behave as now (so a call
>>> to acknowledge() or commit() would ack or commit all the messages delivered
>>> by the session, not just those in the batch), or were you thinking of
>>> something different?
>>

If we are sending all acks in a single batch, we might have to add more
clarification regarding duplicate delivery of messages to the consumers
(section 4.4.12). Right now this section says that, since the client is not
in direct control of acks with AUTO_ACK, the last message consumed by the
session could get redelivered (due to failures). With batch processing, the
whole batch could get redelivered. We might have to clarify this.
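
For example, if the whole batch can be redelivered after a failure, a listener
that wants to guard against duplicates has to treat every message in the
batch as potentially redelivered, not only the last one. A rough sketch,
assuming the proposed onMessages callback (the surrounding listener class is
implied) and using only the existing JMSRedelivered and JMSMessageID fields:

    public void onMessages(Message[] messages) {
        for (Message m : messages) {
            try {
                if (m.getJMSRedelivered()) {
                    // The whole batch may be a redelivery, so duplicate
                    // detection (e.g. keyed on m.getJMSMessageID()) must
                    // cover every message in the batch, not just the last.
                }
                // ... process the message ...
            } catch (JMSException e) {
                // handle per-message failure
            }
        }
    }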

>> Same thing. We are just merging several onMessage calls into a single
>> call.
>>
>> That means this for client ACK:
>>
>> onMessage(Message[] messages)
>> {
>>     messages[20].acknowledge(); // it would ack up to the message[20]
>>                                 // + anything already acked on the session.
>> }
>
> Actually, Message.acknowledge() acknowledges "all consumed messages of the
> session". It's irrelevant which Message object is used. (This is yet
> another confusing aspect of the JMS API.)
>
> Nigel
>
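
(For reference, the existing single-message behaviour Nigel describes can be
seen with the current API. This sketch assumes a connection and queue already
exist and a session created with CLIENT_ACKNOWLEDGE:)

    Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
    MessageConsumer consumer = session.createConsumer(queue);
    Message m1 = consumer.receive();
    Message m2 = consumer.receive();
    Message m3 = consumer.receive();
    // Acknowledging via any of the received Message objects acknowledges
    // all messages consumed by the session so far, not just that one:
    m1.acknowledge(); // acks m1, m2 and m3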

Regards,
Emran.