On 12/11/12 10:25 AM, Danny Coward wrote:
> Hi Scott,
>
> OK, I think I understand. So the idea is to allow implementations to
> send messages in a batch in order to get a big performance gain for
> applications that send a lot of messages in a short amount of time and
> to allow an explicit way for developers to take advantage of that, if
> the batching optimization is in the implementation.
Exactly.
>
> And I think with the flush() method, we would have allowed containers
> that choose to do batching to do so under the existing model, without
> the extra setBatching()/setAutoflush() idea?
Only if we always require a flush. We could do that. That's the
equivalent of auto-flush=false always, and since it's how
BufferedOutputStream works, it's an existing programming model.
If a developer forgets a flush, the message might never get sent.
I'm a bit wary of that definition: some implementations won't bother
with buffering, so lazy programmers who forget the flush will find that
their code works anyway, and the spec will eventually be forced back to
auto-flush when those programmers complain about compatibility.
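
For comparison, here's the java.io version of that model (just a
sketch; sock and payload are stand-ins):

    OutputStream out = new BufferedOutputStream(sock.getOutputStream());
    out.write(payload);  // may sit in the buffer indefinitely
    out.flush();         // without this, the bytes may never be sent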
>
> I think that sort of approach already fits under the async model we
> have: the async send operations allow implementations to make their
> own choice about when to send the message after the async send has
> been called, i.e.:
>
> sendString/sendBytes - send the message now (no batching)
> sendStringByFuture() - send the message when the container decides to
> (possibly batching if it chooses to)
That doesn't work, but the reason is a bit complicated (see below).
(Secondarily, the *ByFuture calls are a high-overhead API, which
doesn't work well for high-performance messaging.)
It doesn't work because the *ByFuture and *ByCompletion sends are
single-item queues. You can't batch or queue more than one item with
those APIs. If you look at java.nio.channels.AsynchronousChannel, it
says:

"Some channel implementations may support concurrent reading and
writing, but may not allow more than one read and one write operation
to be outstanding at any given time."
Since it's only a single-item queue, the implementation can't batch the
items -- there's only one item.
And it's a single-item queue because multiple-item queues require more
API methods, like in BlockingQueue, and a longer spec definition to
describe the queue behavior, e.g. what happens when the queue is full or
even what "full" means.
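
Concretely, a one-outstanding-write queue forces the sender into a
lock-step pattern like this (a sketch; remote stands in for a
RemoteEndpoint, and I'm glossing over the Future's exact result type):

    Future<?> pending = remote.sendStringByFuture("msg-1");
    pending.get();  // must complete first: only one write outstanding
    remote.sendStringByFuture("msg-2");
    // The container only ever sees one queued message at a time,
    // so there is nothing for it to coalesce into a batch.
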
-- Scott
>
>
> - Danny
>
>
>
> On 11/29/12 12:11 PM, Scott Ferguson wrote:
>> On 11/29/12 11:34 AM, Danny Coward wrote:
>>> My apologies Scott, I must have missed your original request - I've
>>> logged this as issue 63.
>>
>> Thanks.
>>
>>>
>>> So auto-flush true would require that the implementation never keep
>>> anything in a send buffer, and false would allow it?
>>
>> Not quite. It's more like auto-flush false means "I'm batching
>> messages; don't bother sending if you don't have to." I don't think
>> the wording should be "never", because of things like mux, or other
>> server heuristics. It's more like "start the process of sending."
>>
>> setBatching(true) might be a better name, if that's clearer.
>>
>> When setBatching(false) [autoFlush=true] -- the default -- and an
>> app calls sendString(), the message will be delivered without
>> further intervention from the app (with possible buffering, delays,
>> mux, or other optimizations, depending on the implementation).
>>
>> When setBatching(true) [autoFlush=false], and an app calls
>> sendString(), the message might sit in the buffer forever until the
>> application calls flush().
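>>
>> In code, the proposed model would look like this (method names are
>> from this proposal, not a settled API):
>>
>>     remote.setBatching(true);    // autoFlush=false
>>     remote.sendString("msg-1");  // may sit in the buffer...
>>     remote.sendString("msg-2");  // ...along with this one
>>     remote.flush();              // now the batch must be sent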
>>
>> sendPartialString would be unaffected by the flag; the WS
>> implementation is free to do whatever it wants with partial messages.
>>
>> Basically, it's a hint: setBatching(true) [autoFlush=false] means
>> "I'm batching a bunch of messages, so don't bother sending the data
>> if you don't need to until I call flush."
>>
>> Does that make sense? I don't want to over-constrain implementations
>> with either option. Maybe "batching" is the better name to avoid
>> confusion. (But even batching=true doesn't require buffering.
>> Implementations can still send fragments early if they want, or even
>> ignore batching=true.)
>>>
>>> It seems like a reasonable request - do you think the autoflush
>>> property is a per-peer, per-logical-endpoint, or per-container
>>> setting? I'm wondering if typically developers will want to set
>>> this once per application rather than keep setting it per
>>> RemoteEndpoint.
>>
>> I think it's on the RemoteEndpoint, like setAutoCommit for JDBC. It's
>> easy to set in @WebSocketOpen, and the application might want to
>> start and stop batching mode while processing.
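>>
>> For example (a sketch only -- the annotation and accessor names are
>> from the current draft plus this proposal, none of them final):
>>
>>     @WebSocketEndpoint("/ticker")
>>     public class TickerEndpoint {
>>         @WebSocketOpen
>>         public void onOpen(Session session) {
>>             // batch everything until an explicit flush()
>>             session.getRemote().setBatching(true);
>>         }
>>     }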
>>
>> -- Scott
>>
>>>
>>> - Danny
>>>
>>> On 11/28/12 3:28 PM, Scott Ferguson wrote:
>>>>
>>>> I'd like a setAutoFlush() and flush() on RemoteEndpoint for
>>>> high-performance messaging. Auto-flush defaults to true, which is
>>>> the current behavior.
>>>>
>>>> The performance difference is on the order of 5-7 times as many
>>>> messages in some early micro-benchmarks. It's a big improvement
>>>> and puts us near high-speed messaging systems like ZeroMQ.
>>>
>>
>
>
> --
> Danny Coward <http://www.oracle.com>
> Java EE
> Oracle Corporation
>