[jsr356-experts] Re: RemoteEndpoint setAutoFlush() and flush()

From: Scott Ferguson <ferg_at_caucho.com>
Date: Fri, 14 Dec 2012 19:22:48 -0800

On 12/14/12 5:13 PM, Danny Coward wrote:
> Hi Scott,
>
> On 12/11/12 11:37 AM, Scott Ferguson wrote:
>>
>> It doesn't work because the *ByFuture and *ByCompletion are single
>> item queues. You can't batch or queue more than one item with those
>> APIs. If you look at java.nio.channels.AsynchronousChannel, it says
>>
>> " Some channel implementations may support concurrent reading and
>> writing, but may not allow more than one read and one write operation
>> to be outstanding at any given time."
>>
>> Since it's only a single-item queue, the implementation can't batch
>> the items -- there's only one item.
> Right, but our APIs are only expressed in terms of the data
> objects, not the channels. Does it really lock you into using
> AsynchronousChannel with its one-write-at-a-time rule?

Not directly, but the same issues apply in our case as in theirs, so I'm
pretty sure we want to follow their one-write-at-a-time rule.
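
To make the one-write-at-a-time rule concrete, here is a minimal sketch.
SingleSlotSender, sendByFuture and completePending are hypothetical names
used only for illustration; they are not the JSR 356 API. The NIO
asynchronous channels report the same condition with
WritePendingException; the sketch uses plain IllegalStateException.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a sender that permits only one outstanding async write.
final class SingleSlotSender {
    // Holds the future of the write in progress, or null when idle.
    private final AtomicReference<CompletableFuture<Void>> pending =
            new AtomicReference<>();

    // Starts a send; fails fast if the previous send has not completed yet.
    CompletableFuture<Void> sendByFuture(String message) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        if (!pending.compareAndSet(null, done)) {
            throw new IllegalStateException("a write is already outstanding");
        }
        return done;
    }

    // Called by the (imaginary) I/O layer once the message has been written.
    void completePending() {
        CompletableFuture<Void> done = pending.getAndSet(null);
        if (done != null) {
            done.complete(null);
        }
    }
}
```

With only one slot there is nothing to batch: the second message cannot
even be handed over until the first has completed.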

>
>>
>> And it's a single-item queue because multiple-item queues require
>> more API methods, like in BlockingQueue, and a longer spec definition
>> to describe the queue behavior, e.g. what happens when the queue is
>> full or even what "full" means.
> I'm probably being really stupid, but can't an implementation use a
> BlockingQueue under our APIs, and determine itself based on a
> knowledge of its own implementation environment when to send a batch
> of messages ?

Unfortunately, no. I tried working through that exercise, but it runs
into problems.

1) If the user is allowed to call more than once (i.e. queue messages),
then either the queue size must be infinite, or the "async" call would
eventually block (put), or throw an exception (add), or return a boolean
(offer). None of those are good default options, because none matches
the notion of async.

Basically, an API that acts like a queue needs all three main insertion
styles that BlockingQueue provides (add, offer, put), or must be infinite,
or must select one of those methods as the default behavior. I'm pretty
sure we don't want to do any of those.
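
For reference, this is how the three styles differ on a full
java.util.concurrent.BlockingQueue. The class name FullQueueDemo is made up
for the example, and a timed offer stands in for put(), which would block
indefinitely here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class FullQueueDemo {
    // Describes what each insertion method does once the queue is full.
    static String describe() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.add("first"); // fills the only slot

        // offer(): reports failure through its return value
        boolean offered = queue.offer("second");

        // add(): reports failure by throwing IllegalStateException
        boolean addThrew = false;
        try {
            queue.add("second");
        } catch (IllegalStateException e) {
            addThrew = true;
        }

        // put() would block until space appears; use a bounded wait instead
        boolean timedOffer;
        try {
            timedOffer = queue.offer("second", 10, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            timedOffer = false;
        }

        return "offer=" + offered + " addThrew=" + addThrew
                + " timedOffer=" + timedOffer;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

None of the three outcomes (a false return, an exception, or a blocked
caller) looks like something an "async" send should do by default.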

2) The lifetime of the passed objects gets a bit trickier for the sender,
because it needs to keep track of all the return callbacks for every
item in the queue to know when it can reuse the byte[], ByteBuffer, or
Object. One item is reasonable to keep track of, but multiple items are
probably an unfair burden on the user.
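
As a rough illustration of that burden, here is the kind of bookkeeping an
application would need if several sends could be queued at once.
InFlightBuffers and its methods are hypothetical names, not part of any
proposed API; with a single outstanding send, one flag would do instead.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical sender-side tracking of buffers handed to queued async sends.
final class InFlightBuffers {
    // Identity-based set: ByteBuffer.equals() compares contents, but what
    // matters here is which buffer instance is still owned by a pending send.
    private final Set<ByteBuffer> inFlight =
            Collections.newSetFromMap(new IdentityHashMap<ByteBuffer, Boolean>());

    // Record that the buffer was passed to the (imaginary) async send call.
    synchronized void markSent(ByteBuffer buffer) {
        inFlight.add(buffer);
    }

    // To be invoked from the completion callback for that particular buffer.
    synchronized void markCompleted(ByteBuffer buffer) {
        inFlight.remove(buffer);
    }

    // A buffer is safe to refill only once its own callback has fired.
    synchronized boolean canReuse(ByteBuffer buffer) {
        return !inFlight.contains(buffer);
    }
}
```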

Basically, if queuing is allowed for that call but only partially
specified, the behavior veers off into under-defined territory.

3) And, just to be clear, the implementation doesn't want a timed wait
(like TCP's Nagle algorithm), because timer overhead is high.
> It's a bit tricky imposing a different development model on top of what
> we have, especially because I'll bet there will be some
> implementations that will not support batching.

I don't think I understand. I don't think this would be a different
development model.

Any websocket implementation can always ignore the auto-flush=false and
send the message without delay. That would be perfectly spec-compliant
behavior.

The application code would work in both cases; it would just be faster
in one that supported batching.
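
A small sketch of what I mean, under made-up names: SketchEndpoint,
EagerEndpoint, BatchingEndpoint and BatchingApp are hypothetical stand-ins,
not the RemoteEndpoint API, and the in-memory lists stand in for the wire.
The same application code runs against an implementation that ignores the
hint and one that buffers until flush().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed methods; not the JSR 356 API.
interface SketchEndpoint {
    void setBatching(boolean batching);
    void sendString(String text);
    void flush();
    List<String> delivered(); // what has reached the "wire" so far
}

// Ignores the batching hint and sends every message immediately.
final class EagerEndpoint implements SketchEndpoint {
    private final List<String> wire = new ArrayList<>();
    public void setBatching(boolean batching) { /* hint ignored */ }
    public void sendString(String text) { wire.add(text); }
    public void flush() { /* nothing is ever buffered */ }
    public List<String> delivered() { return wire; }
}

// Honors the hint: holds messages until the application calls flush().
final class BatchingEndpoint implements SketchEndpoint {
    private final List<String> wire = new ArrayList<>();
    private final List<String> buffer = new ArrayList<>();
    private boolean batching;
    public void setBatching(boolean batching) { this.batching = batching; }
    public void sendString(String text) {
        if (batching) {
            buffer.add(text);
        } else {
            wire.add(text);
        }
    }
    public void flush() {
        wire.addAll(buffer);
        buffer.clear();
    }
    public List<String> delivered() { return wire; }
}

// Application code that is identical for both implementations.
final class BatchingApp {
    static List<String> run(SketchEndpoint remote) {
        remote.setBatching(true);
        for (int i = 0; i < 3; i++) {
            remote.sendString("msg-" + i);
        }
        remote.flush();
        return remote.delivered();
    }
}
```

Either way the peer ends up with the same three messages after flush();
only the number of writes underneath differs.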

> I have some ideas on a subtype of RemoteEndpoint which might separate
> out the batching model better than the flags and the flush(), but let's
> see.

Cool! I'm looking forward to it.

-- Scott
>
> I'm flagging this in the spec for v10 because the spec has not
> resolved this yet.
>
> - Danny
>
>>
>> -- Scott
>>
>>
>>
>>
>>>
>>>
>>> - Danny
>>>
>>>
>>>
>>> On 11/29/12 12:11 PM, Scott Ferguson wrote:
>>>> On 11/29/12 11:34 AM, Danny Coward wrote:
>>>>> My apologies Scott, I must have missed your original request -
>>>>> I've logged this as issue 63.
>>>>
>>>> Thanks.
>>>>
>>>>>
>>>>> So auto flush true would require the implementation never keep
>>>>> anything in a send buffer, false would allow it ?
>>>>
>>>> Not quite. It's more like auto-flush false means "I'm batching
>>>> messages; don't bother sending if you don't have to." I don't think
>>>> the wording should be "never", because of things like mux, or other
>>>> server heuristics. It's more like "start the process of sending."
>>>>
>>>> setBatching(true) might be a better name, if that's clearer.
>>>>
>>>> When setBatching(false) [autoFlush=true] -- the default -- and an
>>>> app calls sendString(), the message will be delivered (with
>>>> possible buffering, delays, mux, optimizations, etc., depending on
>>>> the implementation), but it will be delivered without further
>>>> intervention from the app.
>>>>
>>>> When setBatching(true) [autoFlush=false], and an app calls
>>>> sendString(), the message might sit in the buffer forever until the
>>>> application calls flush().
>>>>
>>>> sendPartialString would be unaffected by the flag; the WS
>>>> implementation is free to do whatever it wants with partial messages.
>>>>
>>>> Basically, it's a hint: setBatching(true) [autoFlush=false] means
>>>> "I'm batching a bunch of messages, so don't bother sending the data
>>>> if you don't need to until I call flush."
>>>>
>>>> Does that make sense? I don't want to over-constrain
>>>> implementations with either autoFlush option. Maybe
>>>> "batching" is the better name to avoid confusion. (But even
>>>> batching=true doesn't require buffering. Implementations can still
>>>> send fragments early if they want or even ignore batching=true.)
>>>>>
>>>>> It seems like a reasonable request - do you think the autoflush
>>>>> property is a per-peer setting / per logical endpoint / per
>>>>> container setting ? I'm wondering if typically developers will
>>>>> want to set this once per application rather than keep setting it
>>>>> per RemoteEndpoint.
>>>>
>>>> I think it's on the RemoteEndpoint, like setAutoCommit for JDBC.
>>>> It's easy to set in @WebSocketOpen, and the application might want
>>>> to start and stop batching mode while processing.
>>>>
>>>> -- Scott
>>>>
>>>>>
>>>>> - Danny
>>>>>
>>>>> On 11/28/12 3:28 PM, Scott Ferguson wrote:
>>>>>>
>>>>>> I'd like a setAutoFlush() and flush() on RemoteEndpoint for high
>>>>>> performance messaging. Defaults to true, which is the current
>>>>>> behavior.
>>>>>>
>>>>>> The performance difference is on the order of 5-7 times as many
>>>>>> messages in some early micro-benchmarks. It's a big improvement
>>>>>> and puts us near high-speed messaging systems like ZeroQ.
>>>>>
>>>>>
>>>>> --
>>>>> <http://www.oracle.com> *Danny Coward *
>>>>> Java EE
>>>>> Oracle Corporation
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>