jsr356-experts@websocket-spec.java.net

[jsr356-experts] Re: RemoteEndpoint setAutoFlush() and flush()

From: Scott Ferguson <ferg_at_caucho.com>
Date: Mon, 14 Jan 2013 08:44:21 -0800

On 12/20/12 10:44 AM, Danny Coward wrote:
> Hi Scott,
>
> OK, well, in order to get wider feedback, I've put setAutoBatching()
> and flush() into the RemoteEndpoint API. This gives developers a mode
> to take advantage of batching on containers that support it, while
> still remaining portable on containers that do not support the
> optimisation.

I like it.
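
For concreteness, here's roughly how I'd expect application code to use
it. A rough, untested sketch, assuming the setAutoBatching()/flush()
names above and the draft's sendString():

    void sendBatch(RemoteEndpoint remote, List<String> messages)
            throws IOException {
        remote.setAutoBatching(true);    // hint: sends may be held back
        try {
            for (String msg : messages) {
                remote.sendString(msg);  // may sit in the send buffer
            }
        } finally {
            remote.flush();              // push anything still buffered
        }
    }

On a container that ignores the hint, the same code runs unchanged; it
just doesn't get the batching speedup.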
>
> The refactoring I was thinking about was essentially to introduce
> different kinds of RemoteEndpoint, separating out what are really
> different programming modes for sending. Something like this:-
>
> RemoteEndpoint.Basic - synchronous send of whole messages
> RemoteEndpoint.Async - asynchronous send of whole messages
> RemoteEndpoint.Stream - sending messages using blocking I/O and/or
> partial messages
> RemoteEndpoint.Batch - sending messages that may be batched
>
> and have corresponding methods on Session to get the RemoteEndpoint
> that supports the mode you want to use.

That's a tricky call between one big interface and lots of smaller ones.

I think the current one is easier, because it doesn't have too many
methods (about 20) and because it's easier to scan one javadoc than to
switch modes and scan multiple javadocs. But it's a judgement call.
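
Just to make the comparison concrete, I read the proposal as something
like the following. The method sets here are illustrative guesses, not
the actual diff:

    public interface RemoteEndpoint {
        interface Basic {
            void sendString(String text) throws IOException;  // blocks
        }
        interface Async {
            Future<Void> sendString(String text);   // whole messages
        }
        interface Stream {
            Writer getSendWriter() throws IOException;  // blocking I/O
            void sendPartialString(String fragment, boolean isLast)
                    throws IOException;
        }
        interface Batch {
            void setAutoBatching(boolean batching);
            void flush() throws IOException;
        }
    }

with Session growing one accessor per mode. Each mode's javadoc stays
small, but a reader has four places to look instead of one.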

-- Scott

>
> But, I will think through a fully worked out API diff in the coming weeks.
>
> - Danny
>
>
> On 12/14/12 7:22 PM, Scott Ferguson wrote:
>> On 12/14/12 5:13 PM, Danny Coward wrote:
>>> Hi Scott,
>>>
>>> On 12/11/12 11:37 AM, Scott Ferguson wrote:
>>>>
>>>> It doesn't work because the *ByFuture and *ByCompletion are single
>>>> item queues. You can't batch or queue more than one item with those
>>>> APIs. If you look at java.nio.channels.AsynchronousChannel, it says
>>>>
>>>> " Some channel implementations may support concurrent reading
>>>> and writing, but may not allow more than one read and one write
>>>> operation to be outstanding at any given time."
>>>>
>>>> Since it's only a single-item queue, the implementation can't batch
>>>> the items -- there's only one item.
>>> Right, but our APIs are only expressed in terms of the data
>>> objects, not the channels. Does it really lock you into using
>>> AsynchronousChannel with its one-write-at-a-time rule?
>>
>> Not directly, but the same issues apply in our case as in theirs, so
>> I'm pretty sure we want to follow their one-write-at-a-time rule.
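>>
>> In other words, if an app wants to pipeline sends, it has to chain
>> them itself. A rough sketch of the pattern, assuming the draft's
>> sendStringByCompletion() and a SendHandler callback named onResult():
>>
>>     private final Queue<String> pending = new ArrayDeque<String>();
>>     private boolean writing;
>>
>>     synchronized void send(String msg) {
>>         pending.add(msg);
>>         if (! writing) {       // no write outstanding; start one
>>             writing = true;
>>             writeNext();
>>         }
>>     }
>>
>>     private synchronized void writeNext() {
>>         String msg = pending.poll();
>>         if (msg == null) {     // queue drained; let a new send restart
>>             writing = false;
>>             return;
>>         }
>>         remote.sendStringByCompletion(msg, new SendHandler() {
>>             public void onResult(SendResult result) {
>>                 writeNext();   // prior write done; start the next
>>             }
>>         });
>>     }
>>
>> That keeps at most one write outstanding at a time, which matches the
>> AsynchronousChannel wording quoted above.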
>>
>>>
>>>>
>>>> And it's a single-item queue because multiple-item queues require
>>>> more API methods, like in BlockingQueue, and a longer spec
>>>> definition to describe the queue behavior, e.g. what happens when
>>>> the queue is full or even what "full" means.
>>> I'm probably being really stupid, but can't an implementation use a
>>> BlockingQueue under our APIs, and determine for itself, based on
>>> knowledge of its own implementation environment, when to send a
>>> batch of messages?
>>
>> Unfortunately, no. I tried working through that exercise, but it runs
>> into problems.
>>
>> 1) If the user is allowed to call send more than once (i.e. queue
>> messages), then either the queue size must be infinite, or the
>> "async" call would eventually block (put), throw an exception (add),
>> or return a boolean (offer). None of those are good default options,
>> because none matches the notion of async.
>>
>> Basically, an API that acts like a queue needs all three of the main
>> offer/put method types that BlockingQueue provides, or must be
>> unbounded, or must pick one of those behaviors as the default. I'm
>> pretty sure we don't want to do any of those.
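>>
>> For reference, the three bounded-queue flavors that
>> java.util.concurrent.BlockingQueue provides:
>>
>>     BlockingQueue<String> q = new ArrayBlockingQueue<String>(1024);
>>
>>     q.add(msg);                       // full: throws IllegalStateException
>>     boolean accepted = q.offer(msg);  // full: returns false
>>     q.put(msg);                       // full: blocks (InterruptedException)
>>
>> None of the three is an obviously right default for an "async" send.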
>>
>> 2) The lifetime of the passed objects gets a bit trickier for the
>> sender, because it needs to keep track of the return callback for
>> every item in the queue to know when it can reuse the byte[],
>> ByteBuffer, or Object. One item is reasonable to keep track of, but
>> multiple items are probably an unfair burden on the user.
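>>
>> For example, with the Future flavor, the only safe reuse pattern is
>> to wait for completion before touching the buffer again. A sketch,
>> assuming a sendBytesByFuture(ByteBuffer) along the lines of the
>> draft; fillBuffer() is a made-up app method, and exceptions are
>> elided:
>>
>>     ByteBuffer buf = ByteBuffer.allocate(8192);
>>
>>     fillBuffer(buf);
>>     Future<Void> done = remote.sendBytesByFuture(buf);
>>     done.get();        // buf is off-limits until the send completes
>>     buf.clear();
>>     fillBuffer(buf);   // only now is it safe to reuse
>>
>> With N queued sends, the app would need N buffers or N callbacks to
>> track, which is the burden I mean.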
>>
>> Basically, the behavior veers off into under-defined territory if
>> queues are allowed for that call but only partially specified.
>>
>> And, just to be clear, 3) the implementation doesn't want a timed
>> wait (like TCP's Nagle algorithm), because timer overhead is high.
>>> It's a bit tricky imposing a different development model on top of
>>> what we have, especially because I'll bet there will be some
>>> implementations that will not support batching.
>>
>> I don't think I understand. I don't think this would be a different
>> development model.
>>
>> Any websocket implementation can always ignore the auto-flush=false
>> setting and send the message without delay. That would be perfectly
>> spec-compliant behavior.
>>
>> The application code would work in both cases; it would just be
>> faster on a container that supported batching.
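>>
>> A container that doesn't batch could satisfy the contract with a
>> no-op:
>>
>>     public void setAutoBatching(boolean batching) {
>>         // ignore the hint; every send goes out immediately
>>     }
>>
>>     public void flush() throws IOException {
>>         // nothing is ever buffered, so there's nothing to do
>>     }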
>>
>>> I have some ideas on a subtype of RemoteEndpoint which might
>>> separate out the batching model better than the flags and the
>>> flush(), but let's see.
>>
>> Cool! I'm looking forward to it.
>>
>> -- Scott
>>>
>>> I'm flagging this in the spec for v10 because the spec has not
>>> resolved this yet.
>>>
>>> - Danny
>>>
>>>>
>>>> -- Scott
>>>>
>>>>>
>>>>> - Danny
>>>>>
>>>>> On 11/29/12 12:11 PM, Scott Ferguson wrote:
>>>>>> On 11/29/12 11:34 AM, Danny Coward wrote:
>>>>>>> My apologies Scott, I must have missed your original request -
>>>>>>> I've logged this as issue 63.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>>
>>>>>>> So auto-flush true would require that the implementation never
>>>>>>> keep anything in a send buffer, and false would allow it?
>>>>>>
>>>>>> Not quite. It's more like auto-flush false means "I'm batching
>>>>>> messages; don't bother sending if you don't have to." I don't
>>>>>> think the wording should be "never", because of things like mux
>>>>>> or other server heuristics. Auto-flush true is more like "start
>>>>>> the process of sending."
>>>>>>
>>>>>> setBatching(true) might be a better name, if that's clearer.
>>>>>>
>>>>>> When setBatching(false) [autoFlush=true] -- the default -- and
>>>>>> an app calls sendString(), the message will be delivered
>>>>>> (possibly with buffering, delays, mux, or other optimizations,
>>>>>> depending on the implementation), but it will be delivered
>>>>>> without further intervention from the app.
>>>>>>
>>>>>> When setBatching(true) [autoFlush=false], and an app calls
>>>>>> sendString(), the message might sit in the buffer forever until
>>>>>> the application calls flush().
>>>>>>
>>>>>> sendPartialString would be unaffected by the flag; the WS
>>>>>> implementation is free to do whatever it wants with partial messages.
>>>>>>
>>>>>> Basically, it's a hint: setBatching(true) [autoFlush=false] means
>>>>>> "I'm batching a bunch of messages, so don't bother sending the
>>>>>> data if you don't need to until I call flush."
>>>>>>
>>>>>> Does that make sense? I don't want to over-constrain
>>>>>> implementations with autoFlush(true) either. Maybe "batching" is
>>>>>> the better name to avoid confusion. (But even batching=true
>>>>>> doesn't require buffering. Implementations can still send
>>>>>> fragments early if they want, or even ignore batching=true.)
>>>>>>>
>>>>>>> It seems like a reasonable request - do you think the autoflush
>>>>>>> property is a per-peer, per-logical-endpoint, or per-container
>>>>>>> setting? I'm wondering if typically developers will want to set
>>>>>>> this once per application rather than keep setting it per
>>>>>>> RemoteEndpoint.
>>>>>>
>>>>>> I think it's on the RemoteEndpoint, like setAutoCommit for JDBC.
>>>>>> It's easy to set in @WebSocketOpen, and the application might
>>>>>> want to start and stop batching mode while processing.
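>>>>>>
>>>>>> For example, something like this, assuming the draft's
>>>>>> @WebSocketOpen annotation and a session.getRemote() accessor:
>>>>>>
>>>>>>     @WebSocketOpen
>>>>>>     public void onOpen(Session session) {
>>>>>>         RemoteEndpoint remote = session.getRemote();
>>>>>>         remote.setAutoBatching(true);  // batch from the start;
>>>>>>                                        // can toggle off later
>>>>>>     }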
>>>>>>
>>>>>> -- Scott
>>>>>>
>>>>>>>
>>>>>>> - Danny
>>>>>>>
>>>>>>> On 11/28/12 3:28 PM, Scott Ferguson wrote:
>>>>>>>>
>>>>>>>> I'd like a setAutoFlush() and flush() on RemoteEndpoint for
>>>>>>>> high-performance messaging. It defaults to true, which is the
>>>>>>>> current behavior.
>>>>>>>>
>>>>>>>> The performance difference is on the order of 5-7 times as many
>>>>>>>> messages in some early micro-benchmarks. It's a big improvement
>>>>>>>> and puts us near high-speed messaging systems like ZeroMQ.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> Danny Coward
> Java EE
> Oracle Corporation
> http://www.oracle.com
>