Hi Scott,
OK, well, so that we can get wider feedback, I've put setAutoBatching()
and flush() into the RemoteEndpoint API. This gives developers a mode to
take advantage of batching on containers that support it, while
remaining portable on containers that do not support this batching
optimisation.
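
For example (a sketch against the current draft: I'm assuming the
boolean means "the container may batch", and that Session.getRemote()
hands back the RemoteEndpoint - none of that is final):

    // Portable batch send: a container that doesn't batch may treat
    // setAutoBatching() and flush() as no-ops, and this still works.
    void sendBatch(Session session, List<String> messages)
            throws IOException {
        RemoteEndpoint remote = session.getRemote();
        remote.setAutoBatching(true);    // hint: these may be batched
        try {
            for (String msg : messages) {
                remote.sendString(msg);  // may be held in the send buffer
            }
        } finally {
            remote.flush();              // push out anything still buffered
        }
    }
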
The refactoring I was thinking about would introduce different kinds of
RemoteEndpoint, to separate out what are essentially different
programming modes for sending. Something like this:
RemoteEndpoint.Basic - synchronous send of whole messages
RemoteEndpoint.Async - asynchronous send of whole messages
RemoteEndpoint.Stream - sending messages using blocking I/O and/or
partial messages
RemoteEndpoint.Batch - sending messages that may be batched
and have corresponding methods on Session to get the RemoteEndpoint that
supports the mode you want to use.
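
In code the shape might be roughly this - illustrative names and
signatures only, not a worked-out proposal:

    public interface RemoteEndpoint {

        // Synchronous send of whole messages.
        interface Basic extends RemoteEndpoint {
            void sendString(String text) throws IOException;
        }

        // Asynchronous send of whole messages.
        interface Async extends RemoteEndpoint {
            Future<Void> sendString(String text);
        }

        // Blocking I/O and/or partial messages.
        interface Stream extends RemoteEndpoint {
            Writer getSendWriter() throws IOException;
            void sendPartialString(String fragment, boolean isLast)
                    throws IOException;
        }

        // Sends that may be batched until an explicit flush.
        interface Batch extends RemoteEndpoint {
            void sendString(String text) throws IOException;
            void flush() throws IOException;
        }
    }

    // ...and on Session, something like:
    //   RemoteEndpoint.Basic getBasicRemote();
    //   RemoteEndpoint.Async getAsyncRemote();
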
But I will think through a fully worked-out API diff in the coming weeks.
- Danny
On 12/14/12 7:22 PM, Scott Ferguson wrote:
> On 12/14/12 5:13 PM, Danny Coward wrote:
>> Hi Scott,
>>
>> On 12/11/12 11:37 AM, Scott Ferguson wrote:
>>>
>>> It doesn't work because the *ByFuture and *ByCompletion are single-item
>>> queues. You can't batch or queue more than one item with those
>>> APIs. If you look at java.nio.channels.AsynchronousChannel, it says
>>>
>>> " Some channel implementations may support concurrent reading and
>>> writing, but may not allow more than one read and one write
>>> operation to be outstanding at any given time."
>>>
>>> Since it's only a single-item queue, the implementation can't batch
>>> the items -- there's only one item.
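>>>
>>> For example (a sketch; sendStringByFuture is my shorthand for the
>>> draft's *ByFuture style, and exception handling is elided), the only
>>> safe pattern is one outstanding write at a time:
>>>
>>>     Future<Void> f = remote.sendStringByFuture("a");
>>>     f.get();    // the first write must complete...
>>>     Future<Void> g = remote.sendStringByFuture("b");
>>>     g.get();    // ...before the next one starts
>>>     // At most one message is ever queued, so the implementation
>>>     // never has two messages it could coalesce into one batch.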
>> Right, but our APIs are only expressed in terms of the data
>> objects, not the channels. Does it really lock you into using
>> AsynchronousChannel with its one-write-at-a-time rule?
>
> Not directly, but the same issues apply in our case as theirs, so I'm
> pretty sure we want to follow their one-write-at-a-time.
>
>>
>>>
>>> And it's a single-item queue because multiple-item queues require
>>> more API methods, like in BlockingQueue, and a longer spec
>>> definition to describe the queue behavior, e.g. what happens when
>>> the queue is full or even what "full" means.
>> I'm probably being really stupid, but can't an implementation use a
>> BlockingQueue under our APIs, and decide for itself, based on
>> knowledge of its own implementation environment, when to send a batch
>> of messages?
>
> Unfortunately, no. I tried working through that exercise, but it runs
> into problems.
>
> 1) If the user is allowed to call send more than once (i.e. queue
> messages), then either the queue size must be infinite, or the "async"
> call must eventually block (put), throw an exception (add), or return
> a boolean (offer). None of those is a good default, because none
> matches the notion of async.
>
> Basically, an API that acts like a queue needs all three of the main
> put/add/offer method types that BlockingQueue provides, or must be
> unbounded, or must pick one of the methods as the default behavior.
> I'm pretty sure we don't want to do any of those.
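>
> For reference, that's exactly the trichotomy a bounded BlockingQueue
> exposes (plain java.util.concurrent, nothing websocket-specific):
>
>     // Three alternative enqueue styles for the same bounded queue:
>     void enqueue(BlockingQueue<String> q, String msg)
>             throws InterruptedException {
>         q.put(msg);                     // blocks the caller when full
>         q.add(msg);                     // throws IllegalStateException when full
>         boolean queued = q.offer(msg);  // returns false when full
>     }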
>
> 2) The lifetime of the passed objects gets trickier for the sender,
> because it needs to keep track of the completion callback for every
> item in the queue to know when it can reuse the byte[], ByteBuffer,
> or Object. One item is reasonable to keep track of, but multiple items
> are probably an unfair burden on the user.
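>
> For instance, with just two queued binary sends (a sketch; the
> sendBytesByCompletion/SendHandler names are stand-ins for the draft's
> *ByCompletion style), the app already needs per-buffer bookkeeping:
>
>     final CountDownLatch done1 = new CountDownLatch(1);
>     final CountDownLatch done2 = new CountDownLatch(1);
>     remote.sendBytesByCompletion(buf1, new SendHandler() {
>         public void setResult(SendResult result) { done1.countDown(); }
>     });
>     remote.sendBytesByCompletion(buf2, new SendHandler() {
>         public void setResult(SendResult result) { done2.countDown(); }
>     });
>     done1.await();   // only now is buf1 safe to reuse
>     done2.await();   // likewise buf2; N queued items means N of these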
>
> Basically, if queues are allowed for that call but only partially
> specified, the behavior veers off into under-defined problems.
>
> And, just to be clear, 3) the implementation doesn't want a timed wait
> (like TCP's Nagle algorithm), because timer overhead is high.
>> It's a bit tricky imposing a different development model on top of
>> what we have, especially because I'll bet there will be some
>> implementations that do not support batching.
>
> I don't think I understand. I don't think this would be a different
> development model.
>
> Any websocket implementation can always ignore the auto-flush=false
> and send the message without delay. That would be perfectly
> spec-compliant behavior.
>
> The application code would work in both cases; it would just be faster
> on an implementation that supports batching.
>
>> I have some ideas for a subtype of RemoteEndpoint which might separate
>> out the batching model better than the flags and the flush(), but
>> let's see.
>
> Cool! I'm looking forward to it.
>
> -- Scott
>>
>> I'm flagging this in the spec for v10 because the spec has not
>> resolved this yet.
>>
>> - Danny
>>
>>>
>>> -- Scott
>>>
>>>> - Danny
>>>>
>>>> On 11/29/12 12:11 PM, Scott Ferguson wrote:
>>>>> On 11/29/12 11:34 AM, Danny Coward wrote:
>>>>>> My apologies, Scott, I must have missed your original request -
>>>>>> I've logged this as issue 63.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>>
>>>>>> So auto-flush true would require that the implementation never keep
>>>>>> anything in a send buffer, and false would allow it?
>>>>>
>>>>> Not quite. It's more like auto-flush false means "I'm batching
>>>>> messages; don't bother sending if you don't have to." I don't
>>>>> think the wording should be "never", because of things like mux,
>>>>> or other server heuristics. It's more like "start the process of
>>>>> sending."
>>>>>
>>>>> setBatching(true) might be a better name, if that's clearer.
>>>>>
>>>>> When setBatching(false) [autoFlush=true] -- the default -- and an
>>>>> app calls sendString(), the message will be delivered (with
>>>>> possible buffering, delays, mux, optimizations, etc., depending on
>>>>> the implementation), but it will be delivered without further
>>>>> intervention from the app.
>>>>>
>>>>> When setBatching(true) [autoFlush=false], and an app calls
>>>>> sendString(), the message might sit in the buffer forever until
>>>>> the application calls flush().
>>>>>
>>>>> sendPartialString would be unaffected by the flag; the WS
>>>>> implementation is free to do whatever it wants with partial messages.
>>>>>
>>>>> Basically, it's a hint: setBatching(true) [autoFlush=false] means
>>>>> "I'm batching a bunch of messages, so don't bother sending the
>>>>> data if you don't need to until I call flush."
>>>>>
>>>>> Does that make sense? I don't want to over-constrain
>>>>> implementations with the autoFlush(true) option either. Maybe
>>>>> "batching" is the better name, to avoid confusion. (But even
>>>>> batching=true doesn't require buffering. Implementations can still
>>>>> send fragments early if they want, or even ignore batching=true.)
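>>>>>
>>>>> Concretely, using the setBatching name (sketch only):
>>>>>
>>>>>     remote.setBatching(true);   // hint: don't bother sending yet
>>>>>     remote.sendString("a");     // may sit in the buffer indefinitely
>>>>>     remote.sendString("b");
>>>>>     remote.flush();             // now both are guaranteed on their way
>>>>>
>>>>>     remote.setBatching(false);  // the default
>>>>>     remote.sendString("c");     // delivered with no further app action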
>>>>>>
>>>>>> It seems like a reasonable request - do you think the autoflush
>>>>>> property is a per-peer setting, a per-logical-endpoint setting, or
>>>>>> a per-container setting? I'm wondering if typically developers will
>>>>>> want to set this once per application rather than keep setting it
>>>>>> per RemoteEndpoint.
>>>>>
>>>>> I think it's on the RemoteEndpoint, like setAutoCommit for JDBC.
>>>>> It's easy to set in @WebSocketOpen, and the application might want
>>>>> to start and stop batching mode while processing.
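>>>>>
>>>>> Something like this (sketch, assuming the draft @WebSocketOpen
>>>>> annotation and a Session.getRemote() accessor):
>>>>>
>>>>>     @WebSocketOpen
>>>>>     public void onOpen(Session session) {
>>>>>         // per connection, like connection.setAutoCommit(false) in JDBC
>>>>>         session.getRemote().setBatching(true);
>>>>>     }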
>>>>>
>>>>> -- Scott
>>>>>
>>>>>>
>>>>>> - Danny
>>>>>>
>>>>>> On 11/28/12 3:28 PM, Scott Ferguson wrote:
>>>>>>>
>>>>>>> I'd like a setAutoFlush() and flush() on RemoteEndpoint for
>>>>>>> high-performance messaging. It would default to true, which is
>>>>>>> the current behavior.
>>>>>>>
>>>>>>> The performance difference is on the order of 5-7 times as many
>>>>>>> messages in some early micro-benchmarks. It's a big improvement
>>>>>>> and puts us near high-speed messaging systems like ZeroMQ.
--
Danny Coward
Java EE
Oracle Corporation
http://www.oracle.com