[jsr356-experts] Re: Streaming API: was For Review: v002 API and example code

From: Scott Ferguson <ferg_at_caucho.com>
Date: Tue, 10 Jul 2012 09:02:15 -0700

On 07/10/2012 07:21 AM, Justin Lee wrote:
> I agree with Greg here. The advantage of websockets is their async
> nature, which lends itself well to a callback-style interface.
> Trying to shoehorn a blocking API on top of it simply because it's
> what people are used to, or because it's how the servlet API works, is
> not a compelling argument for me at all. Websockets (and the increasingly
> popular SPDY) are fundamentally different ways to communicate and this
> API should reflect that.

The message start is async, as is the nature of websockets. Reading a
message is not intrinsically async.

It is important to distinguish between the two concepts.

The onMessage() callback and the startMessage() calls are fundamentally
different ways to communicate, even when that onMessage() callback
and startMessage() then use blocking calls to encode/decode the message
contents.
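
To make the distinction concrete, here is a minimal sketch. It assumes a hypothetical StreamingListener interface and a thread pool standing in for the container; neither name comes from the draft API. The message start arrives as an async callback, and the body is then read with ordinary blocking calls:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical listener, loosely modelled on onMessageAsInputStream(InputStream)
// from the quoted discussion. It is an assumption, not a proposed signature.
interface StreamingListener {
    void onMessage(InputStream in) throws IOException;
}

class AsyncStartBlockingRead {
    // Reads the message body with plain blocking calls until end of message.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Stand-in for the container: the start of a message is delivered
    // asynchronously, on a pool thread, as a callback.
    static Future<Object> dispatch(ExecutorService pool, byte[] payload,
                                   StreamingListener listener) {
        return pool.submit(() -> {
            listener.onMessage(new ByteArrayInputStream(payload));
            return null;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        StringBuilder seen = new StringBuilder();
        // Async start (callback), blocking read inside the callback.
        dispatch(pool, "{\"a\":1}".getBytes(StandardCharsets.UTF_8),
                 in -> seen.append(readAll(in))).get();
        pool.shutdown();
        System.out.println(seen);
    }
}
```

The callback is where the async nature lives. What happens inside it can be a blocking decode without changing that.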

-- Scott

>
> With regards to fragments, in Grizzly I had the onFragment() callback
> (partially stolen from Jetty, iirc) that would pass not only the
> byte[] but also a flag indicating if it was the last of the data.
> isLast() feels like an API wart to me, personally.
>
> I'm not sure the Encoder stuff is entirely necessary. Perhaps it'd be
> convenient to provide hooks, but I'd rather not codify one encoding
> scheme over another.
>
> The API overall feels slightly awkward to me. I need to sit down and
> stare at the examples, though, to get some concrete comments.
>
> On Tue, Jul 3, 2012 at 9:59 AM, Greg Wilkins <gregw_at_intalio.com> wrote:
>
> On 30 June 2012 02:19, Scott Ferguson <ferg_at_caucho.com> wrote:
> > On 06/29/2012 04:03 PM, Danny Coward wrote:
> >
> > Hi Scott, all,
> >
> > Thanks for the feedback, pls see below:-
> >
> >
> >
> > OK. So we can certainly add streaming to process messages.
> >
> > But, are you suggesting using blocking Java i/o streams, like the servlet
> > api, to represent that ? Something similar or equivalent to:-
> >
> > WebSocketListener -> public void onMessageAsInputStream(InputStream is)
> > RemoteEndpoint -> public OutputStream startSendByOutputStream() ?
> >
> >
> > Yes. (Also with the equivalent Reader/Writer of course.)
> >
> > It's necessary for the core use-case of serializing JSON, XML, proto-buf,
> > hessian, etc, messages, as well as for custom protocols like STOMP over
> > web-socket, or rewriting ZeroMQ over websocket. For example XMPP over
> > WebSocket is already a draft proposal.
> >
> > Streams are also better for high-performance and for very large messages.
>
> I'm generally stream sceptical. We are doing websockets so we can
> scale better and going for a blocking API just does not feel like the
> right thing to do to achieve that.
> However, the use-cases Scott has outlined are compelling - for
> compatibility with existing libraries having a stream is necessary to
> avoid buffering entire messages.
>
> However, that is not to say that we cannot provide a primary websocket
> API that is not stream based and supports non-blocking partial message
> writes - and then on top of that we can layer a blocking output
> stream. Ditto for receive and input.
>
>
> > And possibly exploring the type of additions that would allow non-blocking
> > IO based on the traditional i/o streams as well, such as the servlet expert
> > group has been looking at for servlet 3.1 ?
> >
> >
> > Possibly. I'd personally prefer nailing down the core blocking API first
> > before discussing the more complicated non-blocking APIs. I'd like to match
> > the servlet model as much as possible, while avoiding javax.servlet
> > dependencies.
>
> I see it the other way. Doing non-blocking APIs is hard and harder if
> you have to be compatible with a blocking API that was conceived
> without consideration for non-blocking (see the mess in servlet 3.1
> JSR now!).
> On the other hand, if we have a good non-blocking API, creating
> blocking derivatives is normally trivial (e.g. providing stream
> wrappers).
>
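
As an illustration of the "stream wrapper" idea, here is a minimal sketch of a blocking OutputStream layered over a partial-message send. PartialSender is a hypothetical interface modelled on the send(byte[] fragment, boolean isLast) signature quoted in this thread. The class name and buffer size are assumptions too, and a real non-blocking sender would also need a completion signal, which this sketch leaves out:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical partial-message sender (assumption, not a draft API type).
interface PartialSender {
    void send(byte[] part, boolean isLast) throws IOException;
}

// Blocking stream that forwards buffered bytes as partial messages and
// marks the final part when the stream is closed.
class MessageOutputStream extends OutputStream {
    private final PartialSender sender;
    private final byte[] buf;
    private int count;
    private boolean closed;

    MessageOutputStream(PartialSender sender, int bufferSize) {
        this.sender = sender;
        this.buf = new byte[bufferSize]; // bufferSize is assumed to be > 0
    }

    @Override
    public void write(int b) throws IOException {
        if (closed) {
            throw new IOException("message already completed");
        }
        if (count == buf.length) {
            flushPart(false); // buffer full: hand off a non-final part
        }
        buf[count++] = (byte) b;
    }

    private void flushPart(boolean isLast) throws IOException {
        sender.send(Arrays.copyOf(buf, count), isLast);
        count = 0;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        flushPart(true); // whatever is left becomes the final part
    }
}

class MessageOutputStreamDemo {
    public static void main(String[] args) throws IOException {
        PartialSender printer = (part, isLast) -> System.out.println(
                new String(part, StandardCharsets.UTF_8) + " last=" + isLast);
        try (OutputStream out = new MessageOutputStream(printer, 4)) {
            out.write("hello world".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

With a 4-byte buffer, the demo hands "hello world" to the sender as three parts ("hell", "o wo", "rld"), and only the last is flagged as final.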
>
> > Although at least one of the APIs allows this, most APIs seem to favor a
> > type of asynchronous processing same as or equivalent to:-
> >
> > WebSocketListener-> public void onMessage(byte[] fragment, boolean isLast)
> > RemoteEndpoint-> public void send(byte[] fragment, boolean isLast)
> >
> > What are people's thoughts on standardizing this kind of chunking API ?
>
> Well firstly we'd have to standardize the language - chunk is kind of
> taken by HTTP and fragment is kind of associated with frame.
>
> I'm cool with having an API that sends and receives partial messages,
> but we have to be clear that those partial messages are unrelated to
> (or at least decoupled from) websocket frames. I.e. a single websocket
> frame might be delivered in multiple calls to onMessage(byte[]
> fragment, boolean isLast), or multiple frames might be delivered in a
> single call. Further a call to send(byte[] fragment, boolean isLast)
> may result in 1 or more frames being sent.
>
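
A minimal sketch of that decoupling on the send side, using a deliberately simplified frame model (payload plus a FIN flag; opcodes, masking and headers are omitted). Frame, Framer and maxPayload are illustrative names, not part of any API discussed here:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a websocket frame: payload bytes plus FIN flag.
class Frame {
    final byte[] payload;
    final boolean fin;

    Frame(byte[] payload, boolean fin) {
        this.payload = payload;
        this.fin = fin;
    }
}

class Framer {
    // Splits one partial message into one or more frames of at most
    // maxPayload bytes (assumed > 0). FIN is set only on the last frame
    // of the part that the application marked isLast.
    static List<Frame> toFrames(byte[] part, boolean isLast, int maxPayload) {
        List<Frame> frames = new ArrayList<>();
        int off = 0;
        do {
            int len = Math.min(maxPayload, part.length - off);
            boolean endOfPart = off + len == part.length;
            frames.add(new Frame(Arrays.copyOfRange(part, off, off + len),
                                 isLast && endOfPart));
            off += len;
        } while (off < part.length);
        return frames;
    }

    public static void main(String[] args) {
        // One application-level send of 10 bytes, 4-byte frame limit.
        for (Frame f : toFrames(new byte[10], true, 4)) {
            System.out.println(f.payload.length + " bytes, fin=" + f.fin);
        }
    }
}
```

The application only ever sees parts and isLast. How many frames a part becomes is an implementation decision.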
>
> [ yes I realise that I'm mixing up responses to Danny and Scott in the
> same message - but my mail client is confused ]
>
> > Well, let me unpack this because there are several intertwined issues that
> > should be separated:
> >
> > 1) The WebSocket frame/fragment is not an application-visible concept
> > (excluding extensions for the moment).
>
> +1
>
> > 2) Even for extensions, the IETF WebSocket Multiplexing group is finding
> > that frame-based extensions are a problem, because they interact in
> > difficult ways. There's a suggestion on one of their threads that perhaps
> > only the mux extension itself even be aware at all about frames, and all
> > other extensions work on messages.
>
> Indeed - messing with the frame headers can cause all sorts of grief.
> I think most extensions should be restricted to mutating payloads
> (including fragmenting them).
>
>
> > You could have "buffer" as in write(buffer, offset, length, isLast) just to
> > make it absolutely clear that the sending buffer has nothing to do with
> > websocket frames, but...
> >
> > 3) async sends are a problem because they imply queuing or at least large
> > buffering, and raise the question of who owns the buffer, memory allocation,
> > if there are extra copies just to handle the async, etc. These aren't
> > appropriate for the low level API. (A higher async messaging/queuing on top
> > of the low level API might be fine, but not the lowest level.)
>
> I think NIO.2 got async sends pretty well right. Send the passed
> buffers and call me back when the send is completed. Those buffers do
> not need to be copied and internally an implementation can slice off
> frame by frame from a huge passed buffer and control its own memory
> footprint.
>
> NIO.2 got async reads a bit wrong as you need to allocate buffers
> when you schedule the read. But in this context we are calling back
> with the message (or partial message), so that is less of a problem.
>
>
> > 4) the "isLast" kind of API (assuming blocking) is functionally equivalent
> > to a stream, but someone would need to write a stream wrapper if they're
> > serializing xml, json, etc. Which isn't really something that the API should
> > force on an application.
>
> I agree that isLast is not a nice API and does not entirely solve
> the problem.
>
>
> > 5) async/isLast receives are truly messy if you're deserializing xml, json,
> > etc, because you either need to buffer the entire message (!) before
> > starting to deserialize, or create a complicated threaded producer/consumer
> > model to create a stream wrapper. Again, this is fairly brutal to require of
> > an application.
>
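
For a sense of what the producer/consumer wrapper in point 5 involves, here is a minimal sketch built on java.io piped streams. FragmentPipe is a hypothetical name, and its onMessage(byte[], boolean) mirrors the callback signature quoted earlier rather than any agreed API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Bridges an async fragment callback to a blocking InputStream.
class FragmentPipe {
    private final PipedOutputStream sink = new PipedOutputStream();
    private final PipedInputStream source;

    FragmentPipe(int pipeSize) throws IOException {
        source = new PipedInputStream(sink, pipeSize);
    }

    // Called from the receive callback. Note that write() blocks when the
    // pipe is full, so the "async" callback can stall on a slow consumer.
    void onMessage(byte[] fragment, boolean isLast) throws IOException {
        sink.write(fragment);
        sink.flush(); // wake a reader that is waiting for data
        if (isLast) {
            sink.close(); // reader sees end of stream
        }
    }

    InputStream stream() {
        return source;
    }

    // Blocking consumer, standing in for an XML or JSON deserializer.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[16];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        FragmentPipe pipe = new FragmentPipe(8);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // The consumer must run on its own thread, or the pipe deadlocks.
        Future<String> result = pool.submit(() -> drain(pipe.stream()));
        pipe.onMessage("<msg>".getBytes(StandardCharsets.UTF_8), false);
        pipe.onMessage("body</msg>".getBytes(StandardCharsets.UTF_8), true);
        System.out.println(result.get());
        pool.shutdown();
    }
}
```

Even this small version needs a second thread, a bounded pipe, and care about which side blocks. That is the burden an isLast-only API would push onto each application.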
> +1 Such applications should not attempt async serialization and
> should have a stream abstraction at their disposal.
> But that does not mean that we cannot expose an async API below the
> stream abstraction.
>
>
> > Although the base layer shouldn't be an async/chunked API, I'm all for an
> > async/messaging layer written on top of the simple blocking streaming layer,
> > if it's a general solution useful for a large class of applications.
>
> I'm the other way around. Let's get async sorted and then
> blocking is easy.
> Async will be hard and ugly no matter what, but less so if we don't
> have pre-existing blocking API to work around.
>
>
> --
> Greg Wilkins <gregw_at_intalio.com>
> www.webtide.com
> Developer advice, services and support
> from the Jetty & CometD experts.
>
>
>
>
> --
> You can find me on the net at:
> http://antwerkz.com http://antwerkz.com/+
> http://antwerkz.com/twitter http://antwerkz.com/github
>