[jsr356-users] [jsr356-experts] Re: Streaming API: was For Review: v002 API and example code

From: Remy Maucherat <rmaucher_at_redhat.com>
Date: Mon, 02 Jul 2012 14:35:55 +0200

On Fri, 2012-06-29 at 17:19 -0700, Scott Ferguson wrote:
> > And possibly exploring the type of additions that would allow
> > non-blocking IO based on the traditional i/o streams as well, such
> > as the servlet expert group has been looking at for servlet 3.1 ?
>
> Possibly. I'd personally prefer nailing down the core blocking API
> first before discussing the more complicated non-blocking APIs. I'd
> like to match the servlet model as much as possible, while avoiding
> javax.servlet dependencies.
>
> >
> >
> > Although at least one of the APIs allows this, most APIs seem to
> > favor a type of asynchronous processing same as or equivalent to:-
> >
> > WebSocketListener-> public void onMessage(byte[] fragment, boolean
> > isLast)
> > RemoteEndpoint-> public void send(byte[] fragment, boolean isLast)
> >
> > What are people's thoughts on standardizing this kind of chunking
> > API ?
>
> Well, let me unpack this because there are several intertwined issues
> that should be separated:
>
> 1) The WebSocket frame/fragment is not an application-visible concept
> (excluding extensions for the moment). The application-visible concept
> in WebSocket is the message. In fact, the early IETF drafts only had
> messages and no frames.
>
> The fragment is supposed to be like TCP/IP frames. Although they
> exist, applications can't use them or even really be aware of the
> boundaries. It's for WebSocket protocol implementations and proxies to
> split/join fragments as needed (and specifically for mux, which is a
> core websocket implementation extension, not a user extension.)

Ok, we can validate that and not try to specify anything related to
frames in the JSR. I'm a bit worried about taking away such a big part
of the specification, though, even if this is supposedly what was
originally intended.

We probably all know that once a feature is given to users, it will
inevitably be abused in very creative ways.

> 2) Even for extensions, the IETF WebSocket Multiplexing group is
> finding that frame-based extensions are a problem, because they
> interact in difficult ways. There's a suggestion on one of their
> threads that perhaps only the mux extension itself even be aware at
> all about frames, and all other extensions work on messages.
>
> So "fragment" is wrong for the application, and probably even
> extensions.
>
> You could have "buffer" as in write(buffer, offset, length, isLast)
> just to make it absolutely clear that the sending buffer has nothing
> to do with websocket frames, but...
>
> 3) async sends are a problem because they imply queuing or at least
> large buffering, and raise the question of who owns the buffer, memory
> allocation, if there are extra copies just to handle the async, etc.
> These aren't appropriate for the low level API. (A higher async
> messaging/queuing on top of the low level API might be fine, but not
> the lowest level.)
>
> 4) the "isLast" kind of API (assuming blocking) is functionally
> equivalent to a stream, but someone would need to write a stream
> wrapper if they're serializing xml, json, etc. Which isn't really
> something that the API should force on an application.
>
> 5) async/isLast receives are truly messy if you're deserializing xml,
> json, etc, because you either need to buffer the entire message (!)
> before starting to deserialize, or create a complicated threaded
> producer/consumer model to create a stream wrapper. Again, this is
> fairly brutal to require of an application.
>
> 6) interaction with the multiplexing layer. There's a fairly good
> chance that the multiplexing extension will be approved. If so, the
> core messaging API should continue to work with mux exactly as-is
> without any application changes.
>
> Mux itself will need to refragment/block/buffer as necessary. So the
> application's send buffers won't necessarily have any relation to the
> actual frames once mux is done working with them.
>
> ... So basically, the async/chunk style APIs are a problem.
>
> Although the base layer shouldn't be an async/chunked API, I'm all for
> an async/messaging layer written on top of the simple blocking
> streaming layer, if it's a general solution useful for a large class
> of applications.

To keep the comparison with Servlet 3.1: the current draft uses regular
stream-based blocking IO, with optional flags and listeners to allow
async IO if the application can take advantage of it. So async is not
mandatory; the application can use the simpler, less scalable blocking
IO.
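
As a rough illustration of that "blocking by default, async by opt-in"
shape, here is a minimal Java sketch. Every name in it (MessageInput,
DataListener, setDataListener) is hypothetical and invented for this
example; it is not the Servlet 3.1 draft API and not a proposal for this
JSR. The "container" is faked with an in-memory stream and calls the
listener synchronously.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class OptionalAsyncSketch {

    /** Hypothetical callback interface; not taken from any spec draft. */
    interface DataListener {
        void onDataAvailable(InputStream in) throws IOException;
        void onAllDataRead();
    }

    /** Hypothetical message source: a plain stream plus an opt-in listener. */
    static class MessageInput {
        private final InputStream in;

        MessageInput(byte[] payload) {
            this.in = new ByteArrayInputStream(payload);
        }

        /** Simple path: the application reads the stream and blocks as needed. */
        InputStream getInputStream() {
            return in;
        }

        /**
         * Opt-in path. A real container would invoke the listener from its
         * IO threads; this demo simply calls it synchronously.
         */
        void setDataListener(DataListener listener) throws IOException {
            listener.onDataAvailable(in);
            listener.onAllDataRead();
        }
    }

    /** Reads the whole message with ordinary blocking reads (ASCII only). */
    static String readBlocking(String payload) {
        MessageInput msg = new MessageInput(payload.getBytes(StandardCharsets.US_ASCII));
        StringBuilder sb = new StringBuilder();
        try {
            InputStream in = msg.getInputStream();
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Reads the same message through the optional listener (ASCII only). */
    static String readWithListener(String payload) {
        MessageInput msg = new MessageInput(payload.getBytes(StandardCharsets.US_ASCII));
        final StringBuilder sb = new StringBuilder();
        try {
            msg.setDataListener(new DataListener() {
                @Override
                public void onDataAvailable(InputStream in) throws IOException {
                    // Only consume what can be read without blocking.
                    while (in.available() > 0) {
                        sb.append((char) in.read());
                    }
                }

                @Override
                public void onAllDataRead() {
                    sb.append("<end>");
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(readBlocking("hello"));      // prints "hello"
        System.out.println(readWithListener("hello"));  // prints "hello<end>"
    }
}
```

The point of the sketch is only that both paths sit on the same stream
abstraction: an application that never registers a listener sees nothing
but ordinary blocking IO.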

I would expect frameworks to take advantage of this WS JSR; I don't
believe too many people will write directly against it. So even if it
sounds far-fetched that an individual user would take advantage of async
in most cases, frameworks have a lot more reason to spend time on the
additional plumbing to get better scalability.
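
To give an idea of the kind of plumbing involved, here is a minimal Java
sketch of a stream wrapper over a chunked send. The ChunkSink interface
is hypothetical; it only mirrors the send(byte[] fragment, boolean
isLast) signature quoted earlier in this thread, and the chunk size is
an arbitrary application buffer size with no relation to WebSocket
frames.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkedSendStreamSketch {

    /** Hypothetical sink mirroring the send(fragment, isLast) shape from the thread. */
    interface ChunkSink {
        void send(byte[] fragment, boolean isLast) throws IOException;
    }

    /** OutputStream that turns writes into send(chunk, isLast) calls. */
    static class ChunkOutputStream extends OutputStream {
        private final ChunkSink sink;
        private final byte[] buf;
        private int count;
        private boolean closed;

        ChunkOutputStream(ChunkSink sink, int chunkSize) {
            if (chunkSize <= 0) {
                throw new IllegalArgumentException("chunkSize must be positive");
            }
            this.sink = sink;
            this.buf = new byte[chunkSize];
        }

        @Override
        public void write(int b) throws IOException {
            if (closed) {
                throw new IOException("stream closed");
            }
            // Send a full buffer only when more data arrives, so the final
            // chunk is always the one that carries isLast = true.
            if (count == buf.length) {
                sink.send(Arrays.copyOf(buf, count), false);
                count = 0;
            }
            buf[count++] = (byte) b;
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                return;
            }
            closed = true;
            // Whatever is left (possibly nothing) ends the message.
            sink.send(Arrays.copyOf(buf, count), true);
            count = 0;
        }
    }

    /** Demo helper: writes text through the wrapper and records each send call. */
    static List<String> sendInChunks(String text, int chunkSize) {
        final List<String> calls = new ArrayList<String>();
        ChunkSink recorder = new ChunkSink() {
            @Override
            public void send(byte[] fragment, boolean isLast) {
                calls.add(new String(fragment, StandardCharsets.US_ASCII) + ":" + isLast);
            }
        };
        try (OutputStream out = new ChunkOutputStream(recorder, chunkSize)) {
            out.write(text.getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(sendInChunks("hello", 2)); // prints [he:false, ll:false, o:true]
    }
}
```

A serializer (XML, JSON, etc.) can then write to the OutputStream
without knowing anything about isLast, which is the sort of adapter a
framework would write once rather than every application.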

-- 
Remy Maucherat <rmaucher_at_redhat.com>
Red Hat Inc