[jsr340-experts] Re: Proposal for WebSocket to be part of JSR 340

From: Remy Maucherat <rmaucher_at_redhat.com>
Date: Thu, 06 Oct 2011 00:39:02 +0200

On Wed, 2011-10-05 at 08:34 -0700, Rick Hightower wrote:
> - Websocket is labelled as the next big thing. To be honest, I think it
> has serious issues, especially the unlimited frame size. It means no
> corruption detection for missing data, and the API becomes exponentially
> more complex.
>
>
> *** I don't agree, but sort of a moot point if we are going to add
> support.

Ok. The reality, however, is that if frames were limited in size, a
buffer based API could be used, which is a lot easier to scale and a lot
easier to use than a stream API.

A stream API with blocking IO (InputStream and OutputStream are blocking
IO) may block while waiting for more data for the current frame on
input, or while the client is backlogged on output. To solve this, extra
APIs are needed, like we are trying to add for the stream APIs in
Servlet 3.1.
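
As a rough illustration of the buffer based style (a sketch only:
FrameListener and BoundedFrameDispatcher are made-up names, not part of
any proposal, and the size limit is an assumption the current protocol
does not give us), the container could hand over each complete frame in
one call, so nothing ever blocks waiting for the rest of a frame:

```java
import java.nio.ByteBuffer;

// Hypothetical callback: the container delivers one complete frame per call.
interface FrameListener {
    void onFrame(ByteBuffer payload);
}

// Hypothetical dispatcher. It can only exist if frames have a known upper
// bound, because the whole frame must fit in a buffer before delivery.
class BoundedFrameDispatcher {
    private final int maxFrameSize;
    private final FrameListener listener;

    BoundedFrameDispatcher(int maxFrameSize, FrameListener listener) {
        this.maxFrameSize = maxFrameSize;
        this.listener = listener;
    }

    // Returns false, without calling the listener, when the frame is over
    // the limit; otherwise delivers a read-only view of the payload.
    boolean dispatch(byte[] frame) {
        if (frame.length > maxFrameSize) {
            return false;
        }
        listener.onFrame(ByteBuffer.wrap(frame).asReadOnlyBuffer());
        return true;
    }
}
```

With unlimited frame sizes this shape is not possible, which is why the
stream API and its extra non blocking machinery come in.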

Another problem that adds complexity at the lower layer: the text (UTF-8)
frame. It is a curious feature, but mixing text and binary frames is as
close to useless as you can get. The higher level protocol built on top
could have dealt with the encoding, I believe.

> *** My understanding is.... in order to stop frame corruption, you
> can't really have two threads writing to the same client at the same
> time. At some level something has to coordinate threads writing to the
> same client. This is not a forever block, but a block used to
> coordinate message sending. Just imagine two clients trying to write
> to the same TCP/IP connection. You can't have half a message from one
> thread intermixed with half a message from a second thread, intermixed
> with a third and so on.

Syncing is expensive, and there's no reason for it to be automatic: that
would hurt performance for nothing. If the user needs it because he
somehow thinks writing from multiple threads is such a great plan,
syncing on the context object is trivial. The same design is used in
Servlet 3 async (sync if needed to avoid corruption).
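
To show how little the user has to do, here is a minimal sketch.
ConnectionContext and MultiThreadedSender are hypothetical names, and the
"wire" is just a list standing in for the connection; the only point is
that write() itself takes no lock and the application adds one when it
really wants several writer threads:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-connection context. write() takes no lock on purpose,
// so a webapp with a single writer pays nothing for synchronization.
class ConnectionContext {
    private final List<String> wire = new ArrayList<>();

    void write(String frame) {
        wire.add(frame);
    }

    List<String> sent() {
        return wire;
    }
}

class MultiThreadedSender {
    // The application opts in: both frames of one message are written
    // under the context's monitor so another thread cannot interleave.
    static void sendMessage(ConnectionContext ctx, String id) {
        synchronized (ctx) {
            ctx.write(id + ":first");
            ctx.write(id + ":last");
        }
    }

    // Runs several writer threads against one context and returns what
    // ended up on the simulated wire.
    static List<String> run(int threads, int messagesPerThread) {
        ConnectionContext ctx = new ConnectionContext();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int m = 0; m < messagesPerThread; m++) {
                    sendMessage(ctx, id + "-" + m);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ctx.sent();
    }
}
```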

The goal is that the IO of a webapp is done with two threads (one for
input, one for output), so there are really no syncs needed, and, as a
result, ideal scalability.

> - No filtering is a problem, it does not sound good to label it as "if
> needed". I simply don't understand how it would not be needed
> immediately.
>
>
> *** Give an example where it would be used. To implement what?
> People use TCP/IP all the time, is there a packet filter in the
> servlet spec so that developers can inspect packets? No. WebSocket
> frames are at a very low level.
> I am not saying never. I am saying it does not have to be in the first
> release.
> If you feel it is a must, I would rather have it than not have
> WebSocket.
> I just think it is a distraction.

Using any third party utility or framework will require a filtering API,
like there is for Servlets. Servlets (and JSP) didn't really take off
beyond the gimmick stage until frameworks appeared.
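
For the sake of discussion, something shaped like the Servlet filter
chain would probably do. The sketch below is only an assumption about
what it could look like; MessageFilter, MessageFilterChain, MessageHandler
and TrimFilter are invented names, and messages are plain strings to keep
it short:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical filter, modelled on the doFilter(..., chain) shape that
// Servlet filters use.
interface MessageFilter {
    void doFilter(String message, MessageFilterChain chain);
}

// Hypothetical application endpoint at the end of the chain.
interface MessageHandler {
    void onMessage(String message);
}

// One chain instance handles one message, like a FilterChain handles one
// request: each call to doFilter() advances to the next filter, and the
// handler is invoked once the filters are exhausted.
class MessageFilterChain {
    private final List<MessageFilter> filters;
    private final MessageHandler target;
    private int next = 0;

    MessageFilterChain(List<MessageFilter> filters, MessageHandler target) {
        this.filters = new ArrayList<>(filters);
        this.target = target;
    }

    void doFilter(String message) {
        if (next < filters.size()) {
            filters.get(next++).doFilter(message, this);
        } else {
            target.onMessage(message);
        }
    }
}

// The kind of thing a third party framework might plug in: it rewrites
// the message and passes it on.
class TrimFilter implements MessageFilter {
    public void doFilter(String message, MessageFilterChain chain) {
        chain.doFilter(message.trim());
    }
}
```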

> - The portion on lock design for output based on hijacking close()
> methods from streams is not right. The goal is to use one thread to
> service as many clients as possible [using a non blocking API], not to
>
>
> ** The threading model is split. The handlers are handled like
> Servlets.
> It is non-blocking. It is only the writing that needs sync and the
> writing can and most likely will happen in another thread.
>
> use multiple threads to service many clients using a blocking API [if it
> was non blocking, then this model is also wrong].

So that's not the right threading model; it doesn't scale very well.
Essentially, you need some event gathering (as async threads or
callbacks from some other container), two threads for IO, and the rest
of the callbacks from the container threads about messages on input.

> - I think integration with Servlet API needs something. There is no
> callback at all here, on purpose, but I don't agree with that choice.
>
>
> ** What would the callback do? Come up with a use case for it. I can't
> think of any.

You quoted the HTTP session yourself as if it were obvious. Most users
are going to want to access it, and the rest will want to access the
principal that was authenticated by the HTTP layer.
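
A callback for this does not need to be large. As a sketch under my own
assumptions (HandshakeInfo, HandshakeCallback and UserAwareEndpoint are
hypothetical, and the session is reduced to a map of attributes so the
example stands alone without the Servlet API):

```java
import java.security.Principal;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical snapshot of what the HTTP layer knew at upgrade time.
class HandshakeInfo {
    final Map<String, Object> sessionAttributes;
    final Principal principal; // null when the request was not authenticated

    HandshakeInfo(Map<String, Object> sessionAttributes, Principal principal) {
        this.sessionAttributes =
                Collections.unmodifiableMap(new HashMap<>(sessionAttributes));
        this.principal = principal;
    }
}

// Hypothetical callback invoked once, before any message is delivered.
interface HandshakeCallback {
    void onHandshake(HandshakeInfo info);
}

// An endpoint that keeps the user name and a session attribute for later.
class UserAwareEndpoint implements HandshakeCallback {
    private String user = "anonymous";
    private Object cart;

    public void onHandshake(HandshakeInfo info) {
        if (info.principal != null) {
            user = info.principal.getName();
        }
        cart = info.sessionAttributes.get("cart");
    }

    String user() {
        return user;
    }

    Object cart() {
        return cart;
    }
}
```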

Another thing the API would need is control over message fragmentation
into multiple frames. I don't see anything in your spec which prevents a
protocol from taking advantage of precise message fragmentation, so this
looks like a must to me.
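
To make the point concrete, here is a small sketch of what "control"
could mean: the caller chooses the fragment size and each fragment
carries the FIN flag that marks the last one. Fragment and Fragmenter are
invented for this mail, not taken from your spec:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical fragment: a slice of the message payload plus the FIN flag
// that marks the final fragment of the message.
class Fragment {
    final byte[] payload;
    final boolean fin;

    Fragment(byte[] payload, boolean fin) {
        this.payload = payload;
        this.fin = fin;
    }
}

class Fragmenter {
    // Splits a message into fragments of at most fragmentSize bytes.
    // An empty message still yields one empty fragment with fin set.
    static List<Fragment> split(byte[] message, int fragmentSize) {
        if (fragmentSize <= 0) {
            throw new IllegalArgumentException("fragmentSize must be positive");
        }
        List<Fragment> out = new ArrayList<>();
        int offset = 0;
        do {
            int end = Math.min(offset + fragmentSize, message.length);
            boolean last = end == message.length;
            out.add(new Fragment(Arrays.copyOfRange(message, offset, end), last));
            offset = end;
        } while (offset < message.length);
        return out;
    }
}
```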

-- 
Remy Maucherat <rmaucher_at_redhat.com>
Red Hat Inc