jsr340-experts@servlet-spec.java.net

[jsr340-experts] Re: Proposal for WebSocket to be part of JSR 340

From: Greg Wilkins <gregw_at_intalio.com>
Date: Fri, 7 Oct 2011 09:40:35 +1100

On 6 October 2011 19:51, Remy Maucherat <rmaucher_at_redhat.com> wrote:
> On Thu, 2011-10-06 at 12:04 +1100, Greg Wilkins wrote:

> Websockets messages can be made of any number of frames. So what's the
> point with having frames be unlimited in size ?

IMHO there is no point. I wanted a 2^16 max frame size (and there was
much support in the WG for that).
But the unlimited frame size was adopted because some argued that they
wanted to do send-file gather writes of large messages. I have no
idea why these large messages have to go as a single frame, and they
will probably get fragmented into small frames at the first
intermediary anyway!
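For context, the reason 2^16 is a natural cap is the draft's variable-length payload field: small payloads fit the base header, a 16-bit extended length covers up to 2^16-1, and only beyond that is the 64-bit field needed. A minimal sketch (helper name is mine, not from any spec):

```java
// Sketch of the draft WebSocket payload-length encoding: payloads up to
// 125 bytes fit the 7-bit field in the base header, up to 2^16-1 bytes
// use a 16-bit extended length, and anything larger needs the full
// 64-bit extended length field.
public class FrameLength {
    /** Extra length bytes needed beyond the 2-byte base header. */
    public static int extendedLengthBytes(long payloadLength) {
        if (payloadLength <= 125) return 0;     // fits the 7-bit field
        if (payloadLength <= 0xFFFF) return 2;  // 16-bit extended length
        return 8;                               // 64-bit extended length
    }
}
```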


> Thanks for the link. I read there Jetty supports the full frame size
> with your buffer based API, is that really accurate ?

If Jetty receives a stupidly large frame, then it fragments it on
arrival, i.e. each time the buffer fills, it creates a false fragment
frame and passes that up the food chain. If the API is working in
buffered message mode, then that will eventually hit the message limit
and fail, but Jetty does allow frame-by-frame processing as well, so
it could consume a stupidly large frame. We can't generate frames
larger than 2^31 bytes (the max size of a byte[]).
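Roughly, that fragment-on-arrival behaviour looks like the following (a sketch with names of my own choosing, not Jetty's actual internals):

```java
import java.util.ArrayList;
import java.util.List;

// Rough sketch of fragment-on-arrival: rather than buffering an entire
// over-size frame, the container emits a synthetic fragment each time
// its read buffer fills. Class and method names are illustrative only.
public class Fragmenter {
    /** Split a payload into fragments of at most maxFragment bytes. */
    public static List<byte[]> fragment(byte[] payload, int maxFragment) {
        List<byte[]> fragments = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += maxFragment) {
            int len = Math.min(maxFragment, payload.length - offset);
            byte[] chunk = new byte[len];
            System.arraycopy(payload, offset, chunk, 0, len);
            fragments.add(chunk);
        }
        return fragments;
    }
}
```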

> Realistically, we're not going to be able to arbitrarily limit this
> here, it was ok for early adopters impl, but moving to production it is
> not. 16MB is way way too big to buffer and scale already anyway IMO.

I agree that 16MB is too large.

But we could have an in-container limit without it being arbitrary:
we could expose a setter that lets the application set the maximum
message size it would like the container to buffer before being called
back with the complete message.

Sure this can still result in large messages in memory, but having a
streaming interface does not solve this, it just puts the buffer
management beyond the control of the container and applications will
use inefficient growing StringBuilders instead.
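One possible shape for that container-side limit (entirely hypothetical names, not a spec proposal):

```java
import java.io.ByteArrayOutputStream;

// Hypothetical container-side aggregator: the application sets the
// maximum message size it is willing to have buffered; the container
// appends fragments and fails the message once the limit is exceeded.
public class MessageAggregator {
    private final int maxMessageSize;  // set by the application
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    public MessageAggregator(int maxMessageSize) {
        this.maxMessageSize = maxMessageSize;
    }

    /** Append a fragment; returns false (and resets) if the limit is exceeded. */
    public boolean append(byte[] fragment) {
        if (buffer.size() + fragment.length > maxMessageSize) {
            buffer.reset();  // fail the over-size message
            return false;
        }
        buffer.write(fragment, 0, fragment.length);
        return true;
    }

    /** Called on the final fragment: deliver the complete message. */
    public byte[] complete() {
        byte[] message = buffer.toByteArray();
        buffer.reset();
        return message;
    }
}
```

The point is that the buffer management stays inside the container, where it can be pooled and bounded, instead of in an application-grown StringBuilder.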

Also, no matter what we do, there are already arbitrary limits in the
browsers. If we try to send big messages to the browsers, they will
just fail when >16MB. So having a streaming interface is kind of
pretending that big messages will always be acceptable when they
won't. We need to have drivers towards reasonably sized messages.


>> So I strongly believe that we should offer a simple message based API
>> (byte[] and String), with limits imposed and enforced by the
>> container, which will run the correct error conversations and any
>> negotiations as needed.
>
> Usually when a frame is that large, it is because the application is
> about file transfer or media content distribution. In either case, the
> application does not need a full frame to do something. A buffer API is
> not protective, it will only waste a lot of memory in many cases.

True. If an application wants to transfer files over WebSocket in a
single message (regardless of frame size), then asking the container
to buffer the entire message is wasteful.



> So I don't see how to avoid using streams ... Unfortunately, that's
> where the complexity kicks in to have non blocking.

There are several solutions other than streaming:

1) The container can buffer large messages to disk (like the current
file upload). Limits would still be needed to avoid filling up the
disk.
2) Break the uploaded file into multiple small WS messages - this can
be encouraged by the container capping how large the message limit may
be set.
3) Have an API that supports both messages and streaming.

I'm not opposed to having streaming as an option. But it really
should not be the normal API as the vast majority of WS usage is going
to be sending simple XML/JSON messages and streaming them into buffers
or parsers will just expose applications to unnecessary complexity and
the need to enforce memory limits.


> The protocol is not final yet, we can still hope for some unexpected
> reversal, and they suddenly drop the 64bit size for frames (effectively
> limiting them to 16bit, which would make a buffer API great).

We can hope... but unfortunately it is not going to happen.

Besides, it does not help us, because the intent of the WS design is
to have semantic meaning only at message boundaries, so small frames
can still be used to build huge messages.



> For API design options, assuming the 64bit limit stays, we could use a
> pure stream based API, with non blocking flags like Rajiv's first
> proposal. This would make it look a bit similar to classic Servlet, with
> added controls for frame fragmentation on output. It is also possible to
> have a companion buffer (ByteBuffer and CharBuffer please, I'd like to
> avoid straight byte[] and Strings to be honest) based API which would be
> used when the frames have a 16bit length, it could help a lot folks who
> design protocols that only use this type of frame (essentially rewarding
> them for being reasonable).

Sure, this is my option 3 above. I still question if the effort for
streams will be worthwhile given that arbitrary limits already exist,
but if others feel it is necessary then supporting both APIs is best.
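To make option 3 concrete, the dispatch between the two APIs might look something like this (all names here are hypothetical, not proposed spec API):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Illustrative dispatcher for a dual API: messages at or under the
// buffered limit arrive whole as a ByteBuffer; anything larger falls
// back to a streaming callback. Names are mine, for illustration only.
public class MessageDispatcher {
    private final int bufferedLimit;
    private final Consumer<ByteBuffer> onMessage;   // buffered callback
    private final Consumer<InputStream> onStream;   // streaming fallback

    public MessageDispatcher(int bufferedLimit,
                             Consumer<ByteBuffer> onMessage,
                             Consumer<InputStream> onStream) {
        this.bufferedLimit = bufferedLimit;
        this.onMessage = onMessage;
        this.onStream = onStream;
    }

    /** Deliver a complete message through whichever callback applies. */
    public void deliver(byte[] payload) {
        if (payload.length <= bufferedLimit) {
            onMessage.accept(ByteBuffer.wrap(payload));
        } else {
            onStream.accept(new ByteArrayInputStream(payload));
        }
    }
}
```

The common XML/JSON case only ever sees the buffered callback; the streaming path exists purely as an escape hatch for over-limit messages.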

+1 for CharBuffer and ByteBuffer

cheers