Hello everyone - apologies for the late response - I got myself stuck in
limbo.
As an exercise in testing the capabilities of the currently proposed API, I
reviewed over 30 in-production WebSocket applications, written in a handful
of languages (Java, Python, Node, Ruby, C++). I was looking for existing
uses of WebSockets where it was not obvious (or not possible) how to
recreate the feature with the proposed API.
These were the blockers I found. I know many of these have been discussed
already - I just wanted to summarize them in one place:
** Remote User*
Virtually every WebSocket handler I encountered cared about the end user.
While we expose HttpSession, we do not provide access to the user Principal.
** HTTP headers, cookies, etc*
Similarly, many looked at HTTP headers at handshake time - mostly to access
cookies.
** User explicit ping/pong*
A few 'mission critical' applications explicitly send and react to pings
and pongs to establish a more reliable heartbeat protocol. In some cases,
this is to ensure that irregular disconnection or frozen devices are
spotted promptly (from perspectives of both client and server). In other
cases, WebSockets are being used as a gateway to other systems and the
pings/pongs are passed through to allow the backend systems to respond -
providing more reliable end-to-end health checking.
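To illustrate the kind of application-level heartbeat these apps implement, here is a minimal sketch. It assumes the API would expose some way to send a ping and be called back on a pong; HeartbeatMonitor and its methods are hypothetical names, not part of the proposed API. Timestamps are passed in explicitly so the logic does not depend on any particular clock.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: remembers the last pong and reports a silent peer. */
final class HeartbeatMonitor {
    private final long timeoutNanos;
    private long lastPongNanos;

    HeartbeatMonitor(long timeout, TimeUnit unit, long nowNanos) {
        this.timeoutNanos = unit.toNanos(timeout);
        this.lastPongNanos = nowNanos;
    }

    /** Call from whatever pong callback the API ends up exposing (assumption). */
    synchronized void onPong(long nowNanos) {
        lastPongNanos = nowNanos;
    }

    /** True if no pong has arrived within the timeout, i.e. the peer looks dead. */
    synchronized boolean isStale(long nowNanos) {
        return nowNanos - lastPongNanos > timeoutNanos;
    }
}
```

An application would typically send a ping on a timer, feed System.nanoTime() into onPong() and isStale(), and close the connection once isStale() returns true.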
** Cross cutting concerns*
It wasn't clear whether javax.servlet.Filters would intercept the
WebSocket. In some of the apps there are cross-cutting interceptors that
perform functions such as: authentication/authorization, auditing
(recording all messages across all endpoints for compliance reasons),
monitoring (timings, throughput, latency, etc), and session tracking (who's
currently connected).
** Upgrading HTTP requests*
It is not clear how to define a URL endpoint that can handle both vanilla
HTTP requests (i.e. a Servlet) and WebSockets, depending on whether an
upgrade was requested.
** Non-blocking closes*
While there are non-blocking versions of the send() methods, there is no
way to do a non-blocking close().
** Callback threads*
The majority (but not all) of the WebSocket apps I looked at used a
single-threaded non-blocking (reactor) approach. These were pretty simple to
reason about from a concurrency perspective because you know only one
callback is ever executing at a time, so no locking etc. Although the
proposed API offers lots of eventy non-blocking methods (hasOpened,
onMessage, setResult, etc), it does not make the threading model clear -
callbacks could occur on any thread, so the user has to go to some lengths
to coordinate these. Passing a java.util.concurrent.Executor to any method
that results in a callback would be a simple and flexible way to remedy
this, and would cater for both the non-blocking single event loop and
blocking multi-threaded usages.
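A rough sketch of the Executor idea, to make it concrete. TextListener and CallbackDispatcher are hypothetical stand-ins for whatever the API would actually provide; the only real type used is java.util.concurrent.Executor. The point is simply that the caller, not the container, decides which thread each callback runs on.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;

/** Hypothetical listener type, for illustration only. */
interface TextListener {
    void onMessage(String text);
}

/** Hypothetical dispatcher that runs each callback on a caller-supplied Executor. */
final class CallbackDispatcher {
    /** Pairs a listener with the Executor its callbacks must run on. */
    private static final class Registration {
        final TextListener listener;
        final Executor executor;

        Registration(TextListener listener, Executor executor) {
            this.listener = listener;
            this.executor = executor;
        }
    }

    private final List<Registration> registrations = new CopyOnWriteArrayList<>();

    void addListener(TextListener listener, Executor executor) {
        registrations.add(new Registration(listener, executor));
    }

    /** Would be called by the container when a message arrives (assumption). */
    void dispatch(final String text) {
        for (final Registration r : registrations) {
            // Hand the callback to the listener's own Executor instead of
            // invoking it on the container's I/O thread.
            r.executor.execute(() -> r.listener.onMessage(text));
        }
    }
}
```

Registering every listener with the same Executors.newSingleThreadExecutor() instance would give the reactor-style guarantee that only one callback runs at a time, while a thread pool would suit blocking, multi-threaded handlers.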
---- Minor API questions/suggestions/opinions ----
*Endpoint.hasXXX()*: More of a stylistic thing, but elsewhere the onXXX
convention is used - can we be consistent?
*Endpoint.handleError()*: This seems a bit generic. At the very least, we
should be explicit about whether this is for handling uncaught exceptions in
user code (MessageListeners), or network related problems - which require
very different treatment depending on the nature of the application.
*Endpoint.handleError()*: Should this be able to handle Throwable (not just
Exception)?
*Session.addEncode()/Session.addMessageListener()*: Should we also supply
equivalent remove methods? An example case for this is when an initial
listener needs to replace itself with a more appropriate listener once it
learns more about the client.
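A small sketch of the self-replacing listener case, assuming a remove method existed. ListenerRegistry stands in for Session, and SwapListener, removeMessageListener() and HandshakeListener are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener type, for illustration only. */
interface SwapListener {
    void onMessage(String text);
}

/** Stand-in for Session, with the proposed (hypothetical) remove method. */
final class ListenerRegistry {
    // Copy-on-write so a listener can safely modify the list during delivery.
    private final List<SwapListener> listeners = new CopyOnWriteArrayList<>();

    void addMessageListener(SwapListener l) { listeners.add(l); }

    void removeMessageListener(SwapListener l) { listeners.remove(l); }

    void deliver(String text) {
        for (SwapListener l : listeners) {
            l.onMessage(text);
        }
    }
}

/** Consumes the first message, then swaps in a more appropriate listener. */
final class HandshakeListener implements SwapListener {
    private final ListenerRegistry session;
    private final SwapListener next;

    HandshakeListener(ListenerRegistry session, SwapListener next) {
        this.session = session;
        this.next = next;
    }

    @Override
    public void onMessage(String text) {
        // A real handler would inspect the first message here to pick 'next'.
        session.removeMessageListener(this);
        session.addMessageListener(next);
    }
}
```

Without a remove method, the initial listener would have to stay registered and forward or ignore every later message itself.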
*Session.setMaximumMessageSize()/Session.setTimeout()*: What are the
defaults if these are not set (container specific)? Does the user have the
facility to set infinite values (i.e. never time out)?
*Session.isActive()/Session.isSecure()*: Danny, I think you messed up the
source code here... isSecure() does not appear in the JavaDoc, and
isActive() contains a bit of isSecure().
*Session.getHttpSession()*: HttpSessions are typically considered server
side things - this makes little sense for a WebSocket client. Although we
want to maintain as much API consistency as possible between server and
client, we also need to cater for things that don't make sense in both
contexts.
*Session*: I can't imagine doing anything useful without the HttpRequest
(allowing users to get things like user Principal, request parameters, HTTP
headers, cookies, remote host, etc). At the same time I see the argument
against (especially in a client, or non Servlety deployment). Maybe there
should be sub-interfaces that provide more environment specific
capabilities (e.g. ClientSession, ServerSession).
*Session*: Vocabulary thing... Session is such an overloaded term in the
JEE space. How about something with a more meaningful description, like
Connection?
*RemoteEndpoint*: If the user is not using a custom encoder, what would
type <T> be? Would they have to pass RemoteEndpoint<Void> around their code?
*EncodeException/DecodeException*: Given that encoding is an IO related
activity, should these extend IOException?
*ServerConfiguration.checkOrigin()/getPreferredSubprotocol()*: It wasn't
clear how to change the behavior here. Is ServerConfiguration meant to be
subclassed? This seems a little messy as it inherits members which are
clearly not meant to be overridden (e.g. getURI()).
*MessageListener*: This would be cleaner if it were generified with a void
onMessage(T data) method, as opposed to an empty interface. It would avoid
instanceof checks and casting elsewhere in infrastructure and cross-cutting
services (e.g. generic MessageListener decorators). Ditto for
Encoder/Decoder.
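For example, a generified listener lets a cross-cutting decorator be written once for any message type. This is only a sketch: TypedListener and AuditingListener are hypothetical names, not types from the proposed API.

```java
import java.util.List;

/** Hypothetical generified listener, as suggested above. */
interface TypedListener<T> {
    void onMessage(T data);
}

/** Generic decorator: records every message, then delegates. No casts needed. */
final class AuditingListener<T> implements TypedListener<T> {
    private final TypedListener<T> delegate;
    private final List<? super T> auditLog;

    AuditingListener(TypedListener<T> delegate, List<? super T> auditLog) {
        this.delegate = delegate;
        this.auditLog = auditLog;
    }

    @Override
    public void onMessage(T data) {
        auditLog.add(data);        // cross-cutting concern (auditing)
        delegate.onMessage(data);  // original handler, type preserved
    }
}
```

With an empty marker interface, the same decorator would need an instanceof check and a cast for each concrete listener type it wanted to wrap.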
*MessageListener.onMessage()*: It would be convenient if this also provided
a Session parameter, so a single MessageListener instance could be used to
handle multiple connections.
*Close.Code.TLS__HANDSHAKE_FAILURE*: Why the double underscore?
---- Other thoughts ----
** Streaming bytes*
I hear the argument for being able to open up streams for writing bytes -
especially the convenience of working with existing APIs that expect
streams. But I'm skeptical about using streams for handling large messages -
most WebSocket apps are going to be interacted with from a browser and the
browser side API does not provide streaming of messages - you have to work
with a message in its entirety.
** Annotations*
I have no objection to the annotations, but at the same time, I'm not
seeing how they make things better. Could someone please clarify?
cheers
-Joe