[jsr340-experts] Re: Servlet 3.1 EDR updated draft

From: Remy Maucherat <rmaucher_at_redhat.com>
Date: Thu, 31 May 2012 11:51:21 +0200

On Thu, 2012-05-31 at 11:13 +0200, Greg Wilkins wrote:
> There is a big difference in buffering.
>
> Currently if someone does write(OneGigByteArray), then a container is
> able to use arbitrary sized buffers to send that. I.e., I can use a few
> 16k buffers to slice off bits of the 1G, encrypt them as SSL etc., and
> send that 1G. I.e., the container is in control of its memory
> footprint.
>
> But with what you propose, the container will have to buffer
> OneGigMinusALittleBit. The container is now out of control of its
> memory footprint and effectively is doubling down on any allocations
> the application makes!
>
> I think it is far better to use option c), where the container does not
> copy the passed byte arrays and just keeps a reference to them. In
> many cases they will be written immediately (either into a buffer,
> into ssl or onto the network) and the buffer can be reused
> immediately. But in the case that canWrite returns false after a
> write, then the passed byte array should not be changed until canWrite
> is true again. This gives pretty much the same semantics, while
> allowing the container to control its memory footprint.

With option c), the container has to keep a reference to the one GB byte
array, so the same memory is still referenced. But this sort of
application will never be able to actually work anyway, so why are we
discussing it? Option b) is simpler for the user and also allows keeping
the current API; that's the main benefit.
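
To make the trade-off concrete, here is a minimal sketch of the blocking
case Greg describes, where the container's extra footprint stays at one
fixed chunk however large the application's array is. The class name,
the method name, and the plain OutputStream used as the sink are
assumptions for illustration, not code from any container.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not taken from any container's source.
final class ChunkedWriteSketch {
    static final int CHUNK_SIZE = 16 * 1024; // the "16k buffers" mentioned above

    /** Sends data through one reusable chunk buffer; returns the number of chunks sent. */
    static int writeInChunks(byte[] data, OutputStream sink) {
        byte[] chunk = new byte[CHUNK_SIZE]; // the only extra memory the "container" holds
        int chunks = 0;
        try {
            for (int off = 0; off < data.length; ) {
                int len = Math.min(CHUNK_SIZE, data.length - off);
                System.arraycopy(data, off, chunk, 0, len); // stand-in for SSL encryption etc.
                sink.write(chunk, 0, len);                  // a blocking write of one chunk
                off += len;
                chunks++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int chunks = writeInChunks(new byte[40_000], sink);
        System.out.println(chunks + " chunks, " + sink.size() + " bytes");
    }
}
```

With a non-blocking write there is no point at which the loop can wait,
so the unsent remainder has to be either copied (option b) or kept by
reference (option c).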

If it makes you more comfortable, maybe it would be reasonable to
restrict write(byte[]) to a maximum size when using it in async mode
(maybe to a multiple of the Servlet buffer size). Since the point of the
API is to be able to scale, since this usage pattern cannot scale
regardless of the technique being used, and since the user has to write
new code to use the async mode anyway, it can probably be considered
[user education through IAE (IllegalArgumentException), yah].
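
A rough sketch of what such a restriction could look like follows. The
class, the constructor parameters, and the choice of limit are all
hypothetical; nothing like this exists in the Servlet API.

```java
// Hypothetical guard; neither the class nor the limit exists in the Servlet API.
final class AsyncWriteLimitSketch {
    private final int maxWriteSize;

    /** The limit is assumed to be a multiple of the Servlet buffer size. */
    AsyncWriteLimitSketch(int servletBufferSize, int multiple) {
        this.maxWriteSize = Math.multiplyExact(servletBufferSize, multiple);
    }

    /** Rejects oversized arrays up front instead of buffering them. */
    void checkWrite(byte[] b) {
        if (b.length > maxWriteSize) {
            throw new IllegalArgumentException(
                "async write of " + b.length + " bytes exceeds limit of " + maxWriteSize);
        }
    }
}
```

An application that hits the exception would have to split its data
into smaller writes, which is the usage pattern the async API is meant
to encourage in the first place.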

[The "one GB array" is the main use case for a memory-mapped ByteBuffer,
written through a new write(ByteBuffer), since that is the only way to
avoid killing the server. The consensus about it was "maybe later",
though; like Tomcat's sendfile, it effectively bypasses the Servlet API
IO.]
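
For reference, the memory-mapped approach can be sketched with plain
java.nio as below. The WritableByteChannel stands in for the
hypothetical write(ByteBuffer), and the class and method names are made
up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; write(ByteBuffer) is not part of the Servlet API discussed here.
final class MappedSendSketch {
    /** Maps the file read-only and writes it to out; returns the number of bytes written. */
    static long sendMapped(Path file, WritableByteChannel out) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping lives outside the Java heap; a single mapping is
            // limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long written = 0;
            while (buf.hasRemaining()) {
                written += out.write(buf);
            }
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes size zero bytes to a temp file and sends it to an in-memory sink. */
    static long demo(int size) {
        try {
            Path tmp = Files.createTempFile("mapped-send", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[size]);
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            return sendMapped(tmp, Channels.newChannel(sink));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```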

> >> Also I'm not exactly sure how often WriteListener.onWritePossible is
> >> called? Is it called after every write? I.e., if I write 1 byte and
> >> there is no blocking, will it be called?
> >
> > It will be called once, Rajiv clarified it. Presumably, after canWrite
> > has actually been called and returned false (otherwise, the Servlet
> > can't really know about the state).
>
> So if the app does not call canWrite, but keeps calling write(bytes)
> write(bytes) write(bytes), then the container will have to infinitely
> buffer that data?

The specification already says it is illegal to continue calling write
if it would block. Same for read if there's no content.

> Perhaps we need to say that once we are in async mode then canWrite
> must be called between every write. If it is not called, then write
> will throw ISE.

The spec does not say that an ISE is thrown; it only says it is
"illegal" at the moment.

In my current API/implementation, however, this is not illegal: it simply
triggers blocking if you insist on multiple writes (or reads) after
the ready flag flips. So this allows more sophisticated usage for some,
but others may not understand what they are doing.
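
The two behaviors can be contrasted with a toy model like the one
below. It is a loose simulation under my own assumptions (the class,
the Policy enum, and the drain() method are invented), not the draft
API or any container's implementation.

```java
// Toy model of the "ready flag"; all names here are hypothetical.
final class ReadyFlagSketch {
    enum Policy { THROW_ISE, BLOCK }

    private final Policy policy;
    private final int capacity;  // bytes the simulated socket accepts before it is "full"
    private int pending;         // bytes accepted but not yet drained
    private boolean ready = true;
    int blockingWrites;          // how many writes had to wait for the socket

    ReadyFlagSketch(Policy policy, int capacity) {
        this.policy = policy;
        this.capacity = capacity;
    }

    boolean canWrite() {
        return ready;
    }

    void write(byte[] b) {
        if (!ready) {
            if (policy == Policy.THROW_ISE) {
                throw new IllegalStateException("write while not ready");
            }
            drain();             // stand-in for blocking until the socket drains
            blockingWrites++;
        }
        pending += b.length;
        if (pending >= capacity) {
            ready = false;       // the flag flips; further writes would block
        }
    }

    /** Simulates the network draining; a real container would then notify the listener. */
    void drain() {
        pending = 0;
        ready = true;
    }
}
```

With THROW_ISE, a second write after the flag flips fails immediately;
with BLOCK, the same call succeeds but quietly gives up the non-blocking
behavior, which is the part a less careful user may not notice.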

> >> Finally, how does asynchronous mode interact with
> >> Response.resetBuffer, Response.setBufferSize etc.? When writing
> >> nonBlocking, can we write data < the buffer size and then reset the
> >> buffer?
> >
> > What had been said previously was that the IO extensions would respect
> > the existing Servlet API buffering. Of course, that was when async for
> > reader/writer was still present.
> >
> > All this could be present in the spec document, of course, for better
> > understanding.
>
> Lots has been said previously one way or the other. The spec document

Since the API remains the same, unless it says that some behavior
changes, what is described elsewhere in the specification is unchanged.
Although I believe that things related to the threading model, the
state, and errors should be detailed more, I am not convinced this
particular item needs more attention.

-- 
Remy Maucherat <rmaucher_at_redhat.com>
Red Hat Inc