[jsr340-experts] Re: Async IO and Upgrade proposal updated

From: Remy Maucherat <rmaucher_at_redhat.com>
Date: Fri, 30 Mar 2012 17:38:46 +0200

On Sat, 2012-03-31 at 01:24 +1100, Greg Wilkins wrote:
> On 30 March 2012 21:35, Remy Maucherat <rmaucher_at_redhat.com> wrote:
> > No it's not better, it is much worse actually. An application using big
> > buffers is an easy DoS target. All you need is to read slowly and the
> > container has to sit on its buffer until the write is complete.
>
> That is only the case if the buffer is in the implementation and
> accumulates unrelated writes.
>
> But I'm proposing that the buffers written are created by the
> application to be large enough to contain the entity being sent. This
> could be a fixed sized frame of a protocol or a growing array buffer
> to convert an object to JSON. Such buffers do not grow when the
> consumer is slow. They are created by the application in order to
> write the entity.

Thanks for the clarification on the real purpose of what you are
proposing. It is then related to the suspend/resume model, with the
benefit of saving one thread over classic java.io when actually writing
the data.

As a more serious matter, I also don't see how this model fits the
IO requirements of upgraded connections unless assumptions are made about
the protocol [which may work well enough for websockets].

> > Let's pretend that my example was with 20KB buffers then, that remains
> > reasonable.
> > ...
> >
> > I can do a single os.write(my20KBuffer); too ...
>
> No you can't. The max size of the write is imposed by the internal
> buffer of the implementation and the writer does not get to choose the
> size of the write. As you pointed out, the internal buffers are
> likely to be small, so you probably can't increase it to 20K! Even if
> they know their entity is 4k+1 bytes, then they have to do it in 2
> writes if canWrite returns 4k.

So, it is actually: yes I can.

The container will do non blocking writes until it exhausts the 20KB
buffer. Two options then:
a) If it doesn't manage to write all 20 KB, it flips the canWrite
flag and stores the remaining bytes that couldn't be written in a
leftover buffer. This uses some amount of memory (< 20 KB), but the
application buffer is given back to the application (it may reuse it or
discard it, its choice). Polling on the connection then occurs, and when
the connection is available for writing, the container sends the
leftovers, still using non blocking writes (this is internal
processing). When it is done, it notifies the application through the
WriteListener.
b) If all bytes have been written, then no callback, no nothing, no
cost, the application can fill its buffer and write again (if it needs
to).
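
To make the two options concrete, here is a minimal sketch of the
container side. It is only an illustration under stated assumptions, not
the proposed API: LeftoverWriteSketch, NonBlockingSink, WriteCallback and
onSocketWritable are hypothetical names. NonBlockingSink stands in for a
socket in non blocking mode, and WriteCallback for the WriteListener
notification.

```java
import java.nio.ByteBuffer;

// Hypothetical container-side writer; none of these names come from the proposal.
class LeftoverWriteSketch {

    /** Assumed stand-in for a non blocking socket: returns how many bytes it accepted. */
    interface NonBlockingSink {
        int tryWrite(ByteBuffer src);
    }

    /** Assumed stand-in for the WriteListener notification. */
    interface WriteCallback {
        void onWritePossible();
    }

    private final NonBlockingSink sink;
    private final WriteCallback callback;
    private ByteBuffer leftover;       // bytes the socket has not accepted yet
    private boolean canWrite = true;

    LeftoverWriteSketch(NonBlockingSink sink, WriteCallback callback) {
        this.sink = sink;
        this.callback = callback;
    }

    boolean canWrite() {
        return canWrite;
    }

    /** Application-facing write: on return, the caller's array is free for reuse. */
    void write(byte[] data) {
        if (!canWrite) {
            throw new IllegalStateException("previous write still pending");
        }
        ByteBuffer src = ByteBuffer.wrap(data);
        // Non blocking writes until the data is exhausted or the socket stops accepting.
        while (src.hasRemaining() && sink.tryWrite(src) > 0) {
            // keep writing
        }
        if (src.hasRemaining()) {
            // Option a: copy what is left, flip the flag, wait for the poller.
            leftover = ByteBuffer.allocate(src.remaining());
            leftover.put(src);
            leftover.flip();
            canWrite = false;
        }
        // Option b: everything was written, so no callback and no extra memory.
    }

    /** Called by the container's poller when the connection is writable again. */
    void onSocketWritable() {
        if (leftover == null) {
            return;
        }
        while (leftover.hasRemaining() && sink.tryWrite(leftover) > 0) {
            // keep draining the leftovers
        }
        if (!leftover.hasRemaining()) {
            leftover = null;
            canWrite = true;
            callback.onWritePossible(); // notify the application only once fully drained
        }
    }
}
```

In this sketch the extra memory held per connection is bounded by the
size of the last application write, and the callback only fires for
writes that could not complete immediately.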

This also means the application knows how many WriteListener callbacks
it goes through for each connection, which may allow it to make quality
of service decisions based on the content when the client fails to
accept data fast enough (e.g. some video content could switch to a lower
bitrate).

> > No, the NIO 2 has one callback per write. So it has more callbacks, it is simple logic.
>
> If you can write a large entity in a single write, then you will have
> less callbacks with NIO.2 style than if you have to write it in many
> partial writes. If you have to write many small entities, then NIO.2
> style will have more callbacks. So unfortunately there is no
> convenient simple one-size-fits all solution.
>
> As you say, the EE expense on the call backs makes getting the right
> style for the right entities critical.

Of course, if users are restricted to a certain model to get any
scalability gains, they'll have to use it. But it uses a large amount of
memory, so the gains are relatively limited. We'd have yet another round
of incremental IO functional improvements in Servlet.next rather than be
"done" (ahem) after this one.

> > BTW, you seem most interested in pushing a solution that would require the least changes in your container.
>
> Not at all. We currently don't expose any async IO to the servlet
> container and either style is about the same effort to implement.
> However, I guess that makes my advocacy in the other thread for the
> option of having no async IO somewhat biased. Oh well, we all have
> our baggage.

Yes, I remember advocacy of async object serialization as a means to
generate responses.

-- 
Remy Maucherat <rmaucher_at_redhat.com>
Red Hat Inc