jsr369-experts@servlet-spec.java.net

[jsr369-experts] Re: [servlet-spec users] Re: Race between error page dispatch and async servlet

From: Stuart Douglas <sdouglas_at_redhat.com>
Date: Thu, 6 Apr 2017 15:43:28 +1000

On Thu, Apr 6, 2017 at 3:14 PM, Greg Wilkins <gregw_at_webtide.com> wrote:
> Stuart,
>
> Ah good idea to add such an IO event to the input queue so it occurs once an
> IO operation is attempted. This would indeed help with the most immediate
> manifestation of the problem we have.
>
> However, many of our http/2 users are interested in fail fast semantics that
> are available with HTTP/2 and we have discussed here how such errors can be
> delivered.... perhaps the can be delivered to any AsyncListener onError if
> registered and then in any subsequent IO operations.

I think this may need to be a new listener, as this is a new semantic
with regards to thread safety that may cause problems.

In particular at the moment onError is only called after a dispatch()
call, so the Request/Response objects are basically 'owned' by the
thread doing the dispatch, and the onError call will be called by that
thread (there could still be other user threads active, however then
it is the users responsibility to synchronize access to the
Request/Response objects).

Invoking the onError method in a different thread due to connection
failure could break existing listeners that are not coded in a thread
safe manner.

In addition to this every mechanism by which we can get a 'fail fast'
result implies that the underlying connection is broken and no
response can be sent. As a result it is pointless to generate an error
page, which I think most existing onError implementation will be
trying to do.

If users want fail fast support I think we need to explicitly add
support for it, otherwise failures should be reported when the user
attempts IO as normal (with the potential for container specific
support for fail fast).

Stuart

>
> But note that even if we so handle such IO events, I think it is still
> possible for the problem to occur when handling an exception thrown from a
> dispatch to a servlet. The container still needs to generate an error page,
> but the application may have already commenced an async write mechanism that
> can race with the error page.

In this case I do not start the async write process until the
dispatch() call stack has returned, which side steps this problem.

Stuart

>
> cheers
>
>
>
>
> On 6 April 2017 at 14:05, Stuart Douglas <sdouglas_at_redhat.com> wrote:
>>
>> On Thu, Apr 6, 2017 at 1:37 PM, Greg Wilkins <gregw_at_webtide.com> wrote:
>> >
>> > We are trying to deal with a race condition in Jetty that we believe is
>> > kind
>> > of inherent in the spec rather than our handling of it. It is more
>> > apparent with HTTP/2 usage but I think that is can also happen with
>> > HTTP/1
>> > is very specific circumstances. I'm wondering of other containers have
>> > encountered this problem and perhaps if there is something we can do to
>> > help
>> > avoid it.
>> >
>> > The issue is that if the application is using async to generate a
>> > response
>> > from a non container thread, then there is an intrinsic race condition
>> > if an
>> > error occurs and an error page needs to be generated.
>> >
>> > On HTTP/1 and error page would only need to be generated if a prior
>> > dispatch/callback throws an exception. With HTTP/2 error pages
>> > dispatches
>> > can occur at any time because of a packet that has been received over
>> > the
>> > stream (which is continuously being read).
>>
>> Unless I have misunderstood what you mean this is not possible in our
>> implementation. If a packet that results in an error (e.g. a
>> RST_STREAM) is read from the wire the application will not know about
>> it until it attempts an IO operation, where it will receive an
>> IOException it can handle however it wants. This is similar to the
>> situation in HTTP/1.1, if the remote peer has sent a RST packet then
>> the application will not know about it until it attempts IO.
>>
>> I think the spec should prohibit the container from arbitrarily
>> calling sendError based on async IO events, as that way lies a
>> minefield of threading issues that will basically require all state
>> relating to the response to be thread safe with the associated
>> performance penalty.
>>
>> Stuart
>>
>> >
>> > The problem in more detail is that to generate an error page the
>> > container:
>> >
>> > checks if the response is committed - if so abort connection as best as
>> > possible (but may look like a normal EOF in HTTP/1.0)
>> > clears the response buffer
>> > sets the response status code/reason
>> > sets some response attributes to describe the error
>> > Either generates a page itself or does an DispatcherType.ERROR dispatch
>> > to
>> > the application to generate the error page.
>> >
>> > However in parallel the application may have async mechanisms that can
>> > mutate the response by:
>> >
>> > write to the response
>> > commit the response either by writing or by explicit flush
>> > set the response status code/reason
>> > call AsyncContext.complete() or other methods
>> >
>> > Now the handling for the last of these is well defined, because we can
>> > have
>> > an exact before/after relationship between the call to complete and the
>> > decision to do an error dispatch.
>> >
>> > However, any of the other mutating methods may happen concurrently with
>> > the
>> > container/application generating an error response. So all sorts of
>> > nasty
>> > outcomes are possible: async write content in the middle of the error
>> > page,
>> > async status code sent instead of error page; application response
>> > written
>> > with error status but normal content, illegal state exception when
>> > resetting
>> > buffers because response committed asynchronously etc. etc. The exact
>> > form
>> > of the disaster is container dependent as the order things are done for
>> > the
>> > error page is not defined.
>> >
>> > Now currently this is not too much of a huge issue, as the kinds of
>> > errors
>> > that we may get are typically fatal to the connection/stream (eg
>> > receiving
>> > a HTTP/2 stream reset), so an error page is unlikely to be successfully
>> > generated, but we still need to make the effort to inform the
>> > application
>> > that an error has occurred even if a page cannot be generated. Plus
>> > also in
>> > future there may be other error conditions from a HTTP/2 connection
>> > violating some server imposed constraints that do require a error page
>> > to be
>> > generated.
>> >
>> > We could try to handle this with some kind of mutual exclusion on the
>> > response methods, but generating an error page is a multi operation
>> > process
>> > that can call the application, so holding a lock while calling
>> > application
>> > code is just asking for problems.
>> >
>> > A better approach for Servlet 4.0 may be to discourage arbitrary threads
>> > from being able to call the response API. Instead we could have a
>> > variant
>> > of AsyncContext.start(Runnable), that guarantees that the Runnable will
>> > not
>> > be run concurrently with any container dispatched thread on the
>> > associated
>> > transaction. We already give that guarantee for container callback and
>> > dispatches, but it is impossible for an async thread to enforce non
>> > concurrency with any container action and/or container managed
>> > callbacks.
>> >
>> > So lets call the method AsyncContext.run(Runnable) and if called when no
>> > container managed thread is active, then the Runnable would be called
>> > immediately by the calling thread (in context scope like start),
>> > otherwise
>> > the Runnable would be queued to be called by a container managed thread
>> > immediately after the container dispatch is complete and no other
>> > container
>> > dispatch would occur until after the Runnable returns.
>> >
>> > Thoughts
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Greg Wilkins <gregw@webtide.com> CTO http://webtide.com
>
>
>
>
> --
> Greg Wilkins <gregw@webtide.com> CTO http://webtide.com