[jsr369-experts] Thread Safety Guarantees and Requirements (was: Re: [servlet-spec users] Re: Clarification of threading requirements of section 6.2.3)

From: Edward Burns <edward.burns_at_oracle.com>
Date: Fri, 30 Jan 2015 14:45:53 -0800

This email is a recap of the disssion on this thread. After this
thread, we need to followup and do more work along the lines proposed by
Greg in his lightweight taxonomy (tagged in this email as GREG'S
LIGHTWEIGHT TAXONOMY)

>>>>> On Mon, 08 Dec 2014 12:09:16 +0000, Mark Thomas <markt_at_apache.org> said:

[...]

MT> To some extent I am playing devil's advocate here. The original bug
MT> report has highlighted a lack of clarity within the servlet spec about
MT> when it is legal to switch to async and when it isn't. It is my position
MT> that this must be addressed in the next revision of the servlet spec.

MT> I do not have strong views on exactly how it is clarified but my
MT> starting position is that we should not limit when it is legal to switch
MT> to async without a good reason.

>>>>> On Tue, 9 Dec 2014 19:08:37 +0100, Greg Wilkins <gregw_at_intalio.com> said:

GW> My point is that we have got away with this vague spec so far because we
GW> basically say that the request/response objects are not thread safe and if
GW> you want thread safety for them then use them only from a container managed
GW> thread. We have made an exception to that for the input/output streams.

GW> If we allow async threads to call doFilter, then we are essentially saying
GW> that all APIs can be called by an async thread. So we have a lot of work
GW> do to to clarify the multi-threaded behaviour of the whole API which has
GW> previously just been declared thread unsafe.

>>>>> On Wed, 10 Dec 2014 10:20:23 +0900, Eugene Chung() <euigeun_chung_at_tmax.co.kr> said:

SD> I guess my main concern here is that there is not really any way to hand
SD> off to another thread so that only one thread can be using the
SD> request/response at a time.
SD> Basically I am talking about some kind of AsyncContext.delayStart()
SD> construct that will not dispatch the task until the current call stack has
SD> returned to the container (or just change the semantics of start() to
SD> match). This will make sure that only one thread is using the request and
SD> response, without needing to resort to synchronization or other thread
SD> safety constructs.

EC> +1

>>>>> On Fri, 12 Dec 2014 12:38:57 +0100, Greg Wilkins <gregw_at_intalio.com> said:

GW> Stuart,

GW> perhaps we just need another listener or another callback on the
GW> ServletRequestListener that gives a callback when the dispatch has returned
GW> to the container and the request is truly suspended?

GW> We already have onWritePossible and onDataAvailable providing this semantic
GW> if you are doing asyncIO, so we just need a callback if you are doing pure
GW> async.

GW> Apps can then start their async processing from this callback.

>>>>> On Sun, 14 Dec 2014 17:09:22 -0500 (EST), Stuart Douglas <sdouglas_at_redhat.com> said:

SD> This sounds reasonable to me, assuming that we can use a default
SD> method on the ServletRequestListener to maintain backwards
SD> compatibility (although this will obviously only work with
SD> JDK8). I'm also not 100% convinced that ServletRequestListener is
SD> the best place for this, AsyncListener seems like it would make more
SD> sense, as this callback is only really relevant for async requests.

GW> But note that it is still not safe to call many request/response
GW> methods.

As someone who is new to this expert group, such broad statements are
appealing to me because they are understandable without having to parse
out such detailed state.

GW> What should getContextPath return when the request is not in a
GW> context and the context that suspended it may not be the one
GW> indicated by the URI?

>>>>> On Sun, 14 Dec 2014 17:09:22 -0500 (EST), Stuart Douglas <sdouglas_at_redhat.com> said:

SD> I would assume it would return whatever it would have returned just
SD> before the request returned to the container, and would only change
SD> if the user called dispatch().

EC> For AsyncContext.dispatch methods, there is the explanation.

EC> " .... If this method is called before the container-initiated dispatch
EC> that called startAsync has returned to the container, the dispatch
EC> operation will be delayed until after the container-initiated dispatch has
EC> returned to the container. ..... "

EC> I also think AsyncContext.start() should have same requirement.

>>>>> On Wed, 10 Dec 2014 10:04:32 +0000, Mark Thomas <markt_at_apache.org> said:

MT> I have no objections to that. It removes a lot of potential complexity
MT> around thread-safety between the container thread and the application
MT> thread.

>>>>> On Thu, 18 Dec 2014 19:23:20 -0800, Shing Wai Chan <shing.wai.chan_at_oracle.com> said:

SWC> Let me try to summarize the discussion.
SWC> 1. There is a concern on whether FilterChain#doFilter can be called by a
SWC> thread other than the container invocation thread.
SWC> Similar question for RequestDispatcher#dispatch.
SWC> 2. What is the return values of #getContextPath, #getServletPath,
SWC> #getPathInfo in case of async?

SWC> For (1),
SWC> From Servlet 3.1: A Filter and the target servlet or resource at the
SWC> end of the filter chain must execute in the same invocation thread.
SWC> This implicitly implies that FilterChain#doFilter should not be called
SWC> by a thread other than the container invocation thread.
SWC> We would like to make this more explicit in the spec by adding the
SWC> following in javadoc of FilterChain#doFilter

SWC> @throw IllegalStateException if it is invoked by a thread other
SWC> than the container invocation thread

PROPOSED CHANGE ^^^^^^^^^^^^^^^^^^^^^^^^

>>>>> On Wed, 31 Dec 2014 17:17:59 +0100, Greg Wilkins <gregw_at_intalio.com> said:

GW> Shing Wai, that is a reasonable summary.

>>>>> On Fri, 19 Dec 2014 09:05:05 +0000, Mark Thomas <markt_at_apache.org> said:

MT> I agree that this would make the specification clearer in this area.

GW> I agree the ISE is a good addition.

MT> However, I don't see why we have this restriction in the first
MT> place.

I expect it's because the EG felt we couldn't guarantee the correctness
of the programming model if we allowed it. In other words, we didn't
want to sign up to have to explain when it would really be safe to do
this, so we just said it's not allowed at all. I wasn't there at the
time, so I am just guessing.

MT> As I have stated elsewhere on this thread, no explanation was given
MT> when this restriction was introduced and right now I don't see why
MT> we need it for sync or async.

>>>>> On Wed, 31 Dec 2014 17:38:19 +0100, Greg Wilkins <gregw_at_intalio.com> said:

GW> I agree that this issue is really independent of sync or async.

GW> However, I do not believe it was a restriction that was added - only
GW> a clarification of the expected behaviour. The reasons I believe
GW> that included:

GW> - There is an expectation that exceptions thrown by a
GW> servlet/filter may be caught by an upstream filter or servlet
GW> using a RequestDispatcher. Allowing arbitrary threads to call
GW> into doFilter will break this.

MT> That expectation is going to break whether async is started in the
MT> servlet or a filter.

GW> - ThreadLocals are frequently use by authentication and
GW> authorisation mechanisms, specially when invoking JEE components.
GW> Allowing the filter chain to not be scoped by a thread will break
GW> all usage of thread locals.

The proposed looseness is very problematic for ThreadLocals

MT> ThreadLocals are already broken as soon as async moves to a new thread.
MT> This just 'enables' them to be broken earlier in the processing chain.

GW> - Code within a filter/servlet chain can expect to be single
GW> threaded with respect to a particular request. ie while there may
GW> be many threads within a filter/servlet, there will only be a
GW> single thread per request. Allowing arbitrary threads to call
GW> doFilter would allows for the possibility of a request being
GW> dispatched multiple times.

MT> An application could do this now by calling Servlet.service() from
MT> within an async thread. We already have the possibility of a request
MT> being dispatched multiple times.

That seems like a contrived example, since by convention only the
container calls service(), even though it's just another method and
*could* be called by application code.

GW> - If a new thread calls doFilter before the dispatched thread
GW> returns to the container, then the values of methods like
GW> getPathInfo will be different for each calling thread.

MT> That depends on how the behaviour of getPathInfo() is defined for async
MT> doesn't it?

GW> - Such a pattern does not scale to multiple uses. If filter A
GW> used an new thread to call filter B which used a new thread to
GW> call filter C, then what can filter A assume when it's new thread
GW> returns? Is the request handled and the response generated? It
GW> does not know because that depends on thread created by B. To
GW> resolve such issues, we would need to create a whole bunch of
GW> listeners, that we pretty much have already for normally
GW> dispatched requests.

Right. Essentially it's too hard to guarantee correctness in light of
all that could be done, and if you can't guarantee correctness, we
should not allow it, even if it could be useful in some cases.

I'm reminded of this text from Brian Goetz's well regarded Java
Concurrency in Practice:

  4.5.1 Interpreting Vagua Documentation

  Many Java technology specifications are silent or at least
  unforthcoming about thread safety guarantees and requirements for
  interfaces such as ServletContext, HttpSession, or DataSource [9].
  Since these interfaces are implemented by your container...

  [9] We find it particularly frustrating that these omissions persist
  despite multiple major revisions of the specifications.

When you have a popular print book calling you out on the carpet, it
really is time to take action.

MT> The original use case [1] that triggered this thread looks reasonable
MT> and I'd like to support it if possible. The clarification proposed above
MT> would not allow the use case to be supported.

GW> I'm not so sure it is a valid use case. The issue is that a different
GW> threading paradigm (fibers from quasar) want to be used to invoke
GW> servlets. I think changing the very nature of the threading of a servlet
GW> container has to be something done with the involvement of the container
GW> and not as a filter in an application.

GW> If the application wishes to use fibers to generate a response, then they
GW> are free to create a framework to do so that writes (async or blocking) to
GW> the servlet response. But I don't see why it is reasonable that such a
GW> new/different threading paradigm should expect to be able to takeover the
GW> dispatch and handling model of the servlet container. Specially if
GW> authentication and authorisation are involved.

I defer to Greg's judgement here, but generally I think we should be
conservative when judging whether a questionable use of the spec is
allowable or not.

MT> Regarding 2) my current thinking is that the return values are as they
MT> were for the request/response at the point async was started and remain
MT> unchanged until the async request is dispatched at which point they
MT> change to reflect the target of the dispatch. It is TBD if some
MT> mechanism is provided along the lines of 9.4.2 to expose some of the
MT> original (remembering there may be multiple async/dispatch sequences)
MT> values.
MT>

Doesn't this amount to us publishing information from one thread to
another?

GW> That is a big change in the current semantic. Currently after a
GW> doFilter/forward/include the forwarding/including filter/servlet can
GW> expect to be able to access those methods and see the values that
GW> were there before the dispatch. I have definitely seen application
GW> code that accesses the request methods on the way out that will
GW> break if a call to startAsync suddenly freezes those values.

Right, changing these semantics would be confusing and error prone.

MT> While I don't agree with many of the points you have made above, you
MT> have convinced me that this particular use case [1] isn't a valid
MT> one. Absent a valid use case, I am happy with the clarification
MT> that an ISE should be thrown.

For reference, this is flagged as PROPOSED CHANGE above.

MT> However, many of the points you raise above are not unique to calling
MT> FilterChain.doFilter() from an async thread and apply equally to calling
MT> Servlet.service(). Do we want to add a similar restriction (although I
MT> don't think the container could enforce it) that Servlet.service() must
MT> only be called by the 'container invocation thread'?

Sure, we can do that.

>>>>> On Thu, 01 Jan 2015 11:40:43 +0000, Mark Thomas <markt_at_apache.org> said:

MR> Agreed. I only wonder if the text from the last sentence of 6.2.3 should
MR> not be moved/repeated elsewhere as it is fairly fundamental and seems a
MR> little buried in the weeds at the moment.

MT>

>>>>> On Thu, 1 Jan 2015 11:50:23 +0100, Greg Wilkins <gregw_at_intalio.com> said:

GW> But rather than try to restrict what threads can call service, let's just
GW> define what APIs are valid for what threads.

GW> We essentially have 3 classes of calling environments: Container
GW> dispatched threads, Container anointed thread (ones within
GW> AsyncContext.start()) and other threads. We need to define what
GW> methods can be called from which contexts.

Yes, this is the sort of work that needs to be done to clear our name in
Brian Goetz's book.

GW> Basically we need to go through the API and work out which methods
GW> are safe to call from an async thread (eg. Request attributes,
GW> Response setHeader, flushBuffer, getOutputStream, etc.); which can
GW> be called from an anointed thread (EJBs?, JNDI?); and which are only
GW> valid from a container dispatched thread (eg request
GW> getServletContext, getRequestDispatcher etc.)

GREG'S LIGHTWEIGHT TAXONOMY ^^^^^^^^^^^^^^^^

I like using a lightweight taxonomy to help complete this work. To that
end, I have created a task on the Adopt-a-JSR tab [2]. Let's see if
anyone picks it up.

Ed

-- 
| edward.burns_at_oracle.com | office: +1 407 458 0017
| 26 days til DevNexus 2015
| 36 days til JavaLand 2015
| 46 days til CONFESS 2015
[1] https://issues.apache.org/bugzilla/show_bug.cgi?id=57284
[2] https://jcp.org/en/egc/view?id=369