[jsr369-experts] Re: [servlet-spec users] [SPEC-163-getContextPath] DISCUSSION

From: Mark Thomas <markt_at_apache.org>
Date: Tue, 13 Sep 2016 10:09:22 +0100

On 13/09/2016 02:05, Greg Wilkins wrote:

> Jetty was and *is for now* returning decoded paths for
> * ServletContext#getContextPath()
> * HttpServletRequest#getContextPath()
> * HttpServletRequest#getServletPath()
> * HttpServletRequest#getPathInfo()
> Jetty was and *is for now* returning encoded paths for
> * HttpServletRequest#getRequestURI()
> * HttpServletRequest#getRequestURL()

Tomcat is largely the same with the following differences / clarifications:

- HttpServletRequest#getContextPath() is encoded
- path parameters are stripped from any decoded values
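
For readers unfamiliar with path parameters, here is a minimal sketch of what "stripped" means. It is an illustration only, not Tomcat's implementation, and the class and method names are hypothetical. It assumes the stripping is applied to the raw path before percent-decoding, so that an encoded semicolon (%3B) is not mistaken for a parameter delimiter.

```java
// Illustration only: hypothetical helper, not taken from Tomcat or Jetty.
class PathParams {

    // Remove ";name=value" style parameters from every segment of a raw path.
    // Everything from a ';' up to the next '/' (or the end of the path) is dropped.
    static String stripPathParameters(String rawPath) {
        return rawPath.replaceAll(";[^/]*", "");
    }

    public static void main(String[] args) {
        String raw = "/app;v=1/servlet;jsessionid=ABC/info";
        System.out.println(stripPathParameters(raw));
    }
}
```

With this input the helper yields /app/servlet/info, which is the form a decoded getter would then work from.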

> GW> Luckily such encoded context paths are very rare!
> GW> However, we would like to be compliant, so we are fixing this,
> GW> but I'd just like to check regarding ServletContext.getContextPath(),
> GW> as its javadoc does not mention if it is encoded or not. I would
> GW> assume that it is?
> Indeed it does not, and given that the rest of the text is verbatim the
> same as HttpServletRequest.getContextPath(), I agree it probably
> should. I have filed SPEC-163 for this.
> I definitely believe that ServletContext#getContextPath() and
> HttpServletRequest#getContextPath() should be consistent in what they
> return.
> The question is, what should that be and how should it be specified.
> If they are both to return decoded paths, then it does not make sense to
> say "does not decode" in the ServletContext version, as it was probably
> never encoded in the first place. It would have to say "does encode"
> or similar to make sense.
> However, I think it is not reasonable for getContextPath to return an
> encoded URI in either case. More specifically, I believe there are
> security ramifications of returning non-decoded, non-normalized
> user-supplied data from HttpServletRequest#getContextPath().

Huge +1.

The most notable problem we had in Tomcat when we started providing the
original, undecoded, unnormalized value for
HttpServletRequest#getContextPath() was with applications that were
using this to make security decisions. Those applications were assuming
that a decoded, normalized value would be returned and they broke badly
(i.e. the security constraints were bypassed) when Tomcat started
following the spec.

They fixed this by switching to ServletContext#getContextPath().
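
A minimal sketch of one way such a check can go wrong. This is a hypothetical application pattern, not code from any of the affected applications, and every name in it is an assumption. The application strips the context path from a decoded request path and then tests the remainder; if the context path it is handed is the raw, undecoded value, the offsets no longer line up.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustration only: hypothetical application code, not Tomcat's or Jetty's.
class ContextPathCheck {

    // Decode percent-escapes, then collapse "." and ".." segments.
    static String decodeAndNormalize(String rawPath) {
        String decoded = URI.create(rawPath).getPath(); // getPath() decodes escapes
        try {
            return new URI(null, null, decoded, null).normalize().getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // The application assumes contextPath is a prefix of the decoded request path.
    static String pathWithinApp(String contextPath, String decodedRequestPath) {
        return decodedRequestPath.substring(contextPath.length());
    }

    // Naive security decision: everything under /admin needs authentication.
    static boolean requiresAuth(String pathWithinApp) {
        return pathWithinApp.startsWith("/admin");
    }

    public static void main(String[] args) {
        String rawContext = "/%61pp"; // as sent by the client; decodes to "/app"
        String requestPath = decodeAndNormalize("/%61pp/admin/users");

        // Raw context path: the wrong number of characters is cut off.
        System.out.println(requiresAuth(pathWithinApp(rawContext, requestPath)));

        // Decoded, normalized context path: the check behaves as intended.
        String decodedContext = decodeAndNormalize(rawContext);
        System.out.println(requiresAuth(pathWithinApp(decodedContext, requestPath)));
    }
}
```

With the raw value the remainder becomes "dmin/users" and the check is skipped; with the decoded value it is "/admin/users" and the check applies.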


> The only spec change Greg is requesting is to make the text in
> ServletContext.getContextPath() be the same as in
> HttpServletRequest.getContextPath() by including the statement: "The
> container does not decode this string." I have captured just this bit
> in SPEC-163. This change will not lead to any breakage.

Adding "The container does not decode this string." to
ServletContext.getContextPath() would lead to massive breakage.

> Not exactly:
> * I initially asked for clarity and consistency on
> ServletContext#getContextPath()
> * Then when subsequently I realized the full horror of doing the match
> on encoded paths I advocated that we
> change HttpServletRequest#getContextPath() to return the decoded path.
> * Finally I noted that if the getContextPath() methods do return the
> encoded form, we have no method to get the decoded contextPath of a
> request. Furthermore, I do not trust applications to correctly
> handle the full nastiness of decoding/normalization.
> Well, here's another fine mess we've gotten ourselves into :(

I don't see a use case for HttpServletRequest#getContextPath() to be the
original undecoded, unnormalized value.

Also, I don't see how we can change this to be decoded without creating
all sorts of problems, including security issues.

Is it reasonable to simply deprecate HttpServletRequest#getContextPath()
and point users towards
HttpServletRequest#getServletContext()#getContextPath() instead? Could
we go further and, as part of the deprecation, have
HttpServletRequest#getContextPath() always return null? There would be
breakage but it would be obvious.
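
To make the shape of that proposal concrete, here is a rough sketch. SketchContext and SketchRequest are hypothetical stand-ins for ServletContext and HttpServletRequest, not the real interfaces, and nothing here is proposed spec text.

```java
// Hypothetical stand-ins for the servlet interfaces; not the real API.
interface SketchContext {
    String getContextPath();
}

interface SketchRequest {
    SketchContext getServletContext();

    /** @deprecated use getServletContext().getContextPath() instead. */
    @Deprecated
    default String getContextPath() {
        return null; // the "obvious breakage" variant: callers fail fast
    }
}

class DeprecationSketch {
    public static void main(String[] args) {
        SketchContext context = () -> "/app";
        SketchRequest request = () -> context;

        // Old call site: now yields null instead of a possibly encoded value.
        System.out.println(request.getContextPath());

        // Suggested replacement: the context's own path.
        System.out.println(request.getServletContext().getContextPath());
    }
}
```

A caller still using the deprecated method gets a compiler warning and a null at runtime, rather than a value whose encoding it may misjudge.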