jsr369-experts@servlet-spec.java.net

[jsr369-experts] Re: [servlet-spec users] [SPEC-163-getContextPath] DISCUSSION (was: Re: encoded context path!?!?)

From: Greg Wilkins <gregw_at_webtide.com>
Date: Tue, 13 Sep 2016 11:05:47 +1000

On 13 September 2016 at 09:07, Edward Burns <edward.burns_at_oracle.com> wrote:

> >>>>> On Thu, 8 Sep 2016 15:24:59 +1000, Greg Wilkins <gregw_at_webtide.com>
> said:
>
> GW> According to the javadoc
> GW> <http://docs.oracle.com/javaee/6/api/javax/servlet/
> http/HttpServletRequest.html#getContextPath()>,
> GW> the container does not decode the string, so it would not return
> "/context
> GW> path", rather it should return "/content%20path", or even "
> GW> /c%6Fntent%20path" if that is what the client sent!
>
> Considering the fundamental nature of this one, I want to be sure I am
> understanding you correctly. Greg, prior to your discovery of this
> issue, Jetty *was* performing decoding before returning the result from
> HttpServletRequest.getContextPath(). Is that correct?
>


Jetty was and *is for now* returning decoded paths for

   - ServletContext#getContextPath()
   - HttpServletRequest#getContextPath()
   - HttpServletRequest#getServletPath()
   - HttpServletRequest#getPathInfo()

Jetty was and *is for now* returning encoded paths for

   - HttpServletRequest#getRequestURI()
   - HttpServletRequest#getRequestURL()




> GW> Luckily such encoded context paths are very rare!
>
> GW> However, we would like to be compliant, so we are fixing this, but I'd
> just
> GW> like to check regarding ServletContext.getContextPath(), as its
> javadoc
> GW> does not mention if it is encoded or not. I would assume that it
> GW> is?
>
> Indeed it does not, and given that the rest of the text is verbatim the
> same as HttpServletRequest.getContextPath(), I agree it probably
> should. I have filed SPEC-163 for this.
>
>
I definitely believe that ServletContext#getContextPath() and
HttpServletRequest#getContextPath() should be consistent in what they
return.

The question is, what should that be and how should it be specified.

If they are both to return decoded paths, then it does not make sense to
say "does not decode" in the ServletContext version, as it was probably
never encoded in the first place. It would have to say "does encoded" or
similar to make sense.

However, I think it is a not reasonable for getContextPath to return
encoded URI in either cases, more specifically I believe there are security
ramifications of returning non decoded non normalized user supplied data
from HttpServletRequest#getContextPath().

If it really has to return an encoded path, it should be the containers
preferred encoding or a normalized encoding rather than whatever attack
vector the client has supplied in an attempt to trick any application that
uses the contextPath




> >>>>> On Thu, 8 Sep 2016 17:01:08 +1000, Stuart Douglas <
> sdouglas_at_redhat.com> said:
>
> SD> Looks like Undertow also has the same issue. If it turns out that
> SD> everyone has implemented it the same (wrong) way maybe we should look
> SD> at changing the spec instead.
>
> I looked at what we do in GlassFish and we just return what was set, no
> decoding is performed.
>
>
But set from what? Set from the contextPath matching? If so that is done
after decoding and normalization, so it will be a decoded path. It is
really really difficult to get the encoded part of the path that matches a
context path - it will be a lot complex code and hard to miss. Consider
the example:

   /good/context/.//..%2F..//./%45%56%49%4C/context/servlet/info

If the contexts are /good/context and /EVIL/context, then this should match
match /EVIL/context as the context path. The spec as currently written
wants /good/context/.//..%2F..//./%45%56%49%4C/context returned as the
context path. The code to do that is rather complex!

If context matching is done during URI decoding, then there is also a
significant risk that such a URI would match a context of /good unless
double decoding is done: once to do the match and then again to determine
which part of the URI actually matched the given context path.



>
> The only spec change Greg is requesting is to make the text in
> ServletContext.getContextPath() be the same as in
> HttpServletRequest.getContextPath() by including the statement: "The
> container does not decode this string." I have captured just this bit
> in SPEC-163. This change will not lead to any breakage.
>
>
Not exactly:

   - I'm initially asked for clarity and consistency on
   ServletContext#getContextPath()
   - Then when subsequently I realized the full horror of doing the match
   on encoded paths I advocated that we
   change HttpServletRequest#getContextPath() to return the decoded path.
   - Finally I noted that if the getContextPath() methods do return the
   encoded form, we have no method to get the decoded contextPath of a
   request. Furthermore that I do not trust applications to correctly handle
   the full nastiness of decoding/normalization.


Well , here's another fine mess we've gotten ourselves into :(


-- 
Greg Wilkins <gregw@webtide.com> CTO http://webtide.com