[jsr369-experts] Re: [servlet-spec users] Re: [SPEC-163-getContextPath] DISCUSSION

From: Greg Wilkins <gregw_at_webtide.com>
Date: Wed, 14 Sep 2016 09:41:05 +1000

On 13 September 2016 at 19:09, Mark Thomas <markt_at_apache.org> wrote:

> I don't see a use case for HttpServletRequest#getContextPath() to be the
> original undecoded, unnormalized value.
+1

> Also, I don't see how we can change this to be decoded without creating
> all sorts of problems, including security issues.
Exactly, and for that reason I don't believe that Jetty is likely to change
to support the spec as written. We need a compromise: either a change in
the spec, or Jetty will unilaterally do something (see below) to get us
closer, but without the security problem.

> Is it reasonable to simply deprecate HttpServletRequest#getContextPath()
> and point users towards
> HttpServletRequest#getServletContext()#getContextPath() instead? Could
> we go further and, as part of the deprecation, have
> HttpServletRequest#getContextPath() always return null? There would be
> breakage but it would be obvious.

There are two aspects of the spec as currently written:

   1. The context path returned is encoded (or, more precisely, not decoded).
   2. It is implied, but not explicit, that the encoding returned is that
   provided by the client, which may be non-normalized, variously encoded and
   may contain path parameters.

I see no good reason for 1., but since the spec is written that way and
99.9999% of contexts don't need encoding, I don't think it is a big problem
to support. So we can return it encoded.

It is 2. that is a HUGE -1 for us, as it is both difficult and dangerous.
Hearing Tomcat's experience just confirms that we really should not do this
in Jetty. I think the spec is ambiguous enough to give us some wiggle
room to improve this method with minimal breakage. How about:

The context path returned is encoded with the container's preferred encoding
for the context portion of the request URI.

This keeps the currently specified encoded return value for the method, but
it gives the container's version rather than the unconstrained user-provided
version. This still allows for a context to have multiple context paths;
such a container will just have to have multiple preferred encodings
(or re-encode on the fly).
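
As a rough illustration of the proposal above (not Jetty's or Tomcat's actual
implementation), the container could canonicalize the raw request path, match
it against the decoded context paths it has deployed, and return its own
encoding of the match. The class and method names here are hypothetical, and
the sketch assumes UTF-8 and ignores cases a real container must handle, such
as encoded slashes and malformed escapes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, for illustration only.
class PreferredContextPath {

    // Percent-decode one path segment as UTF-8. Unlike URLDecoder, '+' is left alone.
    // Throws NumberFormatException on a malformed escape such as "%zz".
    static String decode(String raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char ch = raw.charAt(i);
            if (ch == '%' && i + 2 < raw.length()) {
                buf.write(Integer.parseInt(raw.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                byte[] bytes = String.valueOf(ch).getBytes(StandardCharsets.UTF_8);
                buf.write(bytes, 0, bytes.length);
            }
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // The container's preferred encoding: escape everything except
    // RFC 3986 unreserved characters and the '/' separator.
    static String encode(String decodedPath) {
        StringBuilder out = new StringBuilder();
        for (byte b : decodedPath.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
            if (safe) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    // Strip path parameters, decode each segment, then resolve "." and "..".
    static String canonical(String rawPath) {
        Deque<String> stack = new ArrayDeque<>();
        for (String seg : rawPath.split("/")) {
            int semi = seg.indexOf(';');
            if (semi >= 0) {
                seg = seg.substring(0, semi);
            }
            seg = decode(seg);
            if (seg.isEmpty() || seg.equals(".")) {
                continue;
            }
            if (seg.equals("..")) {
                if (!stack.isEmpty()) {
                    stack.removeLast();
                }
                continue;
            }
            stack.addLast(seg);
        }
        return "/" + String.join("/", stack);
    }

    // Returns the container's encoding of the longest deployed (decoded) context
    // path that matches the request, or "" for the root context.
    static String contextPathFor(String rawRequestPath, List<String> deployedContextPaths) {
        String path = canonical(rawRequestPath);
        String best = "";
        for (String ctx : deployedContextPaths) {
            boolean matches = path.equals(ctx) || path.startsWith(ctx + "/");
            if (matches && ctx.length() > best.length()) {
                best = ctx;
            }
        }
        return encode(best);
    }
}
```

With a context deployed at "/my app", the requests "/my%20app/index.html",
"/%6dy%20app/x" and "/foo/../my%20app;jsessionid=123/x" would all yield
"/my%20app" under this sketch, regardless of how the client chose to encode
the path.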

The vast majority of webapps will see no difference, as their context path
does not change when encoded. No security hole is opened up (rather, the
existing one is closed), as unconstrained user bytes are no longer provided
to the application. The implementation is not difficult.

I can't think of a reasonable webapp that will break with this change -
what webapp would depend on seeing different encodings of the same
context path provided by the client?


Greg Wilkins <gregw@webtide.com> CTO http://webtide.com