jsr369-experts@servlet-spec.java.net

[jsr369-experts] Re: Allow encoding to be set per web-app and per container

From: Stuart Douglas <sdouglas_at_redhat.com>
Date: Mon, 20 Feb 2017 07:51:47 +1100

Sounds reasonable to me.

Stuart

On Sat, Feb 18, 2017 at 7:42 AM, Shing Wai Chan
<shing.wai.chan_at_oracle.com> wrote:
> I am resuming the discussion from September regarding adding
> request/response encoding in web.xml. [1] [2]
>
> Since the default encoding for HTML5 is UTF-8, it would be good to
> - provide a way to configure the request/response encoding in a web application.
> - have the ability to configure the servlet container in a container-specific way.
>
> I propose the following changes:
> - add <request-encoding> to web.xml schema
> A sample usage is as follows:
> <request-encoding>UTF-8</request-encoding>
>
> - add <response-encoding> to web.xml schema
>
> - update javadoc for ServletRequest#getCharacterEncoding() as follows:
> old:
> This method returns null if the request does not specify a character encoding
> new:
> This method returns null if no request character encoding has been
> specified. The following methods for specifying the
> request character encoding are consulted, in decreasing order of
> priority: per request, per web app (using deployment descriptor), and
> per container (using vendor specific configuration).
>
>
> - update javadoc for ServletResponse
> old:
> The charset for the MIME body response can be specified explicitly using the
> setCharacterEncoding(java.lang.String) and setContentType(java.lang.String) methods,
> or implicitly using the setLocale(java.util.Locale) method.
> Explicit specifications take precedence over implicit specifications.
> If no charset is specified, ISO-8859-1 will be used.
> new:
> The charset for the MIME body of the response can be specified
> using any of the following techniques: per request, per web-app (using
> deployment descriptor), and per container (using vendor specific configuration).
> If more than one of the preceding techniques has been employed, the priority is
> the order listed.
> For per request, the charset for the response can be specified explicitly using the
> setCharacterEncoding(java.lang.String) and setContentType(java.lang.String) methods,
> or implicitly using the setLocale(java.util.Locale) method.
> Explicit specifications take precedence over implicit specifications.
> If no charset is explicitly specified, ISO-8859-1 will be used.
>
> - update javadoc for ServletResponse#getCharacterEncoding() as follows:
> old:
> The character encoding may have been specified explicitly using
> the setCharacterEncoding(java.lang.String) or setContentType(java.lang.String) methods,
> or implicitly using the setLocale(java.util.Locale) method.
> Explicit specifications take precedence over implicit specifications.
> new:
> The following methods for specifying the response character encoding are
> consulted, in decreasing order of priority: per request, per web-app (using
> deployment descriptor), and per container (using vendor specific configuration).
> The first one of these methods that yields a result is returned.
> Per-request, the charset for the response can be specified explicitly using the
> setCharacterEncoding(java.lang.String) and setContentType(java.lang.String) methods,
> or implicitly using the setLocale(java.util.Locale) method.
> Explicit specifications take precedence over implicit specifications.
>
> - update javadoc for ServletResponse#setCharacterEncoding() as follows:
> old:
> If the character encoding has already been set by setContentType(java.lang.String) or
> setLocale(java.util.Locale), this method overrides it.
> new:
> If the response character encoding has already been set by the
> deployment descriptor, or using the setContentType() or setLocale()
> methods, the value set in this method overrides any of those values.
>
>
> - update section 3.11 of the spec
> old:
> The default encoding of a request the container uses to create the request reader and
> parse POST data must be “ISO-8859-1” if none has been specified by the client request.
> However, in order to indicate to the developer, in this case, the failure of the client
> to send a character encoding, the container returns null from the getCharacterEncoding method.
>
> If the client hasn’t set character encoding and the request data is encoded with a
> different encoding than the default as described above, breakage can occur.
> To remedy this situation, a new method setCharacterEncoding(String enc) has been added
> to the ServletRequest interface. Developers can override the character encoding supplied by
> the container by calling this method. It must be called prior to parsing any post data or
> reading any input from the request. Calling this method once data has been read will not
> affect the encoding.
> new:
> The default encoding of a request the container uses to create the request reader and
> parse POST data must be “ISO-8859-1” if none has been specified by the client request,
> deployment descriptor or per container using vendor specific configuration.
> However, in order to indicate to the developer, in this case, the failure of the client
> to send a character encoding, the container returns null from the getCharacterEncoding method.
>
> If the client hasn’t set character encoding and the request data is encoded with a
> different encoding than the default as described above, breakage can occur.
> To remedy this situation, the <request-encoding> element is available in the web.xml and
> the setCharacterEncoding(String enc) method is available on the ServletRequest interface.
> Developers can override the character encoding supplied by the container by
> adding the element or calling the method. The method must be called prior to
> parsing any POST data or reading any input from the request. Calling this method
> once data has been read will not affect the encoding.
>
> - update section 5.5 of the spec
> old:
> If the element does not exist or does not provide a mapping, setLocale uses a container dependent mapping.
>
> new:
> The <response-encoding> element can be used to explicitly set the
> encoding for all responses.
> <response-encoding>UTF-8</response-encoding>
> If neither element exists, or no mapping is provided, setLocale uses a container dependent mapping.
>
> Please let me know your comments.
>
> Shing Wai Chan
>
> [1] https://java.net/jira/browse/SERVLET_SPEC-161
> [2] https://java.net/projects/servlet-spec/lists/jsr369-experts/archive/2016-09/message/26
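
For illustration only (this is not part of the proposal text): the request-side lookup order described in the quoted message could be sketched in Java roughly as below. The class and method names are hypothetical, and the three String arguments are stand-ins for whatever a container would really consult (the charset sent with the request or set via setCharacterEncoding, the <request-encoding> value from web.xml, and a vendor-specific container default). The sketch assumes that getCharacterEncoding() reports the first non-null source and that ISO-8859-1 is only the fallback used for decoding when all three are absent.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not a real Servlet API class.
final class EncodingResolutionSketch {

    /**
     * Mirrors the proposed getCharacterEncoding() wording: sources are
     * consulted in decreasing priority (per request, per web app, per
     * container) and null is returned when none of them is set.
     */
    public static String reportedEncoding(String perRequest,
                                          String perWebApp,
                                          String perContainer) {
        if (perRequest != null) {
            return perRequest;
        }
        if (perWebApp != null) {
            return perWebApp;
        }
        return perContainer; // may still be null
    }

    /**
     * Charset a container would use to build the reader and parse POST
     * data under the proposed section 3.11 text: the reported encoding
     * if there is one, otherwise ISO-8859-1.
     */
    public static Charset effectiveCharset(String perRequest,
                                           String perWebApp,
                                           String perContainer) {
        String reported = reportedEncoding(perRequest, perWebApp, perContainer);
        return reported != null
                ? Charset.forName(reported)
                : StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        // web.xml value wins over the container default
        System.out.println(reportedEncoding(null, "UTF-8", "ISO-8859-15"));
        // nothing configured: null is reported, ISO-8859-1 is used to decode
        System.out.println(reportedEncoding(null, null, null));
        System.out.println(effectiveCharset(null, null, null));
    }
}
```

The point of keeping the two methods separate is that the reported value can be null while the charset actually used for decoding is never null.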