jsr369-experts@servlet-spec.java.net

[jsr369-experts] Re: [servlet-spec users] [146-URIEncoding] DISCUSSION

From: Greg Wilkins <gregw_at_webtide.com>
Date: Thu, 30 Mar 2017 10:47:20 +1100

On 29 March 2017 at 19:22, Edward Burns <edward.burns_at_oracle.com> wrote:

>
> I looked at the HTTP/1.1 and 2.0 RFCs and the URI RFC and didn't see any
> text about this. Can you please say more about what you mean here?
>

If the RFCs were precise about character encoding of URIs then we wouldn't
be in such a mess :)
I have definitely had to deal with URIs sent in utf-8 with JIS % encodings
- but thankfully not for a while.


> Regardless of the answers to the above questions, I suggest we simplify
> this and do not support configuring different encodings for the
> different parts of the request: path, query string, and body.


On reflection I agree. While I have seen such mixed encodings, I'm sure
they are dubious against some RFC or other. Applications that wish to
support mixed encodings are free not to set a default encoding and to
continue using whatever container-specific behaviour they currently rely on.

I broadly support your text with the exception below:



> I propose we implement this suggestion as follows.
>
> * Modify request-character-encoding element in
> javaee8/src/web-app_4_0.xsds to be:
>
> <xsd:element name="request-character-encoding" type="javaee:string">
> <xsd:annotation>
> <xsd:documentation>
>
> When specified, this element provides a default request
> character encoding of the web application. This request
> character encoding value pertains to all aspects of reading
> octets from the request, including but not limited to, the
> URI path, the query string, and any request body content.
>
>
I don't think "pertains to all aspects of reading octets from the request"
is that clear, as it could be interpreted to cover reading the request
itself from the network protocol. For example, an application cannot set an
EBCDIC encoding and expect clients to be able to send HTTP methods and
headers so encoded.

How about "pertains to all conversion to strings of octets transported by
the request that do not have an otherwise known encoding"?
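
To illustrate what that wording is meant to cover, here is a minimal sketch
in plain Java. It is not container code: the class OctetDecodingSketch and
the helpers percentDecode and toText are hypothetical names made up for this
example. It assumes the raw path arrives as ASCII with %XX escapes, and that
the configured default is consulted only when no other encoding is known for
the octets.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class OctetDecodingSketch {

    // Hypothetical helper: undo %XX escapes to recover the raw octets.
    // Assumes the input is ASCII, as it would be on the wire.
    public static byte[] percentDecode(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '%') {
                out.write(c);
                continue;
            }
            if (i + 2 >= raw.length()) {
                throw new IllegalArgumentException("truncated escape at " + i);
            }
            int hi = Character.digit(raw.charAt(i + 1), 16);
            int lo = Character.digit(raw.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("bad escape at " + i);
            }
            out.write((hi << 4) | lo);
            i += 2; // skip the two hex digits just consumed
        }
        return out.toByteArray();
    }

    // Hypothetical helper: use the known encoding if there is one,
    // otherwise fall back to the application's default.
    public static String toText(byte[] octets, Charset known, Charset dflt) {
        return new String(octets, known != null ? known : dflt);
    }

    public static void main(String[] args) {
        byte[] octets = percentDecode("/caf%C3%A9");

        // Same octets, no otherwise known encoding, two different defaults.
        String asUtf8 = toText(octets, null, StandardCharsets.UTF_8);
        String asLatin1 = toText(octets, null, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.equals("/caf\u00e9"));          // true
        System.out.println(asLatin1.equals("/caf\u00c3\u00a9"));  // true
    }
}
```

The protocol-level parts of the request (method, header names and so on)
never pass through toText in this sketch, which is the distinction the
proposed wording is trying to draw.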

cheers




-- 
Greg Wilkins <gregw@webtide.com> CTO http://webtide.com