On 31.08.2015 13:01, Mark Thomas wrote:
> On 30/08/2015 20:19, Yannick Majoros wrote:
>> Hi,
>>
>> Uh, it's always been quite easy. Why do you think it isn't?
>>
>> You're citing Tomcat, which isn't Java EE btw.
>
> No, Tomcat isn't a full Java EE implementation but Tomcat implements the
> Servlet specification and this is the Servlet EG. Pointing out (using
> one of the many available Servlet implementations) that changing the
> default character encoding requires container specific configuration and
> asking for the specification to provide something doesn't seem unreasonable.
>
> The OP could have made the same point with Glassfish, WebSphere,
> WebLogic etc.
I chose Tomcat because it has a nice Wiki page that summarizes the issue
and links to the relevant specs. Also, Tomcat serves as the servlet
implementation for several Java EE implementations.
>> For Servlet, it's up to you. As long as you don't rely on defaults, you
>> should be fine. JSPs, if you still use them, make it quite clear too.
>
> And that is the point. If you want the default to be something other
> than ISO-8859-1 then it has to be changed in multiple places and you
> almost certainly need to use container specific configuration as well.
>
>> Every time I've seen someone struggle with this, he used a framework that
>> made dumb assumptions (Struts anyone? That's not Java EE btw). Or the
>> developer himself was confused, relied on defaults or converted multiple
>> times...
>
> That is a little unfair. While I have also seen those sorts of errors
> there are also issues (covered in the Tomcat FAQ linked below) with
> non-spec compliant browser behaviour that contribute to the problem.
>
>> I'm curious, what do you want an "encoding" element in web.xml to do?
>
> That is a fair question. There are multiple things that you might want
> to change.
>
> 1. URI decoding
> You can't define this per web application since the URI needs to be decoded
> before it is mapped to the web application. Therefore this has to be a
> container wide setting which means this pretty much has to use container
> specific configuration.
> What we could do is make UTF-8 rather than ISO-8859-1 the default.
>
> 2. Response bodies
> A web.xml setting could be used to change from the current ISO-8859-1
> default to a default of UTF-8.
>
> 3. Request bodies
> A web.xml setting (the same as 2?) could be used to change from the
> current ISO-8859-1 default to a default of UTF-8.
At minimum 1, because that currently requires container specific
configuration. I don't think just having UTF-8 as the default is enough
as long as browsers use ISO-8859-1 for ISO-8859-1 web pages.
Ideally also 2. Adding a filter to webapps for fixing 2 because browsers
don't send the encoding is doable and portable, just a little bit
annoying.
Personally, I can live without 3; however, a central place to configure
everything would be nice.
4. Make it clear from the spec what the default is so that implementors
agree on it. Ideally this would be covered by the TCK.
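
To illustrate why point 1 matters, here is a minimal sketch that uses only
the JDK, not the Servlet API. It assumes Java 10 or later for
URLDecoder.decode(String, Charset); the class name UriDecodingDemo and the
helper decode() are made up for this example. URLDecoder implements form
decoding (it also turns '+' into a space), so it only approximates what a
container does for the URI path. The sketch shows that the same
percent-encoded bytes yield different text depending on the charset the
decoder assumes:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class UriDecodingDemo {

    // Hypothetical helper: percent-decode a value with an explicit charset.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // the character "e with acute accent"

        // What a client sends when it percent-encodes using UTF-8.
        String fromUtf8Client = URLEncoder.encode(eAcute, StandardCharsets.UTF_8);
        // What a client sends when it percent-encodes using ISO-8859-1.
        String fromLatin1Client = URLEncoder.encode(eAcute, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 client sends:      " + fromUtf8Client);
        System.out.println("ISO-8859-1 client sends: " + fromLatin1Client);

        // Decoder and client agree: the original character comes back.
        System.out.println(decode(fromUtf8Client, StandardCharsets.UTF_8).equals(eAcute));
        System.out.println(decode(fromLatin1Client, StandardCharsets.ISO_8859_1).equals(eAcute));

        // Decoder assumes ISO-8859-1 but the client used UTF-8:
        // each UTF-8 byte becomes its own character, so the text is garbled.
        String garbled = decode(fromUtf8Client, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(eAcute)); // false
        System.out.println(garbled.length());       // 2 characters instead of 1
    }
}
```

Since the container has to pick one charset before it knows which web
application the request maps to, a per-application setting cannot fix this
case on its own.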
Cheers
Philippe