users@glassfish.java.net

Re: why the default response character encoding must be ISO-8859-1?

From: Jan Luehe <Jan.Luehe_at_Sun.COM>
Date: Wed, 18 Mar 2009 19:59:06 -0700

Ken,

On 03/18/09 07:37 AM, Ken--_at_newsgroupstats.hk wrote:
> Dear,
>
> Is that the character encoding for the request? What I am talking about is
> the encoding of the response. Am I correct?
>
> Please find attached jsps. utf8-test1.jsp and utf8-test2.jsp.
>
> http://www.nabble.com/file/p22579795/utf8-test1.jsp utf8-test1.jsp
> http://www.nabble.com/file/p22579795/utf8-test2.jsp utf8-test2.jsp
>
> utf8-test1.jsp works with tomcat only; gf3 shows distorted characters.
> The page encoding of utf8-test1.jsp under gf3 is ISO-8859-1. utf8-test2.jsp
> works with both gf3 and tomcat6.
>
> My webapps' contents are in latin encoding. In order to work with gf3, I
> have to convert all content to utf-8:
>
> str = new String(str.getBytes("iso-8859-1"), "utf-8");
>
> I got this problem when I was using tomcat 5.0.28. Here is the change log of
> tomcat5.0:
>
> http://web.archive.org/web/20071012233252/http://tomcat.apache.org/tomcat-5.0-doc/changelog.html
>
> If ServletResponse.getWriter() is called and no char encoding has been
> specified, set response char encoding to default (ISO-8859-1) so that it is
> reflected in getContentType() and Content-Type header, as required by the
> Servlet Spec (Bugtraq 6152759) (luehe)
>
> I did ask tomcat team and they replied that they just follow servlet spec
> 2.0.
>
> The problem is gone in Tomcat 6.0. (I didn't try tomcat 5.5).
>
> Today, I have to migrate my webapps from tomcat6 to gf3. Please advise
> whether any config or setting can solve the problem without converting all
> content to utf8.
>
>

When I access utf8-test1.jsp on GlassFish, I get this response:

  Content-Type: text/html;charset=ISO-8859-1

whereas on Tomcat 6, I get:

  Content-Type: text/html

"charset=ISO-8859-1" is missing from the Content-Type returned
by Tomcat.
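
To illustrate why the charset in the header matters, here is a minimal Java
sketch of what probably happens to utf8-test1.jsp on GlassFish: the bytes on
the wire are UTF-8, but the header tells the client to decode them as
ISO-8859-1. The class and method names are made up for this example; only the
standard java.nio.charset API is used.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Decode UTF-8 bytes the way a client would if told charset=ISO-8859-1.
    public static String decodeAsLatin1(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mis-decoding: turn the Latin-1 characters back into bytes
    // and decode those bytes as UTF-8.
    public static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String garbled = decodeAsLatin1(original);
        System.out.println(garbled.length());                 // 6
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The repair step is the same round trip as the
new String(str.getBytes("iso-8859-1"), "utf-8") workaround quoted above; it
works because ISO-8859-1 maps every byte to exactly one character.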

Tomcat may be compliant with Servlet 2.0, but it is *not* compliant
with Servlet 2.5.

In Servlet 2.5, ServletResponse#setCharacterEncoding was enhanced
as follows:

  * <p>Containers must communicate the character encoding used for
  * the servlet response's writer to the client if the protocol
  * provides a way for doing so. In the case of HTTP, the character
  * encoding is communicated as part of the <code>Content-Type</code>
  * header for text media types.

and ServletResponse#getWriter was amended as follows:

  * If the response's character encoding has not been
  * specified as described in <code>getCharacterEncoding</code>
  * (i.e., the method just returns the default value
  * <code>ISO-8859-1</code>), <code>getWriter</code>
  * updates it to <code>ISO-8859-1</code>.

for this reason:

  Specifying iso-8859-1 explicitly for text media types is necessary
  because many clients don't follow the HTTP spec in applying this
  default - they often use a default encoding based on the locale
  they're running in or based on a user preference. By telling these
  clients explicitly what we're really using we increase our chances of
  having the text processed or displayed correctly.
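
The rule can be sketched with a small stand-alone model. This is not the
container's code and not the javax.servlet API; ResponseCharsetModel and its
methods are hypothetical names that only mimic the behavior described above,
and the model ignores details such as a charset embedded in the content type
string.

```java
public class ResponseCharsetModel {
    private static final String DEFAULT_CHARSET = "ISO-8859-1";

    private String mediaType;       // e.g. "text/html"; no charset parsing in this sketch
    private String charset;         // null means "not specified yet"
    private boolean writerObtained; // true once getWriter() has been called

    public void setContentType(String mediaType) {
        this.mediaType = mediaType;
    }

    public void setCharacterEncoding(String charset) {
        // Per the Servlet javadoc, calls made after getWriter() have no effect.
        if (!writerObtained) {
            this.charset = charset;
        }
    }

    public String getCharacterEncoding() {
        return charset != null ? charset : DEFAULT_CHARSET;
    }

    // Stand-in for ServletResponse#getWriter: pins the default charset
    // if none was specified before the writer is obtained.
    public void getWriter() {
        if (charset == null) {
            charset = DEFAULT_CHARSET;
        }
        writerObtained = true;
    }

    // The charset is reported only once it has been specified or pinned.
    public String getContentType() {
        if (mediaType == null) {
            return null;
        }
        return charset == null ? mediaType : mediaType + ";charset=" + charset;
    }

    public static void main(String[] args) {
        ResponseCharsetModel r = new ResponseCharsetModel();
        r.setContentType("text/html");
        System.out.println(r.getContentType()); // text/html
        r.getWriter();
        System.out.println(r.getContentType()); // text/html;charset=ISO-8859-1
    }
}
```

Calling setCharacterEncoding("UTF-8") before getWriter() in this model yields
text/html;charset=UTF-8 instead, which mirrors what a charset in the
contentType page directive achieves.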

As you can see, GlassFish follows the Servlet 2.5 spec, whereas Tomcat
does not. A test for this issue should be added to the Servlet compliance
suite; had one existed, we would not be having this discussion.

Why can't you add this page directive:

  <%@page contentType="text/html;charset=utf8"%>

to your JSPs, as you did for utf8-test2.jsp?

You could actually declare this page directive in a JSP prelude,
in which case you would not have to declare it on each and every JSP.
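
As a rough sketch, a prelude is declared in web.xml through a
jsp-property-group. The file path /WEB-INF/jspf/prelude.jspf is a
hypothetical name chosen for this example; that file would contain just the
page directive shown above.

```xml
<jsp-config>
  <jsp-property-group>
    <!-- Apply the prelude to every JSP in the web application -->
    <url-pattern>*.jsp</url-pattern>
    <!-- Hypothetical path: a fragment holding only the page directive -->
    <include-prelude>/WEB-INF/jspf/prelude.jspf</include-prelude>
  </jsp-property-group>
</jsp-config>
```

One caveat, as far as I recall from the JSP spec: if a page already declares
contentType with a different value (for example contentType="text/html"),
the two directives may conflict at translation time, so such a declaration
would have to be removed from the page or made identical.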

Jan


> Regards,
> Ken
>
>
>
>
> Felipe Gaucho wrote:
>
>> For GlassFish, the following steps are used to set the encoding:
>>
>> * The getCharacterEncoding() method
>> * A hidden field in the form, specified by the
>> form-hint-field attribute of the parameter-encoding element in the
>> sun-web.xml file
>> * The default-charset attribute of the parameter-encoding
>> element in the sun-web.xml file
>> * The default, which is ISO-8859-1
>>
>> On Tue, Mar 17, 2009 at 5:27 PM, Ken--_at_newsgroupstats.hk
>> <dragonken_at_gmail.com> wrote:
>>
>>> I am trying to migrate my webapps from tomcat 6.0 to glassfish v3, but I
>>> found that my utf-8 pages are not working well with gf3.
>>>
>>> my jsps like this:
>>>
>>> <%@page contentType="text/html"%>
>>> <html>
>>> <head>
>>> <META HTTP-EQUIV="Content-type" CONTENT="text/html; charset=utf-8">
>>> <title>...</title>
>>> </head>
>>> <body>
>>> content
>>> </body>
>>> </html>
>>>
>>> Those pages work well with apache + tomcat 6.0, but with gf3 Chinese
>>> characters are distorted, and I have to right-click and manually set the
>>> encoding to Unicode (UTF-8) to display Chinese words correctly. I checked
>>> that the page encoding was somehow adjusted to iso-8859-1.
>>>
>>> I remember that exactly the same problem happened with tomcat 5.0.28.
>>>
>>> Any idea?
>>>
>>> Ken
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/why-the-default-response-character-encoding-must-be-ISO-8859-1--tp22562986p22562986.html
>>> Sent from the java.net - glassfish users mailing list archive at
>>> Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe_at_glassfish.dev.java.net
>>> For additional commands, e-mail: users-help_at_glassfish.dev.java.net
>>>
>>>
>>>
>>
>> --
>>
>> Please help to test this application:
>> http://fgaucho.dyndns.org:8080/cejug-classifieds-richfaces
>>