users@glassfish.java.net

Re: character encoding

From: Stijn de Witt <StijnDeWitt_at_chello.nl>
Date: Tue, 4 May 2010 18:46:47 +0200

Hi,

Do you have a specific reason to use ISO-8859-X?

If you want to support multiple languages in your web application, I highly
recommend you switch to Unicode and the encoding most commonly used with it
for files: UTF-8. Even if you don't intend to support many languages, I
*still* recommend Unicode. It is *the* standard for text going forward and
is already supported on all major platforms. Java supports it very well.

ASCII and ANSI/ISO are dead! Long live Unicode!

If you switch to Unicode, you'll be out of the ISO hell. Unicode can appear
daunting at first, but in reality it is a lot easier than trying to deal
with all the different codepages for all the different alphabets in use
today. Especially for languages with large character sets or non-Latin
scripts, such as Japanese, Chinese and Arabic, the old 8-bit standards are
a real nightmare.
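
To make the codepage problem concrete, here is a minimal Java sketch (the class and method names are made up for illustration). It encodes a Russian word with ISO-8859-1 and with UTF-8 and decodes it again. The word is written with \u escapes so the example does not depend on how the source file itself is saved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodepageDemo {
    // The Russian word for "hello", written with Unicode escapes so that
    // the encoding of this source file does not matter.
    static final String RUSSIAN = "\u041f\u0440\u0438\u0432\u0435\u0442";

    // Encode the text to bytes with the given charset, then decode it again.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // ISO-8859-1 has no Cyrillic letters, so each one is replaced by '?'.
        System.out.println(RUSSIAN.equals(roundTrip(RUSSIAN, StandardCharsets.ISO_8859_1))); // prints false
        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(RUSSIAN.equals(roundTrip(RUSSIAN, StandardCharsets.UTF_8))); // prints true
    }
}
```

With a single 8-bit codepage per page you would have to pick the right one for every language; with UTF-8 the same code path works for all of them.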

Furthermore, Unicode and Java are a very good combination, because
internally Java uses Unicode. Java strings have always been sequences of
16-bit Unicode code units, and as of Java 5 String and CharSequence present
their contents as UTF-16, which also covers the supplementary characters.
If you refrain from char-level manipulation in your applications and work
with String and CharSequence as a whole instead, you need to know next to
nothing about the internals and can deal with any character, from any
alphabet, on any page of your application, without hassle.
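
As a small illustration of why char-level manipulation is the risky part, the following Java sketch (class and method names are hypothetical) compares the number of UTF-16 code units with the number of actual characters. For most text the two agree, but a character outside the Basic Multilingual Plane occupies two chars.

```java
public class CodePointDemo {
    // Number of UTF-16 code units (Java chars) in the text.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode characters (code points) in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        // A Cyrillic letter fits in a single char.
        String zhe = "\u0416";
        System.out.println(codeUnits(zhe));   // prints 1
        System.out.println(codePoints(zhe));  // prints 1

        // U+1D11E (musical G clef) lies outside the Basic Multilingual Plane,
        // so UTF-16 stores it as a surrogate pair of two chars.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(codeUnits(clef));  // prints 2
        System.out.println(codePoints(clef)); // prints 1
    }
}
```

Code that only passes whole strings around never notices the difference; code that indexes or splits by char can cut a surrogate pair in half.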

To get a better understanding of Unicode, I recommend you read the
Joel on Software article:

The Absolute Minimum Every Software Developer Absolutely, Positively Must
Know About Unicode and Character Sets (No Excuses!)
http://www.joelonsoftware.com/articles/Unicode.html

Basically, if you want to switch to Unicode, you have to:
 * Make sure you save all your files as UTF-8 (all major text editors
support this)
 * Make sure you send as Content-Type header: "text/html; charset=utf-8"
 * Make sure any Content-Type meta tags etc name UTF-8
 * Make sure your XML files don't name ISO charsets in the prologue (UTF-8
is default for XML).
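
The point behind the checklist above is that the charset you declare and the charset of the bytes you actually send must be the same. A rough Java sketch of that idea follows. The class and its methods are hypothetical and not part of GlassFish or the servlet API; in a real JSP the page directive's contentType attribute plays the role of the header.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class Utf8Page {
    // One charset constant drives the header, the meta tag and the bytes,
    // so the three can never disagree.
    static final Charset CHARSET = StandardCharsets.UTF_8;

    // Value for the HTTP Content-Type header.
    static String contentTypeHeader() {
        return "text/html; charset=" + CHARSET.name().toLowerCase(Locale.ROOT);
    }

    // A tiny HTML page whose meta tag names the same charset as the header.
    static String html(String bodyText) {
        return "<html><head>"
            + "<meta http-equiv=\"Content-Type\" content=\"" + contentTypeHeader() + "\">"
            + "</head><body>" + bodyText + "</body></html>";
    }

    // The bytes that would go over the wire, encoded with the declared charset.
    static byte[] encode(String page) {
        return page.getBytes(CHARSET);
    }

    public static void main(String[] args) {
        String page = html("\u041f\u0440\u0438\u0432\u0435\u0442");
        System.out.println(contentTypeHeader()); // prints text/html; charset=utf-8
        // Decoding with the declared charset gives back the original page.
        System.out.println(page.equals(new String(encode(page), CHARSET))); // prints true
    }
}
```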

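And that's about it. Save your Russian properties file as UTF-8 and read it
through a Reader that names UTF-8 (Properties.load(Reader), available since
Java 6), because Properties.load(InputStream) assumes ISO-8859-1. Do that
and you'll have no problems reading it in Java.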
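
For reading a UTF-8 properties file, here is a minimal sketch, assuming Java 7 or later. The class name and the "greeting" key are invented for the example, and a temporary file stands in for WebTexts_ru.properties. It goes through a Reader with an explicit charset because Properties.load(InputStream) decodes the file as ISO-8859-1. How ResourceBundle treats .properties files depends on the Java version, so check that separately if you load the texts that way.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class Utf8Properties {
    // Load a properties file that was saved as UTF-8.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Temporary stand-in for the real properties file.
        Path file = Files.createTempFile("WebTexts_ru", ".properties");
        try {
            // Write one Russian value to the file as UTF-8.
            try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                writer.write("greeting=\u041f\u0440\u0438\u0432\u0435\u0442\n");
            }
            // The six Cyrillic letters come back intact.
            System.out.println(load(file).getProperty("greeting").length()); // prints 6
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```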

-Stijn



----- Original Message -----
From: <glassfish_at_javadesktop.org>
To: <users_at_glassfish.dev.java.net>
Sent: Tuesday, May 04, 2010 4:33 AM
Subject: character encoding


> I have a small problem with character encoding in GlassFish!
>
> I already have two languages for my web page, English and Swedish, which
> both use iso-8859-1. Now I am trying to add a third language, Russian,
> which uses iso-8859-5. GlassFish cannot save this WebTexts_ru.properties
> file because of this character encoding. If I right-click on
> WebTexts_ru.properties and open its properties, I can change the character
> encoding there, but then I get a conflict message since my contentType is
> set to charset iso-8859-1. Please see below for more info on this.
>
> <%@page import="com.neptunediving.*"%>
> <%@include file="WEB-INF/include/LangSupport.jsp"%>
> <%@page contentType="text/html; charset=ISO-8859-1" language="java"%>
>
> So now my question is this: how do I get GlassFish to run with
> both iso-8859-1 and iso-8859-5, or is this not possible? Or is there
> another solution to this problem that I can use instead?
> [Message sent by forum member 'torleif67']
>
> http://forums.java.net/jive/thread.jspa?messageID=400255