users@glassfish.java.net

Re: character encoding

From: Stijn de Witt <StijnDeWitt_at_chello.nl>
Date: Wed, 5 May 2010 12:50:00 +0200

I don't think GlassFish has anything to do with it, actually.

The first thing to set up is your default encoding in your editor/IDE, so
any files you create will be created in Unicode/UTF-8. Since you are talking
about "Inherited from container", I am assuming you are using Eclipse.

The container for a file in Eclipse is your project, and the container for
your project is your workspace.

To change the workspace setting to UTF-8 in Eclipse, do this:

<main menu> -> Window -> Preferences

In the preferences dialog, change the encoding for everything in sight to
UTF-8. Easiest is to type 'encoding' in the filter box to get all related
settings. They are at:

General -> Content Types

This has a list of file types. Click through them and make sure that the
encoding for each of them is either not set, or set to UTF-8.

General -> Workspace

Set Text file encoding to Other: UTF-8.

Unfortunately the Eclipse people insist on using Default (platform specific)
here. There have been long debates about this on their mailing lists, but
some of the key people seem to think that defaulting to Windows/Unix/Mac
specific encodings is wiser than just setting it to UTF-8 for the whole
world. A huge mistake imho, but they are not easily convinced. Of course
everyone *can* change this, but most people are unaware of the setting, so
in practice most projects end up with their default encoding set to the
Windows-specific cp1252 (or whatever the Linux/Mac default happens to be).
Very bad imho, but there it is. When I install Eclipse or set up a new
workspace, this is the first thing I change. You should try to make a habit
of that too. Best to go for Unicode/UTF-8 from the beginning, so you don't
forget about it and get hurt by it later.
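
To see why the platform default hurts, here is a minimal sketch (my own
illustration, not anything Eclipse or GlassFish ships) of what happens when
UTF-8 bytes are read back with a legacy charset. The class and method names
are made up for the example, and ISO-8859-1 stands in for whatever legacy
default a machine might have.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    // Encode the text as UTF-8, then decode those bytes with another charset.
    // If the two charsets differ, non-ASCII characters come back garbled.
    static String decodeUtf8BytesAs(String text, Charset readerCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, readerCharset);
    }

    public static void main(String[] args) {
        // "café", written with an escape so this source file's own encoding
        // does not influence the demonstration.
        String original = "caf\u00e9";

        System.out.println("platform default: " + Charset.defaultCharset());
        // Read with the wrong charset: the e-acute turns into two characters.
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1));
        // Read with the right charset: the text survives the round trip.
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.UTF_8));
    }
}
```

The same thing happens silently when one developer saves a file as UTF-8 and
another workspace opens it with its platform default.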

Web -> CSS Files, HTML Files, JSP Files and XML -> XML Files

Set encoding to ISO 10646/Unicode(UTF-8)

A bit of a confusing name, but ISO 10646 is simply the ISO standard that
defines the Unicode character set, and UTF-8 is one of the encodings it
specifies.

When this is done, check your project settings.

Select project, then <main menu> -> Project -> Properties

If you will be sharing your project with others, you run the risk of them
having forgotten to set their workspace settings correctly and introducing
wrongly encoded files into your project when they create new files. So to be
safe, it might be smart to set your project settings to "Other: UTF-8"
instead of "Inherited from container", because the project settings are
shared along with the project and the workspace settings are not.

If you already have existing files, it might be necessary to open them and
save them again. You can check the encoding of an individual file by
right-clicking it (again I am assuming Eclipse) and selecting Properties
from the context menu. There, in the Resource section, you can check and set
its encoding. It's ok for individual files to have it set to "Inherited from
container".
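
If you have a lot of existing files, clicking through them gets tedious. A
small sketch like the one below can tell you which files are not valid UTF-8
yet. This is only an illustration: the class and method names are made up,
and the conversion helper assumes the old files really are ISO-8859-1, which
you would have to verify for your own project.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Utf8Check {

    // True if the bytes form a well-formed UTF-8 sequence.
    // Pure ASCII content always passes, since ASCII is a subset of UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Re-encode bytes that are ASSUMED to be ISO-8859-1 as UTF-8.
    static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Pass the files to check as command line arguments.
        for (String name : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": "
                    + (isValidUtf8(bytes) ? "valid UTF-8" : "NOT valid UTF-8"));
        }
    }
}
```

Note that a file passing the check is not proof it was meant as UTF-8, only
that its bytes happen to be well-formed UTF-8.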

Last, but not least, check any XML, HTML and JSP files in your project.

XML files should either have no prolog, no encoding in their prolog, or this
prolog:

<?xml version="1.0" encoding="UTF-8"?>
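
The reason the prolog matters is that an XML parser trusts it (and falls back
to UTF-8 when no encoding is declared). The sketch below, using the standard
JAXP parser that comes with the JDK, shows what I mean. The class name, the
helper method and the sample document are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class XmlPrologCheck {

    // Encode the XML text with the given "file encoding", parse the bytes,
    // and return the text of the root element. Returns null if the parser
    // rejects the bytes (the parser may also log the error to stderr).
    static String parseRootText(String xml, Charset fileEncoding) {
        try {
            byte[] bytes = xml.getBytes(fileEncoding);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String withProlog =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";
        String withoutProlog = "<name>caf\u00e9</name>";

        // Prolog says UTF-8 and the bytes are UTF-8: parses fine.
        System.out.println(parseRootText(withProlog, StandardCharsets.UTF_8));
        // No prolog at all: the parser assumes UTF-8, so this works too.
        System.out.println(parseRootText(withoutProlog, StandardCharsets.UTF_8));
        // Prolog says UTF-8 but the file was saved as ISO-8859-1: rejected.
        System.out.println(parseRootText(withProlog, StandardCharsets.ISO_8859_1));
    }
}
```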

HTML files should either not define a meta tag for Content-Type, or have
this one:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

JSP files should start with this:

<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>

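This instructs the server to emit the correct Content-Type HTTP header
(contentType) and tells the JSP compiler which encoding the file itself is
saved in (pageEncoding).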

Phew! That was a lot of work. But it's worth it!

I have found that the hardest thing about Unicode and UTF-8 in Java is not
actually Unicode/UTF-8 itself, but preventing all the nasty legacy encodings
from creeping into your project and breaking it in unexpected places.

Good luck with it!

-Stijn

----- Original Message -----
From: <glassfish_at_javadesktop.org>
To: <users_at_glassfish.dev.java.net>
Sent: Wednesday, May 05, 2010 2:32 AM
Subject: Re: character encoding


> Hallo!
>
> No not at all actually. I have used this since the days from Dreamweaver i
> think and it is only now i have seen that i need to change this. And i
> think UTF-8 is the way to go here....
>
> Unicode and Java seems to work perfect together as well so this is what i
> am looking for. I am using tags when i am inserting text from a properties
> file into a JSP page, see below here for more info;
>
> home.header1=Welcome to Neptune Diving Adventure
> is in the text file and
> <fmt:message key="home.header1" />
> is in the jsp file.
>
> This is the small problem i have is how to save all these files in UTF-8
> instead ISO in GlassFish? Most of it is already set to ISO-8859-1 or
> MacRoman, Inherited from container..
>
> So i wonder if i have to change somewhere in the program, GlassFish or if
> it is ok to only change in the source code? Everything have to go from
> iso-8859-1 to UTF-8 instead.
> [Message sent by forum member 'torleif67']
>
> http://forums.java.net/jive/thread.jspa?messageID=403537