users@jaxb.java.net

UTF-8 encoding broken in JAXB 1.0 marshaller

From: Bernhard Mandl <bmandl_at_ITV.GLOBALREFUND.COM>
Date: Thu, 06 Mar 2003 07:38:44 -0700

The marshaller always writes the character codes for the current ANSI codepage into the XML file.

For example, the character "ö" (o with two dots above it) has the code 246 decimal, or 0xF6, in ANSI codepage 1252 (Windows Latin 1).

When I marshal a string containing an "ö", the XML file always contains &#214, no matter what encoding I specify with the JAXB_ENCODING property. But &#214 is only correct for encoding="Cp1252"; for UTF-8 it is definitely wrong.
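
To illustrate the byte-level difference between the two encodings, here is a minimal sketch using only the standard library (no JAXB involved). The class name EncodingDemo and the helper hex are made up for this example, and it assumes the JRE provides the "windows-1252" charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: formats bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "o with two dots above it", written as a Unicode escape so the
        // source file's own encoding does not matter.
        String s = "\u00F6";

        // Numeric value of the character in Java (Unicode code point).
        System.out.println((int) s.charAt(0)); // 246

        // Single byte in Windows codepage 1252.
        System.out.println(hex(s.getBytes(Charset.forName("windows-1252")))); // F6

        // Two bytes in UTF-8.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // C3 B6
    }
}
```

So a UTF-8 file that contains the character as raw bytes would hold C3 B6 rather than the single byte F6.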

So is it not possible to write correct XML files in UTF-8 encoding?

I am using JRE 1.4.1_01 under Windows XP.