users@jaxb.java.net

Re: JAXB2 2.0.3 UTF-8 support

From: James Mao <james.mao_at_iona.com>
Date: Thu, 08 Feb 2007 11:14:48 +0800

Hi,

> If this is a runtime problem (vs. a build problem), make sure that you
> are using a stream instead of a reader.
> Make sure that your XML-declaration correctly defines the character set.
> Try opening the XML file in IE or something that will identify invalid
> characters.
Both IE and Firefox report the file as UTF-8 encoded.
The XML declaration in the file also specifies UTF-8.
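
Browsers tend to be lenient about malformed bytes, so a strict decode is a more reliable check than opening the file in IE or Firefox. The following is a minimal sketch, not part of the original test: the class name `Utf8Check` and the method `isValidUtf8` are hypothetical, and it assumes a JDK with `java.nio.charset.StandardCharsets` (Java 7 or later, newer than the JDK 1.5 mentioned in this thread).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reports whether a byte array is well-formed UTF-8.
class Utf8Check {

    public static boolean isValidUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of silently
            // substituting U+FFFD for malformed input.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path of the XML file to check.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(data) ? "well-formed UTF-8" : "malformed UTF-8");
    }
}
```

If this reports the file as well-formed, the bytes on disk are probably not the cause, and the difference between the two builds becomes the more likely suspect.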

Actually it's JUnit test code, which tests whether a UTF-8 XML file
(including Chinese characters) can be parsed by the XML parser.
The test code itself could not be simpler:

parserFactory = DocumentBuilderFactory.newInstance();
parserFactory.setNamespaceAware(true);
parserFactory.newDocumentBuilder().parse(inputStream);
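
For anyone who wants to reproduce this without the original file, here is a self-contained sketch of the same three calls run against an in-memory document. The class name `Utf8ParseDemo`, the element `item`, and the attribute `name` are made up for illustration; the real test file is not shown in this thread. It also assumes Java 7 or later for `StandardCharsets`.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical reproduction of the parse step, using an in-memory document.
class Utf8ParseDemo {

    // Parses the bytes as XML and returns the root element's "name" attribute.
    public static String parseNameAttribute(byte[] xmlBytes) {
        try {
            DocumentBuilderFactory parserFactory = DocumentBuilderFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            // A byte stream (not a Reader) lets the parser apply the
            // encoding named in the XML declaration itself.
            InputStream inputStream = new ByteArrayInputStream(xmlBytes);
            Document doc = parserFactory.newDocumentBuilder().parse(inputStream);
            return doc.getDocumentElement().getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Three Chinese characters, written as escapes so the source
        // file's own encoding cannot affect the result.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<item name=\"\u4e2d\u6587\u5b57\"/>";
        String value = parseNameAttribute(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("parsed attribute: " + value);
    }
}
```

Running the same class under both build tools would show whether the failure depends on the file or on the environment.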

The code works fine under Ant, but fails under Maven.

If I remove all the Chinese characters (still using the same encoding), the
code also works fine under Maven.

So I suspect that Maven is loading the wrong XML parser. But as I said before,
I'm not 100% sure about this.
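
One way to test that suspicion is to print which `DocumentBuilderFactory` implementation each build actually picks up. This is only a sketch; the class name `ParserProbe` and the method `describeFactory` are hypothetical. Running it once under Ant and once under Maven and comparing the output should show whether the two class paths resolve to different parsers.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical diagnostic: reports the JAXP parser factory in use.
class ParserProbe {

    public static String describeFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Class<?> impl = factory.getClass();
        // The code source is null for classes loaded by the bootstrap
        // loader, which is usually the case for the JDK's built-in parser.
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "no code source (typically the JDK's built-in parser)"
                : source.getLocation().toString();
        return impl.getName() + " from " + location;
    }

    public static void main(String[] args) {
        System.out.println(describeFactory());
    }
}
```

If the two builds print different class names or jar locations, the parser jar that appears only on the Maven class path would be the place to look next.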

My project has since been moved to Ant; it's just simpler to get things
done. Maven is too....

Cheers,
James.

>
> On 2/3/07, James Mao <james.mao_at_iona.com> wrote:
>
> Hi,
>
> I've encountered a very strange problem with JAXB2 UTF-8 support.
> I'm using JAXB 2.0.3 and JDK 1.5.0_10-b03; the XML is UTF-8 encoded.
>
> The problem is:
>
> if a node or attribute contains *three* Chinese characters,
> unmarshalling fails with the exception message:
>
> [org.xml.sax.SAXParseException: Invalid byte 3 of 3-byte UTF-8 sequence.]
>     at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:315)
>     at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:476)
>     at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:198)
>     at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:167)
>     at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:137)
>     at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:184)
>
> If I add another Chinese character (*four* Chinese characters),
> unmarshalling works just fine.
>
> I don't know whether this is a known issue in JAXB2, or maybe a bug
> in the XML parser used by JAXB2?
>
> I'm trying to upgrade to JAXB2 2.1.2, but it seems that version has not
> been uploaded to the Maven 2 repository yet. Is there any plan to upload
> the latest version?
>
>
> Thanks,
> James.