users@jaxb.java.net

JAXB2 2.0.3 UTF-8 support

From: James Mao <james.mao_at_iona.com>
Date: Sat, 03 Feb 2007 18:53:59 +0800

Hi,

I've run into a very strange problem with JAXB2 UTF-8 support.
I'm using JAXB 2.0.3 on JDK 1.5.0_10-b03, and the XML is UTF-8 encoded.

The problem is:

If a node or attribute contains *three* Chinese characters,
unmarshalling fails with this exception:

[org.xml.sax.SAXParseException: Invalid byte 3 of 3-byte UTF-8 sequence.]
        at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:315)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:476)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:198)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:167)
        at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:137)
        at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:184)

If I add another Chinese character (*four* Chinese characters in total),
unmarshalling works just fine.

Is this a known issue in JAXB2, or could it be a bug in the
XML parser that JAXB2 uses?
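
To help narrow down whether the input bytes or the parser are at fault, a check along the following lines might be useful. This is only a sketch: the class and method names (Utf8Check, isValidUtf8, parseText) are hypothetical, the sample document is made up, and it goes through the JDK's built-in DOM parser rather than JAXB itself, so it cannot confirm or rule out a JAXB-specific problem. The Chinese characters are written as Unicode escapes so the result does not depend on the encoding of the Java source file.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration; not part of JAXB.
class Utf8Check {

    // Returns true only if the bytes decode as strict UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            Charset.forName("UTF-8").newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Parses the bytes with the JDK's default DOM parser and
    // returns the text content of the root element.
    static String parseText(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: a root element holding three Chinese characters.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>\u4e2d\u6587\u5b57</name>";
        byte[] bytes = xml.getBytes("UTF-8");
        System.out.println("valid UTF-8: " + isValidUtf8(bytes));
        System.out.println("parsed length: " + parseText(bytes).length());
    }
}
```

Running isValidUtf8 on the raw bytes of the failing file would show whether the file really is UTF-8 as declared. If it is not, the file was probably saved in a different encoding; if it is, the parser becomes the more likely suspect.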

I'm trying to upgrade to JAXB2 2.1.2, but that version doesn't seem to have been uploaded
to the Maven 2 repository yet. Is there any plan to upload the latest version?


Thanks,
James.