users@jaxb.java.net

Re: UTF-8 character in tag name

From: Wolfgang Laun <wolfgang.laun_at_gmail.com>
Date: Wed, 11 Nov 2009 17:13:14 +0100

Yes, and it's not too difficult. You only have to replace the SAX parser
used for unmarshalling with a newer version of Xerces2.
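
As for the message itself: it is consistent with the two UTF-8 bytes of
'ö' (c3 b6) being read as two single-byte characters, 'Ã' and '¶'. 'Ã'
is allowed in an XML name and '¶' is not, so a parser reading the bytes
that way would end the name at "StaatsangehÃ". The following is only a
sketch of that effect, not a claim about what happens inside JAXB; the
class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class UmlautBytes {
    /** Encodes text as UTF-8, then decodes the same bytes as ISO-8859-1. */
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The single character U+00F6 (o with diaeresis) as UTF-8 bytes
        byte[] utf8 = "\u00f6".getBytes(StandardCharsets.UTF_8);
        System.out.printf("%02x %02x%n", utf8[0] & 0xff, utf8[1] & 0xff); // prints "c3 b6"

        // Every UTF-8 byte becomes one character, so the umlaut turns into two
        System.out.println(misdecode("Staatsangeh\u00f6rigkeit"));
    }
}
```

The mis-decoded name is one character longer than the original, with
U+00C3 followed by U+00B6 where the 'ö' used to be.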

Get xercesImpl-2.9.1.jar.

Use this code to unmarshal:

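import org.xml.sax.InputSource;
import org.apache.xerces.parsers.SAXParser;
import javax.xml.transform.sax.SAXSource;

// m is your javax.xml.bind.Unmarshaller, created from your JAXBContext as usual
InputSource inpsrc = new InputSource( "some.xml" );    // system ID of the document
SAXParser parser = new SAXParser();                    // the Xerces parser, an XMLReader
SAXSource theSource = new SAXSource( parser, inpsrc ); // makes JAXB read through this parser
Object obj = m.unmarshal( theSource );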
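
If you want to see whether a particular parser copes with the element
name before JAXB gets involved, a small standalone check can help. This
is a minimal sketch that assumes the JDK's built-in parser obtained from
SAXParserFactory as a stand-in, not the Xerces jar mentioned above;
ElementNameCheck, elementNames and defaultReader are made-up names, and
the test document is a cut-down version of the one in the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ElementNameCheck {
    /** Parses xml with the given reader and returns the element names in document order. */
    static List<String> elementNames(XMLReader reader, byte[] xml) throws Exception {
        final List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(localName);
            }
        });
        // A byte stream lets the parser take the encoding from the XML declaration
        reader.parse(new InputSource(new ByteArrayInputStream(xml)));
        return names;
    }

    /** Assumption: the JDK default parser; pass a Xerces SAXParser instead to check that one. */
    static XMLReader defaultReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that localName is reported
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        String doc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                   + "<Staatsangeh\u00f6rigkeit><M3_21/></Staatsangeh\u00f6rigkeit>";
        byte[] xml = doc.getBytes(StandardCharsets.UTF_8);
        System.out.println(elementNames(defaultReader(), xml));
    }
}
```

A parser that handles the name correctly reports it unchanged; one that
does not should fail with a SAXParseException like the one quoted below.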

-W

On Wed, Nov 11, 2009 at 4:07 PM, Noxi <noxilim2_at_web.de> wrote:
>
> Hi,
>
> both the schema and the XML file contain a tag name with a two-byte UTF-8
> character like the 'ö' in
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> ....
> <Staatsangehörigkeit>
> ...
> </Staatsangehörigkeit>
>
>
> 'ö' is a valid two-byte UTF-8 character, hex c3 b6.
>
> JAXB claims the document is not well-formed; it cuts the tag short:
>
> Message: Element type "StaatsangehÃ" must be followed by either attribute
> specifications, ">" or "/>"
>
> Other Xerces-based parsers work fine with this 'ö'.
> Is there any way to get JAXB to accept this?
>
> Thanks
> Noxi
>
>
>
> The XSD schema file is as follows; the 'ö' is the same two-byte UTF-8 character as above.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>            elementFormDefault="qualified" attributeFormDefault="unqualified"
>            version="2.5.0">
>   <xs:element name="Staatsangehörigkeit" type="Staatsangehoerigkeit">
>     <xs:annotation>
>       <xs:documentation>Aus XMeld</xs:documentation>
>     </xs:annotation>
>   </xs:element>
>
> ....
>
>   <xs:complexType name="Staatsangehoerigkeit">
>     <xs:sequence>
>       <xs:element name="M3_21">
>         <xs:annotation>
>           <xs:documentation>3.21 Staatsangehoerigkeit</xs:documentation>
>         </xs:annotation>
>         <xs:complexType>
>           <xs:simpleContent>
>             <xs:extension base="Schluessel_3">
>               <xs:attribute ref="KnZ" use="required" fixed="3.21"/>
>             </xs:extension>
>           </xs:simpleContent>
>         </xs:complexType>
>       </xs:element>
>     </xs:sequence>
>   </xs:complexType>