users@jaxb.java.net

Re: Debugable source code needed for version 2.1.8 of Jaxb

From: Jose Correia <correij_at_gmail.com>
Date: Tue, 3 Mar 2009 13:10:30 +0200

Hi Wolfgang

Well, please find attached two .xml files: one captured when the data was loaded
on Windows after being decrypted, and the other on Linux, both taken just
before unmarshalling.

Curiously, if I open both files with UltraEdit32, it says the encoding is U8-DOS.
If I try to compare them with WinMerge, it tells me "Files use
different encodings, left = 1252 (the Windows one) and right = UTF-8 (the Linux
one), and merging may lead to information loss"

and then "Information lost due to encoding errors: right file".
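
For illustration, here is a minimal Java sketch of why a tool might report
"1252" for one file and "UTF-8" for the other. It encodes the same text with
both charsets and checks whether each byte sequence is valid UTF-8. The class
name, method names and sample string are made up for this example and are not
taken from the attached files.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Returns true if the bytes decode as UTF-8 with no malformed sequences.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Formats bytes as space-separated upper-case hex pairs.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample text with one non-ASCII character (e-acute).
        String text = "caf\u00e9";

        byte[] cp1252 = text.getBytes(Charset.forName("windows-1252"));
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("windows-1252: " + hex(cp1252)
                + "  valid UTF-8? " + isValidUtf8(cp1252));
        System.out.println("UTF-8:        " + hex(utf8)
                + "  valid UTF-8? " + isValidUtf8(utf8));
    }
}
```

Pure ASCII text produces identical bytes in both charsets, which would fit
with a simpler file working regardless of the encoding setting.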

We then ran the Unix command cmp, and the attached file shows the places
where the two files differ. I'm not sure which funny characters it
fails on, but I created a simpler XML file and it worked without
setting the encoding. I thought something like this in the file that fails
had something to do with it: "ENTER&lt;lf&gt; PIN&lt;lf&gt; #####"
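
cmp shows where two files differ, but not which bytes a UTF-8 reader would
reject. As a rough sketch, the following Java helper reports the offset of the
first byte sequence that is not valid UTF-8, which could help pin down the
"funny characters". The class and method names are hypothetical, and the file
path is expected as a command-line argument.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Utf8OffsetFinder {

    // Returns the byte offset of the first malformed UTF-8 sequence,
    // or -1 if the whole array is valid UTF-8.
    public static int firstInvalidUtf8Offset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024);
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                // On error the input position is at the start of the bad sequence.
                return in.position();
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without error
            }
            out.clear(); // overflow: discard decoded chars and continue
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Utf8OffsetFinder <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int offset = firstInvalidUtf8Offset(data);
        if (offset < 0) {
            System.out.println("File is valid UTF-8");
        } else {
            System.out.printf("First invalid UTF-8 sequence at byte offset %d (0x%02X)%n",
                    offset, data[offset] & 0xFF);
        }
    }
}
```

Running it against both attached files would show whether the Windows copy
contains bytes that cannot be read as UTF-8, and where they are.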

Anyway, go figure.

Regards
Jose

On Tue, Mar 3, 2009 at 11:30 AM, Wolfgang Laun <wolfgang.laun_at_gmail.com> wrote:

> Glad to hear that it works. Nevertheless, what you report is very strange.
>
> Whatever the OS configuration and marshaller property settings result in, the
> XML file should be written in the encoding shown in its first line
> <?xml ... encoding="UTF-8" ... ?>
> and the very same file, i.e. its sequence of bytes, should be readable by
> an unmarshaller on any other system.
>
> Problems may be caused if some other program that does not interpret
> <?xml...?> is used to handle the data.
>
> I would be very interested to learn
> - what was the first XML line when written on Windows
> - what, exactly, were the "funny characters" (the raw byte sequence) as
> written on Windows
>
> -W
>
>
> On Tue, Mar 3, 2009 at 10:00 AM, Jose Correia <correij_at_gmail.com> wrote:
>
>> Hi all
>>
>> I got it to work by explicitly setting the following property to UTF-8
>> on my marshaller:
>>
>> Marshaller m = jc.createMarshaller();
>> m.setProperty(Marshaller.JAXB_ENCODING, ArtifactConstants.UTF8_ENCODING);
>>
>> So when I save my XML data on Windows and then load it on Linux, it now
>> works. Even though the Javadocs say that the encoding defaults to UTF-8
>> when none is specified, my chief engineer suspected it was using an
>> OS-specific encoding, which made it fail...
>>
>> I also found that the problem only occurred if the XML data had some funny
>> characters in it; I'm guessing characters that the Linux encoding didn't
>> understand.
>>
>> Cheers
>> Jose
>>
>>
>> On Fri, Feb 27, 2009 at 4:58 PM, Jose Correia <correij_at_gmail.com> wrote:
>>
>>> Well, I did check out the build environment and saw that the debug flag is
>>> set to true in build.properties, so I'm not sure why it can't
>>> see the lines. I'm also not sure whether running Eclipse in debug mode on
>>> Windows, connected to the Linux box, has anything to do with it.
>>>
>>> Regards
>>> Jose
>>>
>>>
>>> On Fri, Feb 27, 2009 at 4:42 PM, Wolfgang Laun <wolfgang.laun_at_gmail.com> wrote:
>>>
>>>> Off-list, Jose has confirmed my surmise that the exception occurs during
>>>> unmarshalling and is, most likely, due to some mishap in connection with the
>>>> transfer of the XML file between systems.
>>>>
>>>> This doesn't clarify the no-debug-flag question for 2.1.8, though.
>>>>
>>>> -W
>>>>
>>>> On Tue, Feb 24, 2009 at 3:28 PM, Jose Correia <correij_at_gmail.com> wrote:
>>>>
>>>>> Hi all
>>>>>
>>>>> Our software uses jaxb-api.jar and jaxb-impl.jar, version 2.1.8.
>>>>> We decided to try our software on Linux to see how it would fare (as opposed
>>>>> to Windows XP/2000/2003).
>>>>>
>>>>> We are trying it with Ubuntu 8.04 Desktop and
>>>>> Sun's JDK 6 update 12.
>>>>>
>>>>> The line that crashes is:
>>>>>
>>>>> Unmarshaller u = jc.createUnmarshaller();
>>>>>
>>>>> where jc is: jc = JAXBContext.newInstance(JAXB_CONTEXT);
>>>>>
>>>>> Exception it gives is:
>>>>>
>>>>> javax.xml.bind.UnmarshalException
>>>>> - with linked exception:
>>>>> [com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException:
>>>>> Invalid byte 2 of 2-byte UTF-8 sequence.]
>>>>>
>>>>> So I was trying to debug the JAXB code. I got the source by downloading
>>>>> RI 2.1.8, which includes sources for the relevant jars, but it seems the
>>>>> classes were compiled without "allow debugging" set to true: when I debug in
>>>>> Eclipse, having ensured Eclipse knows about the source of those jars, it
>>>>> doesn't show me the line numbers.
>>>>>
>>>>> From past experience, that tells me it wasn't compiled with that debug
>>>>> flag on.
>>>>>
>>>>> Anyway, if anyone can help with the exception or with how to get debuggable
>>>>> classes, I would appreciate it. I tried getting at the CVS source with:
>>>>>
>>>>> cvs -d:pserver:yourid_at_cvs.dev.java.net:/cvs co -d jaxb-ri
>>>>> jaxb2-sources/jaxb-ri
>>>>>
>>>>> but with my Sun id (the one I used to subscribe to the mailing list) it was
>>>>> rejected as an unknown id. I have applied to become a code observer within the
>>>>> https://jaxb2-sources.dev.java.net/ project.
>>>>>
>>>>> Thanks
>>>>> Jose
>>>>>
>>>>
>>>>
>>>
>>
>