users@jersey.java.net

Re: [Jersey] how to marshall an xml text field

From: Paul Sandoz <Paul.Sandoz_at_oracle.com>
Date: Mon, 30 Aug 2010 11:19:40 +0200

On Aug 27, 2010, at 8:37 PM, Tatu Saloranta wrote:

> On Fri, Aug 27, 2010 at 11:17 AM, John Calcote
> <john.calcote_at_gmail.com> wrote:
>> This is more a JAXB question than a Jersey question, but I was hoping
>> someone on this list might be willing to impart hard-earned
>> knowledge...
>>
>> I have a situation where I want to send an xml document as a field in
>> a JAXB message. I don't want JAXB to know anything about the contents
>> of this field - specifically, I don't want JAXB to marshall this field
>> to and from XML. I just want to treat it as raw XML. The reasons for
>> this have to do with the layers in our code that interpret the
>> contents of this field. I'm using JAXB at the lowest layer, but the
>> field contents are injected at one end from a much higher layer, and
>> consumed from that same higher layer at the end of the transmission.
>> Thus, from the message transport's perspective, I want it to appear
>> to just be raw text.
>>
>> I've spoken with colleagues about this problem. One says you should
>> use a byte array to keep JAXB from messing with the encoding of the
>> embedded xml text. Another says String should work fine.
>
> It sort of depends -- serializing things as Strings is fine in that it
> gets back and forth ok (well, assuming it does not have characters
> that are invalid in XML; which is true for regular text, and false for
> binary stuff). It does get escaped with regular character entity
> replacement to get rid of unquoted less-thans and ampersands.
>
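
To make the escaping concrete, here is a minimal sketch that uses the
JDK's StAX writer rather than JAXB itself. The class name
EscapedTextField and the element name "payload" are made up for
illustration, and the exact set of characters a given writer escapes
may differ. It shows the embedded markup going out as character
entities and coming back unchanged:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: carries an XML document as plain element text.
final class EscapedTextField {

    // Writes the embedded document as character data; the writer escapes
    // markup characters such as '<' and '&' so the outer message stays valid.
    static String wrap(String embeddedXml) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("payload");
            w.writeCharacters(embeddedXml);
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the element text back; the parser undoes the entity escaping.
    static String unwrap(String message) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(message));
            r.nextTag(); // advance to the <payload> start tag
            return r.getElementText();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "<order id=\"7\"><note>a &amp; b</note></order>";
        String wrapped = wrap(original);
        System.out.println(wrapped);
        System.out.println(unwrap(wrapped).equals(original));
    }
}
```

The higher layer gets its document back as the same String, but on
the wire it no longer looks like nested XML.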

> If so, you should be fine. Otherwise byte[] should work as long as
> JAXB implementation knows to use base64; I think there was some
> annotation to help with it, but can't remember what that would be.
>

IIRC JAXB will automatically use base64 encoding for byte[].
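
As a rough illustration of what that means for the payload, the
sketch below imitates the base64 text form with java.util.Base64
instead of calling JAXB. OpaquePayload and its method names are
hypothetical, and it assumes the embedded document has already been
serialized to bytes by the higher layer:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of JAXB: it only imitates the base64 text
// that a byte[] field is expected to become inside the marshalled message.
final class OpaquePayload {

    // Sender side: turn the raw document bytes into base64 text.
    static String toWireText(byte[] rawXml) {
        return Base64.getEncoder().encodeToString(rawXml);
    }

    // Receiver side: recover exactly the bytes that were sent.
    static byte[] fromWireText(String wireText) {
        return Base64.getDecoder().decode(wireText);
    }

    public static void main(String[] args) {
        byte[] raw = "<order id=\"7\"><note>a &amp; b</note></order>"
                .getBytes(StandardCharsets.UTF_8);
        String wire = toWireText(raw);
        System.out.println(wire);
        System.out.println(Arrays.equals(raw, fromWireText(wire)));
    }
}
```

The base64 alphabet contains no '<' or '&', so nothing in the field
needs escaping and the bytes arrive untouched, at the cost of roughly
a third more data and a payload that is not human-readable.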

One perhaps obvious thing to point out: make sure the characters of
the XML document are converted to bytes using a character encoding
that both sides agree on, e.g. UTF-8.
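
A minimal sketch of that, assuming both sides settle on UTF-8
(PayloadEncoding is a hypothetical helper, not a JAXB or Jersey API):
name the charset explicitly when going from text to bytes and back,
rather than relying on the platform default.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: both ends use the same, explicitly named charset.
final class PayloadEncoding {

    // Sender side: text to bytes, always UTF-8.
    static byte[] encode(String xmlText) {
        return xmlText.getBytes(StandardCharsets.UTF_8);
    }

    // Receiver side: bytes back to text, using the same charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<name>Zo\u00eb</name>"; // contains one non-ASCII character
        byte[] bytes = encode(xml);

        // Matching charsets: the text survives the round trip.
        System.out.println(decode(bytes).equals(xml));

        // Mismatched charset on the receiving side: the text is altered.
        String misread = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(misread.equals(xml));
    }
}
```

With pure ASCII content a mismatch can go unnoticed for a long time,
which is why it is worth pinning the charset down up front.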

Paul.