users@jaxb.java.net

Re: JAXB escaping apostrophe (Single Quote)

From: Ely Schoenfeld <ely.sun.1_at_mitalteli.com>
Date: Tue, 06 Apr 2010 14:52:11 -0500

First of all: thank you both for taking the time to respond to my request
for help.

Ok. Yes.

Gary Gregory wrote:
> The following are all valid XML attributes for [Joe's "BIG" Crabs]:
>
> 1) myAttr="Joe's &quot;BIG&quot; Crabs"
> 2) myAttr='Joe&apos;s "BIG" Crabs'
> 3) myAttr="Joe&apos;s &quot;BIG&quot; Crabs"
> 4) myAttr='Joe&apos;s &quot;BIG&quot; Crabs'
>
> There is no point in using 3 or 4 since you are just creating fatter XML for no reason. It is valid but wasteful.

I see your point. I do agree with the needlessness and wastefulness of
options 3 and 4.
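
To make Gary's options 1 and 2 concrete, here is a minimal sketch in Java of picking the delimiter so that as little as possible needs escaping. The class and method names (AttributeQuoting, quoteAttribute) are made up for illustration; this is not how JAXB does it internally, and it ignores whitespace normalization of attribute values.

```java
class AttributeQuoting {
    // Hypothetical helper: returns the value wrapped in its delimiters.
    public static String quoteAttribute(String value) {
        // & and < must always be escaped inside an attribute value.
        String base = value.replace("&", "&amp;").replace("<", "&lt;");
        boolean hasDouble = value.indexOf('"') >= 0;
        boolean hasSingle = value.indexOf('\'') >= 0;
        if (hasDouble && !hasSingle) {
            // Only double quotes inside: delimit with apostrophes, escape nothing else.
            return "'" + base + "'";
        }
        // Otherwise delimit with double quotes and escape only the double quotes.
        return "\"" + base.replace("\"", "&quot;") + "\"";
    }

    public static void main(String[] args) {
        System.out.println("myAttr=" + quoteAttribute("Joe's \"BIG\" Crabs"));
    }
}
```

With the sample value this produces Gary's option 1; an apostrophe never needs an entity because it is not the delimiter.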

My problem is that I don't think I'll ever be able to convince any
Mexican Government agency of that. But I can try. I will look for
someone to address a letter to, and maybe accept Wolfgang's offer
to use his W3C contact to help me write the letter.

In the meantime: any ideas about how to obtain an & apos; in the output
XML?

I think I will have to "escape" the single quote with "& apos;" (without
the space), which will produce "& amp;apos;" (without the space) in
the generated XML.

Then I would post-process every generated XML file to replace (only
once) every occurrence of "& amp;apos;" (without the space) with "& apos;"
(without the space).

Or even "escape" the single quote with "#@#_at_MyHackToGetapos@#@#" and
then replace that "very very low probability" string with "& apos;"
(without the space).

I don't like either one of those options at all.
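
For reference, the first workaround could look roughly like the sketch below. It uses plain string operations only; escapeLikeMarshaller is a hypothetical stand-in for whatever escaping the marshaller applies to attribute values (I am assuming it escapes &, < and "), and the class and method names are invented for this example.

```java
class ApostropheWorkaround {
    // Step 1: before marshalling, turn each ' into the literal text &apos;
    public static String preEscape(String value) {
        return value.replace("'", "&apos;");
    }

    // Hypothetical stand-in for the marshaller's attribute escaping.
    // The & must be handled first so the other entities are not escaped twice.
    public static String escapeLikeMarshaller(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace("\"", "&quot;");
    }

    // Step 3: after marshalling, undo the double escaping of the apostrophe.
    public static String postProcess(String xml) {
        return xml.replace("&amp;apos;", "&apos;");
    }

    public static void main(String[] args) {
        String name = "Joe's \"BIG\" Crabs";
        String attr = postProcess(escapeLikeMarshaller(preEscape(name)));
        System.out.println("myAttr=\"" + attr + "\"");
    }
}
```

One caveat with this variant: a value that already contains the literal text &apos; would be rewritten as well, which is the reason for considering the improbable placeholder string instead.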

Ely

Wolfgang Laun wrote:
> On Tue, Apr 6, 2010 at 4:57 PM, Ely Schoenfeld <ely.sun.1_at_mitalteli.com>
> wrote:
>> Wolfgang Laun wrote:
>>> On Tue, Apr 6, 2010 at 2:45 PM, Ely Schoenfeld <ely.sun.1_at_mitalteli.com>
>>> wrote:
>> Ok, "it doesn't have to", but it should be possible. I'm not saying that
>> everybody has to use "& apos;", but it should be doable.
>>
>
> Whatever for? As Gary has shown, it's possible to represent any string value
> as an attribute value by using one delimiter for quoting the string and the
> corresponding entity within the string for representing occurrences of that
> character as part of the string.
>
> And if this string value happens to occur as element text, any reasonable
> XML serializer will represent quotes as " (and NOT &quot;) and apostrophes
> as ' (and NOT &apos;).
>
>> I do consider it possible for some company to have an apostrophe in its
>> name (e.g. [Somebody's Store]). I could also imagine some other company with
>> double quotes in its name (e.g. ["SomeInventedName" inventions store]). Or
>> even both (e.g. [Somebody's "GREAT" inventions]).
>>
>> My point is that, in this case at least, it is not possible to stick to either
>> one of these two attribute value delimiters. As far as I can see, you must use
>> entities to generalize the "digital" invoice use.
>>
>> Am I right?
>>
>
> No - see Gary's mail.
>
> The fact that *any* character *may* be represented in a number of ways does
> not establish an argument that this or that character *must* be represented
> in some specific way, as long as it is valid XML. In fact, *any* character
> *can* be represented using a numeric entity, but if your encoding happens to
> be UTF-8, it is *not necessary* to resort to this notation, and any
> serializer would be sneered at if it used &#8364; (or the hexadecimal form)
> instead of € (the Euro sign), or used the corresponding form for any other
> code point greater than 0x7F.
>
> -W
>
>> Ely.
>>
>
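
Wolfgang's remark about the Euro sign can be checked with a small sketch like the following, which compares the UTF-8 size of the character itself with that of its decimal numeric form. The class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

class EuroEncoding {
    // Number of bytes the string occupies when written as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Decimal numeric form of a code point, e.g. 0x20AC gives &#8364;
    public static String numericReference(int codePoint) {
        return "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // the Euro sign, written as an escape to avoid source-encoding issues
        String ref = numericReference(euro.codePointAt(0));
        System.out.println(ref + " takes " + utf8Length(ref) + " bytes");
        System.out.println("the character itself takes " + utf8Length(euro) + " bytes");
    }
}
```

Both forms are valid in a UTF-8 document; the numeric form is simply longer and harder to read.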