users@jaxb.java.net

Re: JAXB escaping apostrophe (Single Quote)

From: Wolfgang Laun <wolfgang.laun_at_gmail.com>
Date: Tue, 6 Apr 2010 15:20:41 +0200

On Tue, Apr 6, 2010 at 2:45 PM, Ely Schoenfeld <ely.sun.1_at_mitalteli.com> wrote:
> Well... yes and no.
>
> Either way, as far as I can tell, the specification encourages the
> translation of single quotes to "&apos;" in certain
> circumstances.

Basically, yes. The intent is to give programs and humans a free hand
for composing XML documents.
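
To illustrate, here is a small sketch, assuming the JDK's built-in DOM
parser (the class and method names are invented for this example). It
parses three spellings of the same attribute and shows that a conforming
parser reports the same value for each:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class QuoteForms {
    // Parse a one-element document and return the value of its "name" attribute.
    // Checked exceptions are wrapped so the helper is easy to call from anywhere.
    static String nameAttr(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Literal apostrophe inside a double-quoted attribute value.
        String literal = nameAttr("<p name=\"O'Brien\"/>");
        // Entity reference inside a double-quoted value (allowed, not required).
        String entity = nameAttr("<p name=\"O&apos;Brien\"/>");
        // Entity reference inside a single-quoted value (required here).
        String single = nameAttr("<p name='O&apos;Brien'/>");
        System.out.println(literal.equals(entity) && entity.equals(single));
    }
}
```

Whichever form the writer picks, the receiving side sees O'Brien.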

>
> <quote>
> To allow attribute values to contain both single and double quotes, the
> apostrophe or single-quote character (') may be represented as " &apos;
> ", and the double-quote character (") as " &quot; ".
> </quote>
> Reference: http://www.w3.org/TR/REC-xml/#syntax
>
> Based on this, if JAXB is to comply with the W3C rules, it should be
> possible to "translate" the single quote to "&apos;". Don't you think?

No. The quoted paragraph simply says that you can use entities for
both apostrophe and quote, which is obviously necessary for whichever
one you use as the surrounding delimiter. If a program serializing XML
content sticks to the double quote as the attribute value delimiter, it
doesn't have to use &apos;. There is no point in using the entity for the
apostrophe when there is no conflict with XML punctuation.
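
A minimal sketch of that serializer-side reasoning, assuming a writer
that always delimits attribute values with double quotes. The helper
below is hypothetical and is not JAXB's actual escaping code. It also
ignores whitespace characters such as tab and newline, which a real
serializer would write as character references.

```java
public class AttrEscape {
    // Hypothetical helper: escape text for an attribute value that will be
    // written between double quotes. Only characters that conflict with the
    // markup are replaced; the apostrophe passes through untouched.
    static String escapeForDoubleQuoted(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;   // starts a reference
                case '<': out.append("&lt;"); break;    // not allowed in attribute values
                case '"': out.append("&quot;"); break;  // clashes with the delimiter
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("name=\"" + escapeForDoubleQuoted("O'Brien") + "\"");
        System.out.println("name=\"" + escapeForDoubleQuoted("Tom & \"Jerry\"") + "\"");
    }
}
```

The first call leaves O'Brien unchanged; the second yields
Tom &amp; &quot;Jerry&quot;, and both results are well-formed inside
double quotes.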

-W

>
> Thank you for all your help.
>
> Ely.
>
> Wolfgang Laun wrote:
>>
>> I've read through the comprobantes fiscales, and if I'm guessing the
>> Spanish correctly, the interesting paragraph is the one preceding what
>> you quoted:
>>
>> In addition to the structural rules set out in this standard, a
>> taxpayer who opts for this mechanism of generating receipts must comply
>> both with the tax provisions in force and with the technical guidelines
>> on form and syntax for generating XML files specified by the W3
>> consortium, established at www.w3.org.
>>
>> Doesn't this say that you have to follow the rules for XML as defined by
>> W3C?
>>
>> It would appear that the subsequent paragraphs have been inserted in an
>> attempt to guide those who think of hand-crafting their XML output. It
>> cannot seriously be meant to supersede the XML definition.
>>
>> Also, what I can understand of the last paragraph ("Adicionalmente,...SAT.")
>> seems to support my assumption. Anybody who knows how XML representation
>> works wouldn't need any of this.
>>
>> Best
>> Wolfgang
>>
>> PS: If you need a supporting statement from someone with W3C, I know just
>> the right guy.
>>
>> On Mon, Apr 5, 2010 at 11:26 PM, Ely Schoenfeld
>> <ely.sun.1_at_mitalteli.com> wrote:
>>
>>> Hello All.
>>>
>>> At the suggestion of laune at dev.java.net, I'm posting my problem
>>> with JAXB character escapes here.
>>>
>>> I really need help "translating" the single quote character to "&apos;".
>>> But if I define this in a CharacterEscapeHandler, I get "&amp;apos;"
>>> instead.
>>>
>>> I opened issue 741, titled "Characters get escaped twice with
>>> Custom CharacterEscapeHandler and encoding=UTF-8", about this problem.
>>>
>>> It can be found at: https://jaxb.dev.java.net/issues/show_bug.cgi?id=741
>>>
>>> Any help you can provide will be really (REALLY) appreciated.
>>>
>>>
>>> The reason I need to "translate" the single quote to "&apos;" is that
>>> a government agency in Mexico requires it. In case you understand
>>> Spanish, the specification appears on page 6 of:
>>>
>>>
>>> ftp://ftp2.sat.gob.mx/asistencia_servicio_ftp/publicaciones/cfd/Anex20_v20.pdf
>>>
>>>
>>> http://www.sat.gob.mx/sitio_internet/e_sat/comprobantes_fiscales/15_6534.html
>>>
>>> ----------------     BEGIN      ----------------
>>> In particular, care must be taken with the special cases that arise
>>> in the values specified within the attributes of the XML file, such as
>>> those that use the character & , the character " , the character ' ,
>>> the character < and the character > , which require the use of
>>> escape sequences.
>>>
>>> - For & the sequence &amp; must be used
>>> - For " the sequence &quot; must be used
>>> - For < the sequence &lt; must be used
>>> - For > the sequence &gt; must be used
>>> - For ' the sequence &apos; must be used
>>>
>>> Examples:
>>> To represent nombre="Juan & José & "Niño"" one will use nombre="Juan
>>> &amp; José &amp; &quot;Niño&quot;"
>>>
>>> Additionally, it is worth mentioning that although the XML
>>> specification permits the use of escape sequences for handling accented
>>> characters and the character ñ, such escape sequences are not necessary
>>> when the XML document is expressed in the UTF-8 encoding standard,
>>> provided it was created correctly; UTF-8 is the only standard used by
>>> the SAT.
>>> ----------------      END       ----------------
>>>
>>>
>>> Thank you very much in advance.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe_at_jaxb.dev.java.net
>>> For additional commands, e-mail: users-help_at_jaxb.dev.java.net
>>>
>>>
>>
>