jsr344-experts@javaserverfaces-spec-public.java.net

[jsr344-experts] facelets-processing jspx/xml CDATA handling

From: Andy Schwartz <andy.schwartz_at_oracle.com>
Date: Fri, 06 Jan 2012 10:52:35 -0500

Gang -

I have finally reached the point where I can cut some of my JSF
2.0/JSP-based applications over to JSF 2.1/Facelets. Unfortunately,
while doing this I have run into a problem with how CDATA sections are
handled when the facelets processing mode is set to "jspx" (or "xml").
Since the problem shows up in both Mojarra and MyFaces and might require
a spec clarification, I figured it would be best to raise this issue here.

For folks who weren't on the EG/observers lists during the 2.1
timeframe, background on this topic can be found in the jsr-314-open
archives. The "[jsf2next] might as well face it, Facelets is XML" thread
[1] provides a good overview of the requirements.

The particular case that I am having problems with is described in the
same thread [2]. Long but hopefully relevant snippet:


> This approach would be perfect for my use case. One of the reasons why
> I am interested in pushing this XML view of the world is because I am
> already using a technology that supports this: JSP documents - ie.
> XML-based JSP pages. (Funny, yes, there is something about JSP that I
> actually like more than Facelets!) In our case we have a tremendous
> amount of XML-based JSF content sitting around in .jspx files. (We also
> use the extension .jsff for "JSF fragments" - ie. files that specify
> component subtrees rather than complete pages.) For the most part we
> can simply run these files via the Facelets engine. (Yet another reason
> why I love Facelets.) In JSP document land, XML instructions (the XML
> declaration, CDATA sections) are treated as instructions to the JSP
> engine itself.
>
> This means that I can have the following .jspx file:
>
> <?xml version='1.0' encoding='utf-8'?>
> <f:view>
> <tr:document>
> <trh:script>
> <![CDATA[
> if (0 < 1) alert("w00t!");
> ]]>
> </trh:script>
> </tr:document>
> </f:view>
>
>
> And not have to worry about the XML declaration or CDATA section wrapper
> being sent to the browser. Instead, my render kit can decide to render
> an XML declaration or not (depending on whether we are generating XHTML
> or HTML), and the trh:script renderer can decide how to deal with its
> body. (The HTML <script> element is implicitly CDATA, so the script
> content can be passed directly through.)
>
> As we've been discussing, Facelets will pass this XML-specific content
> directly on to the browser, which is invalid in the case where we are
> rendering traditional HTML.


Taking a closer look at the code sample:

> <trh:script>
> <![CDATA[
> if (0 < 1) alert("w00t!");
> ]]>
> </trh:script>


The CDATA section is used to prevent the JSP engine's XML parser from
choking on the '<' character in the "if" expression - ie. we don't want
the XML parser to treat the script as parsed character data, but rather
to suck in the entire block of text and resume parsing once the CDATA
closing construct ("]]>") is encountered.
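
To make the parser behavior concrete, here is a minimal sketch using the JDK's stock DOM parser rather than any JSP or Facelets engine. The class and method names (CdataParseDemo, scriptBody) are made up for illustration. It shows that the bare '<' is rejected as malformed markup, while the CDATA-wrapped version parses and yields only the script text, with no wrappers:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of any JSF implementation.
public class CdataParseDemo {

    /** Returns the root element's text content, or null if the input is not well-formed XML. */
    public static String scriptBody(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null; // parse failure, e.g. a bare '<' in character data
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String bare = "<script>if (0 < 1) alert(\"w00t!\");</script>";
        String wrapped = "<script><![CDATA[if (0 < 1) alert(\"w00t!\");]]></script>";
        System.out.println(scriptBody(bare));    // prints: null
        System.out.println(scriptBody(wrapped)); // prints: if (0 < 1) alert("w00t!");
    }
}
```

Note that the parser hands the application the CDATA content as ordinary text; whether the wrappers reappear in the output is entirely up to whoever writes that text back out.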

This works just fine in JSP documents, which produce the following
rendered output:


> <script type="text/javascript">
> if (0 < 1) alert("w00t!");
> </script>


But fails under the legacy Facelets/xhtml processing behavior since this
mode assumes that the output document is also xhtml and passes the CDATA
section wrappers through, resulting in:

> <script type="text/javascript">
> <![CDATA[
> if (0 < 1) alert("w00t!");
> ]]>
> </script>


If the output document is text/html rather than xhtml, the browser will
choke on the CDATA section (which is not valid in text/html documents).

Fortunately, JSF 2.1 provides a solution. Enabling jspx-compatible
processing, eg:


> <facelets-processing>
> <file-extension>.jspx</file-extension>
> <process-as>jspx</process-as>
> </facelets-processing>


Should provide the desired behavior - ie. should prevent the CDATA
section wrappers from being passed along to the browser. As such, we
should see:


> <script type="text/javascript">
> if (0 < 1) alert("w00t!");
> </script>


Instead of:

> <script type="text/javascript">
> <![CDATA[
> if (0 < 1) alert("w00t!");
> ]]>
> </script>


In the rendered document.

Unfortunately, instead of consuming the CDATA section start/end
wrappers, both Mojarra and MyFaces consume the entire CDATA block,
resulting in the following rendered content:


> <script type="text/javascript">
> </script>


This, of course, is not the desired behavior. We've gone from one
failure (invalid CDATA section wrappers) to a different failure (missing
script content). Either way, we've got a non-functional page.
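
The three outcomes discussed above can be reproduced side by side with a small SAX sketch. This is not how Mojarra or MyFaces are implemented; it is only an illustration built on the JDK's SAX parser and its LexicalHandler callbacks, and the names CdataModes, Mode, and render are hypothetical. PASS_THROUGH corresponds to the legacy xhtml behavior, STRIP_WRAPPERS to the behavior I expected from jspx mode, and DROP_BLOCK to what both implementations currently do:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class; not taken from Mojarra or MyFaces.
public class CdataModes {

    public enum Mode { PASS_THROUGH, STRIP_WRAPPERS, DROP_BLOCK }

    /** Re-serializes a tiny XML document, handling CDATA sections per the given mode. */
    public static String render(String xml, Mode mode) {
        StringBuilder out = new StringBuilder();
        DefaultHandler2 handler = new DefaultHandler2() {
            private boolean inCdata;

            @Override public void startElement(String uri, String local, String qName, Attributes atts) {
                out.append('<').append(qName).append('>'); // attributes omitted for brevity
            }
            @Override public void endElement(String uri, String local, String qName) {
                out.append("</").append(qName).append('>');
            }
            @Override public void startCDATA() {
                inCdata = true;
                if (mode == Mode.PASS_THROUGH) out.append("<![CDATA[");
            }
            @Override public void endCDATA() {
                inCdata = false;
                if (mode == Mode.PASS_THROUGH) out.append("]]>");
            }
            @Override public void characters(char[] ch, int start, int length) {
                if (inCdata && mode == Mode.DROP_BLOCK) return; // swallow the whole block
                out.append(ch, start, length); // text written as-is; escaping omitted in this sketch
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // CDATA boundaries are only reported through the lexical-handler property.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<script><![CDATA[if (0 < 1) alert(\"w00t!\");]]></script>";
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + render(xml, mode));
        }
    }
}
```

The only difference between the expected behavior and the observed one is the early return in characters(): dropping just the startCDATA/endCDATA markers keeps the script body, while also dropping the characters reported between them leaves an empty script element.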

So… what to do about this?

My take is that the current behavior is sufficiently broken that both
Mojarra and MyFaces should fix this as soon as possible (ie. in a 2.1.x
release). To avoid further confusion, we should add language to the spec
that clarifies how CDATA sections are handled (for the 2.2 spec).

I'll log implementation + spec issues for this, but wanted to raise this
here first in case folks have comments/questions.

Thoughts?

Andy

[1] http://lists.jboss.org/pipermail/jsr-314-open-mirror/2009-December/thread.html#1930
[2] http://lists.jboss.org/pipermail/jsr-314-open-mirror/2009-December/001999.html