Re: Tasks

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Thu, 24 Mar 2005 12:52:29 +0100

Hi,

There is still a fair bit to do for a complete 1.0 release. Below are
the core tasks we need to complete.

Ah, there are a couple more testing and building tasks now that the
encoding is complete:

- Create regular weekly builds of FI.

- Commit a set of XML documents and equivalent fast infoset documents.
These documents will be used for testing round-trip consistency as well
as encoding consistency. Now that the encoding is stable we can commit
such fast infoset documents and they should be a useful resource to
other implementors. I am in two minds about whether it is necessary to
canonically order the namespace attributes and attributes from parsing
the XML documents before encoding them in the fast infoset documents.
This would shield us from any changes to the implementation of Xerces
resulting in a different order for the namespacePrefix events and the
order of attributes returned by org.xml.sax.Attributes.
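
A minimal sketch of what such canonical ordering could look like for
attributes, assuming namespace-aware parsing and a sort by namespace URI
and then local name. The class name CanonicalAttributes and the choice of
sort key are illustrative assumptions, not part of the FI code base. The
prefix mappings would need a similar buffer-and-sort step, which is not
shown here.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper: not an existing FI or Xerces class.
public class CanonicalAttributes {

    /**
     * Returns a copy of the given attributes sorted by namespace URI and
     * then by local name. The input object is not modified.
     * Assumes a namespace-aware parser, so URI and local name are set.
     */
    public static Attributes sort(Attributes in) {
        // Sort indexes rather than values so each attribute's fields stay together.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < in.getLength(); i++) {
            order.add(i);
        }
        order.sort((a, b) -> {
            int c = in.getURI(a).compareTo(in.getURI(b));
            return c != 0 ? c : in.getLocalName(a).compareTo(in.getLocalName(b));
        });

        // Copy the attributes into a new list in the sorted order.
        AttributesImpl out = new AttributesImpl();
        for (int i : order) {
            out.addAttribute(in.getURI(i), in.getLocalName(i),
                    in.getQName(i), in.getType(i), in.getValue(i));
        }
        return out;
    }
}
```

A test harness could call CanonicalAttributes.sort(atts) inside
startElement before handing the attributes to the serializer, so that the
committed FI documents do not depend on the order the parser reports.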

I need help!

Alan, would you be able to implement the remaining built-in encoding
algorithms and ensure that characters of the built-in encoding
algorithms can be returned when reporting of binary data is not enabled
by the client?

I can tackle the rest of the implementation tasks.

Joe, can you collect together a set of XML documents? We should include
all the documents from the XBC test corpus, and we also need XML documents
containing characters in Hebrew, Arabic, Japanese, Korean etc. I think
the well-formed files in the XML Conformance Test Suites [1] are worth a
look because they will contain edge cases we may have missed; they also
contain Japanese documents.

It is probably best to provide a zip file with all XML documents and all
FI documents that pass the round trip tests for the XML documents,
rather than individually committing each file.

Paul.

[1] http://www.w3.org/XML/Test/

Paul Sandoz wrote:
> Tasks
> -----
>
> - Use of copy Xerces classes for checking XML character conformance.
> If there is a dependency on Xerces already when FI is used with an
> application then we can provide a special build process to reuse the
> Xerces class already present.
>
> - In-scope checking of qualified names and duplicate namespace
> attributes and attributes.
>
> - UTF-16 encoding for SAX serializing and UTF-16 decoding for all
> parsers.
>
> - Restricted alphabet encoding for SAX serializing and decoding for all
> parsers.
>
> - Built-in encoding algorithm support for StAX and DOM parsers
> (returning characters).
>
> - Implement remaining built-in encoding algorithms.
>
> - CDATA support. This is related to support for the CDATA built-in
> encoding algorithm support but requires some additional infrastructure
> for the SAX serializer.
>
> - I18N. Need to add localization support for strings associated with
> Exceptions.
>
> - Get performance results for using some built-in algorithms and
> restricted alphabets (this is required for XTech and JavaOne papers).
> We should publish such results on Java.Net and provide a link to them
> in the papers.
>
>
> Nice to have tasks
> ------------------
>
> - Implement vocabulary API
>

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109