Full XML compliance

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Wed, 27 Apr 2005 16:00:17 +0200

Hi,

The last commit (about an hour ago; for some reason not all commits get
sent to the CVS alias) completed support for full XML compliance in the
FI parsers. This includes:

- XML character validation
- XML namespace checking
- Duplicate namespace attribute checking
- In-scope validation of EIIs and AIIs
- Duplicate attribute checking
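
To illustrate the first item, here is a minimal sketch of XML character
validation against the Char production of XML 1.0. The class and method
names (XmlCharCheck, isXmlChar, firstIllegalChar) are made up for this
example and are not taken from the FI code base.

```java
// Sketch only: names are illustrative, not the FI parser's API.
final class XmlCharCheck {
    private XmlCharCheck() {}

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Returns the UTF-16 index of the first illegal character,
     * or -1 if every character is legal. Unpaired surrogates are
     * reported as illegal because they fall outside the Char ranges.
     */
    static int firstIllegalChar(CharSequence text) {
        int i = 0;
        while (i < text.length()) {
            int cp = Character.codePointAt(text, i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A parser would run a check like this over decoded character content and
attribute values and raise an error at the reported index.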

None of the infoset-related binary XML implementations that I am aware
of do this, and the lack of such checking is one of the major criticisms
I hear about binary XML solutions.

With all these features added the performance has dropped, but I have
tried to counteract this with additional performance enhancements
(reducing method calls and field access). Performance results show that
I have made up ground, but the parser is probably still about 5%-15%
slower than before.

We can also add properties to switch off some forms of checking
(duplicate namespace attributes, in-scope validation, and duplicate
attributes) if people want more speed (I think some XML implementations
may also sacrifice full XML conformance for speed). Duplicate attribute
checking appears to be the most expensive feature, even though the
implementation is very efficient.
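
As a rough idea of what such properties could look like, here is a
sketch. Nothing here exists in the FI parsers yet; the names Check and
CheckOptions are hypothetical, and the only assumption carried over from
the above is that all checks are on by default so that full conformance
is what you get unless you opt out.

```java
import java.util.EnumSet;

// Hypothetical names: these are NOT existing FI parser properties.
enum Check {
    DUPLICATE_NAMESPACE_ATTRIBUTES,
    IN_SCOPE_VALIDATION,
    DUPLICATE_ATTRIBUTES
}

final class CheckOptions {
    // Everything is enabled by default, i.e. full XML conformance.
    private final EnumSet<Check> enabled = EnumSet.allOf(Check.class);

    /** Switches one check off and returns this object for chaining. */
    CheckOptions disable(Check check) {
        enabled.remove(check);
        return this;
    }

    /** Switches one check back on and returns this object for chaining. */
    CheckOptions enable(Check check) {
        enabled.add(check);
        return this;
    }

    boolean isEnabled(Check check) {
        return enabled.contains(check);
    }
}
```

A parser would consult isEnabled(...) once per document (not per
attribute) and pick the checked or unchecked code path accordingly, so
the switch itself costs nothing in the inner loops.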

I found that the last 10% to 15% of the work was the most difficult and
it does complicate the implementation.

The run-time memory use of the parser will be higher, but it should
remain fairly static for non-infoset related information (as was the
case before these features were added). I have used the indexing feature
to our advantage for efficient in-scope and duplicate attribute
checking.
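
To give an idea of how indexing can help here, below is a sketch of a
duplicate attribute check keyed on the integer index of each attribute
name. It is an illustration of the general technique, not the code that
was committed. The class name DuplicateAttributeCheck is made up, and it
assumes every qualified name maps to a small non-negative index.

```java
import java.util.Arrays;

// Sketch only: assumes each attribute's qualified name has a
// non-negative integer index in the vocabulary table.
final class DuplicateAttributeCheck {
    // lastSeen[i] holds the stamp of the element on which name index i
    // was last seen; 0 means "never seen".
    private int[] lastSeen = new int[16];
    private int currentElement = 0;

    /** Must be called before the attributes of each element are added. */
    void startElement() {
        if (++currentElement == Integer.MAX_VALUE) {
            // Stamp is about to overflow: clear the table and start over.
            Arrays.fill(lastSeen, 0);
            currentElement = 1;
        }
    }

    /** Returns false if nameIndex was already added for this element. */
    boolean add(int nameIndex) {
        if (nameIndex >= lastSeen.length) {
            int newLength = Math.max(nameIndex + 1, lastSeen.length * 2);
            lastSeen = Arrays.copyOf(lastSeen, newLength);
        }
        if (lastSeen[nameIndex] == currentElement) {
            return false; // duplicate attribute on the current element
        }
        lastSeen[nameIndex] = currentElement;
        return true;
    }
}
```

The point of the stamp is that nothing has to be cleared between
elements: bumping one counter invalidates all earlier entries, and the
array only grows when a new, larger index turns up, so memory use stays
fairly static once the vocabulary has settled.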

Paul.

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109