Options for reporting built-in and application defined algorithms

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Mon, 21 Feb 2005 15:47:18 +0100


There are quite a few options for reporting data for built-in and
application-defined encoding algorithms.

Text content
       Convert to characters
       Array of primitive type
       Object of array of primitive type
       Raw encoded octets

       Convert to characters from EncodingAlgorithm.convertToCharacters
       Object returned from registered encoding algorithm using the
       EncodingAlgorithm.decode method
       Raw encoded octets

Attribute value
       Convert to characters
       Object of primitive type or
       raw encoded octets

       Convert to characters from EncodingAlgorithm.convertToCharacters
       Object returned from registered encoding algorithm using the
       EncodingAlgorithm.decode method or
       raw encoded octets

Currently the choices for the SAX parser implementation are reduced by
the following:

- do not report characters for application defined algorithms

- primitive types are never reported as raw encoded octets

- the registration of handlers specifies the precedence of reporting
   for text content and attribute values.

Then we need to add:

- application-defined data is reported as raw encoded octets unless an
encoding algorithm is registered.

That should cover most use cases, and we can tweak as required for
additional edge cases with further properties.
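
To make the proposed rule concrete, here is a minimal sketch of the
fallback for application-defined data. Everything in it is an assumption
for illustration: the ReportingSketch class, the one-method shape given
to EncodingAlgorithm, the URI-keyed registry and the example URI are
hypothetical and are not the actual parser API. It only shows the
decision "decoded object if an algorithm is registered, raw encoded
octets otherwise".

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these types are the real parser API.
class ReportingSketch {

    /** Stand-in for an application-defined encoding algorithm (assumed shape). */
    interface EncodingAlgorithm {
        Object decode(byte[] octets);
    }

    /** Registered algorithms, keyed by an assumed algorithm URI. */
    private final Map<String, EncodingAlgorithm> registered = new HashMap<>();

    void register(String uri, EncodingAlgorithm algorithm) {
        registered.put(uri, algorithm);
    }

    /**
     * Returns what would be reported for application-defined data:
     * the decoded object if an algorithm is registered for the URI,
     * otherwise the raw encoded octets, unchanged.
     */
    Object report(String uri, byte[] octets) {
        EncodingAlgorithm algorithm = registered.get(uri);
        if (algorithm == null) {
            return octets; // no algorithm registered: raw encoded octets
        }
        return algorithm.decode(octets);
    }

    public static void main(String[] args) {
        ReportingSketch parser = new ReportingSketch();
        String uri = "urn:example:int-as-text"; // made-up URI
        byte[] octets = "42".getBytes(StandardCharsets.US_ASCII);

        // Nothing registered yet, so the raw octets come back.
        System.out.println(parser.report(uri, octets) instanceof byte[]); // true

        // After registration the decoded object is reported instead.
        parser.register(uri,
            o -> Integer.valueOf(new String(o, StandardCharsets.US_ASCII)));
        System.out.println(parser.report(uri, octets)); // 42
    }
}
```

Characters are deliberately never produced here, in line with not
reporting characters for application-defined algorithms. Handler
precedence for text content versus attribute values is left out of the
sketch.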


    Paul Sandoz