Re: Tables and Encoding Algos

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Mon, 28 Feb 2005 09:14:29 +0100

Alan Hudson wrote:
> Looking at the startElement call on SAXDocumentSerializer.java, I notice
> that if a value is added to the table its unencoded form is used.
> Wouldn't it save space to use the algorithm encoded form? I found it
> useful to have attributeValueSizeConstraint sizes up to 32 for X3D
> files. So encoding these strings helped a fair bit.
>
> The code:
>     value = eAtts.getValue(i);
>     if (value != null) {
>         addToTable = (value.length() <
>             _v.attributeValueSizeConstraint) ? true : false;
>         encodeNonIdentifyingStringOnFirstBit(value,
>             _v.attributeValue, addToTable);
>     } else {
>         encodeNonIdentifyingStringOnFirstBit(eAtts.getAlgorithmURI(i),
>             eAtts.getAlgorithmIndex(i),
>             eAtts.getAlgorithmData(i));
>     }
>

'Add to table' is currently only supported for strings and not for
encoding algorithm data.

When adding to the table, it is necessary to determine whether a String
has already been added. The most efficient general way to do this is to
look up the String in its native form. Encoding to UTF-8 and then
comparing the bytes would be slower, because the encoding step would
have to be performed for every string just to ascertain whether it is
already indexed.
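
To illustrate the difference, here is a minimal sketch. It is not the
actual vocabulary table code: the class StringTableSketch and both
obtainIndex* methods are hypothetical names invented for this example.
The native lookup hashes the String directly, while the encoded lookup
has to produce the UTF-8 octets before it can even ask the question.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the real vocabulary table.
public class StringTableSketch {
    // Keyed on the native String: lookup uses String.hashCode()/equals().
    private final Map<String, Integer> byString = new HashMap<>();
    // Keyed on UTF-8 octets, wrapped so that equality is by content.
    private final Map<ByteBuffer, Integer> byOctets = new HashMap<>();

    /** Returns the existing index, or -1 if the value was just added. */
    public int obtainIndexNative(String value) {
        Integer index = byString.get(value);
        if (index != null) {
            return index;
        }
        byString.put(value, byString.size());
        return -1;
    }

    /** Same contract, but every call pays for a UTF-8 encode first. */
    public int obtainIndexEncoded(String value) {
        ByteBuffer key =
            ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
        Integer index = byOctets.get(key);
        if (index != null) {
            return index;
        }
        byOctets.put(key, byOctets.size());
        return -1;
    }
}
```

Both methods give the same answers; the second simply does an encode
and an allocation per lookup, including for values already indexed.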

Mixing generic approaches and specific approaches for adding stuff to
the table is tricky. Maybe the application needs more control over this?

For encoding algorithm data this is trickier: it depends on whether one
is converting from an object or already has the sequence of encoded
octets. For efficiency, it seems the 'add to table' check would need to
work on the encoding algorithm data itself. Then there is the issue of
managing more than one type of data per table.
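
One possible shape for such a table, purely as a sketch: key each entry
on the encoded octets together with a tag saying which kind of data it
is. Everything here is an assumption made for illustration, including
the class name MixedValueTableSketch, the Key type, and the use of an
integer algorithm index as the tag; none of it is existing code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the serializer.
public class MixedValueTableSketch {
    /** Key: a type tag (here an algorithm index) plus the octets. */
    static final class Key {
        final int algorithmIndex;
        final byte[] octets;

        Key(int algorithmIndex, byte[] octets) {
            this.algorithmIndex = algorithmIndex;
            this.octets = octets.clone(); // defensive copy
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return algorithmIndex == k.algorithmIndex
                && Arrays.equals(octets, k.octets);
        }

        @Override
        public int hashCode() {
            return 31 * algorithmIndex + Arrays.hashCode(octets);
        }
    }

    private final Map<Key, Integer> indexes = new HashMap<>();

    /** Returns the existing index, or -1 if the value was just added. */
    public int obtainIndex(int algorithmIndex, byte[] octets) {
        Key key = new Key(algorithmIndex, octets);
        Integer index = indexes.get(key);
        if (index != null) {
            return index;
        }
        indexes.put(key, indexes.size());
        return -1;
    }
}
```

With this shape, identical octets under different tags are separate
entries. When the caller starts from an object rather than from octets,
it still has to encode before it can look anything up.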

Paul.

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109