dev@fi.java.net

Re: Tables and Encoding Algos

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Mon, 28 Feb 2005 19:11:39 +0100

Alan Hudson wrote:
> Paul Sandoz wrote:
>
>> 'Add to table' is currently only supported for strings and not for
>> encoding algorithm data.
>>
>> When adding to the table it is necessary to determine whether a String
>> has already been added. The most efficient general way is to use the
>> String in the native form. Encoding to UTF-8 and then checking the
>> bytes would be slower as this would have to be performed for each
>> string to ascertain whether it is already indexed.
>>
> Let me make sure I understand. FI allows items added to a table to be
> sent encoded instead of as text, but the current implementation is only
> sending it as text? If so, that's good.
>

Yes.


> Perhaps maintain a quick lookup with the string, but send the encoded pattern.
>

Yes, that would work. To do this properly we need to modify the parser's
table array so that it can hold either strings or other objects.
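
A rough sketch of what I mean, not what the implementation does today.
All class and method names below (EncodedStringTable, ParserTable, add,
indexOf, encoded) are made up for illustration, and UTF-8 is assumed as
the encoding. The serializer side checks membership on the native
String and encodes only on the first add; the parser side keeps an
Object[] so an entry can be a String or something else, such as a
byte[].

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical serializer-side table: lookup by String, send encoded octets.
final class EncodedStringTable {
    // Membership check is done on the native String form.
    private final Map<String, Integer> indexByString = new HashMap<>();
    // Entry i holds the octets for index i, encoded once on first add.
    private final List<byte[]> encodedByIndex = new ArrayList<>();

    /** Returns the index of s, or -1 if s has not been added. */
    int indexOf(String s) {
        Integer i = indexByString.get(s);
        return i == null ? -1 : i;
    }

    /** Adds s if absent and returns its index; no re-encoding on repeats. */
    int add(String s) {
        Integer existing = indexByString.get(s);
        if (existing != null) {
            return existing;
        }
        int index = encodedByIndex.size();
        encodedByIndex.add(s.getBytes(StandardCharsets.UTF_8)); // assumed encoding
        indexByString.put(s, index);
        return index;
    }

    /** Returns the octets stored for the given index. */
    byte[] encoded(int index) {
        return encodedByIndex.get(index);
    }
}

// Hypothetical parser-side table: entries may be Strings or other objects.
final class ParserTable {
    private Object[] entries = new Object[8];
    private int size;

    /** Appends a value (String, byte[], ...) and returns its index. */
    int add(Object value) {
        if (size == entries.length) {
            entries = Arrays.copyOf(entries, size * 2); // grow when full
        }
        entries[size] = value;
        return size++;
    }

    /** Returns the entry at index; the caller decides how to interpret it. */
    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return entries[index];
    }
}
```

The cost is holding both the String and its octets for every entry on
the serializer side, and the parser has to check the type of each entry
it reads back.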


>> Mixing generic approaches and specific approaches for adding stuff to
>> the table is tricky. Maybe the application needs more control over this?
>>
>>
>> For encoding algorithm data this is trickier; it depends on whether
>> one is converting from an object or already has the sequence of
>> encoded octets. Seems like for efficiency the 'add to table' checking
>> for encoding algorithm data would need to be used. Then there is the
>> issue of managing more than one type of data per table.
>>
> true... I'm still working through a few bugs. Let's table this for a
> moment and I'll come back to it when I get around to parsing speed time
> trials.
>

OK.

Paul.

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109