dev@fi.java.net

Re: Restricted alphabet support

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Thu, 14 Apr 2005 19:26:27 +0200

Alan Hudson wrote:
> Paul Sandoz wrote:
>
>>Hi,
>>
>>I have added restricted alphabet support. All parsers support decoding
>>and the SAX serializer supports encoding for text content.
>
> I'm forgetting the spec on this one. Is there a filesize savings to
> defining a restricted alphabet? Ie if I say I only have the numbers
> 0-9, does it save each entry in 4 bits?
>

Yes.

There are built-in numeric and date-time restricted alphabets, each of
which contains 15 characters.

A character string containing only characters in the alphabet can be
encoded in 4 bits per character.
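
To illustrate the arithmetic: a 15-character alphabet needs indices 0-14, which fit in 4 bits and leave the value 1111 unused. The sketch below packs two characters per byte on that basis. It is only an illustration, not the code in the FI parsers or serializers: the class name NibblePacker and its methods are hypothetical, the alphabet string "0123456789-+.e " is assumed to be the built-in numeric alphabet, and the use of 1111 to pad an odd-length string is likewise an assumption about the encoding.

```java
public class NibblePacker {
    // Assumed contents of the built-in numeric alphabet (15 characters).
    public static final String NUMERIC = "0123456789-+.e ";

    // Pack each character as a 4-bit index into the alphabet, two per byte.
    public static byte[] pack(String s, String alphabet) {
        if (alphabet.length() > 15) {
            throw new IllegalArgumentException("sketch handles at most 15 characters");
        }
        byte[] out = new byte[(s.length() + 1) / 2];
        for (int i = 0; i < s.length(); i++) {
            int idx = alphabet.indexOf(s.charAt(i));
            if (idx < 0) {
                throw new IllegalArgumentException("character not in alphabet: " + s.charAt(i));
            }
            if ((i & 1) == 0) {
                out[i / 2] = (byte) (idx << 4);   // high nibble
            } else {
                out[i / 2] |= (byte) idx;         // low nibble
            }
        }
        if ((s.length() & 1) == 1) {
            out[out.length - 1] |= 0x0F;          // assumed padding value 1111
        }
        return out;
    }

    // Reverse of pack: stop at the padding nibble, if present.
    public static String unpack(byte[] data, String alphabet) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            int hi = (b >> 4) & 0x0F;
            int lo = b & 0x0F;
            if (hi == 0x0F) break;
            sb.append(alphabet.charAt(hi));
            if (lo == 0x0F) break;
            sb.append(alphabet.charAt(lo));
        }
        return sb.toString();
    }
}
```

With this sketch, the four characters of "2005" occupy two bytes instead of four, and "123" also occupies two bytes, with the last nibble used as padding.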

I have tried to make the 4 bpc implementation for the built-in alphabets
very efficient. The one for application-defined alphabets needs some
improvement but is functional.

Paul.

-- 
| ? + ? = To question
----------------\
   Paul Sandoz
        x38109
+33-4-76188109