Re: Bugs in Decoder.java

From: Andrzej Gladkowski <agladkowski_at_gmail.com>
Date: Mon, 2 Nov 2009 13:16:19 +0000

> Here is the list of things we found so far:
>>
>> ---------------------------------------------
>> Decoder.java
>>
>> Wrong:
>> ...
>> private void decodeTableItems(StringArray array) throws
>> FastInfosetException, IOException {
>> for (int i = 0; i < decodeNumberOfItemsOfSequence(); i++) {
>> array.add(decodeNonEmptyOctetStringOnSecondBitAsUtf8String());
>> }
>> }
>> ...
>> Correct: (See point C.2.5.2 in
>> http://www.itu.int/rec/T-REC-X.891-200505-I)
>>
>> private void decodeTableItems(StringArray array) throws
>> FastInfosetException, IOException {
>> int noOfNamespaces = decodeNumberOfItemsOfSequence();
>> for (int i = 0; i < noOfNamespaces; i++) {
>> array.add(decodeNonEmptyOctetStringOnSecondBitAsUtf8String());
>> }
>> }
>>
> We believe that we found a number of bugs in the Decoder implementation
> testing it with fastinfoset-test.fi (see attachment).
> Please correct me, if I'm wrong, in C.2.5.2 I see "If the optional
> component external-vocabulary of initial-vocabulary is present, then the bit
> '0' (padding) is appended to the bit stream and the component is encoded as
> described in C.22.".
> The fix you propose is good, but as I understand it, it is about reading
> noOfNamespaces just once. I am not sure I understand the direct
> relationship to C.2.5.2.
>
> Yes, my fix is all about getting noOfNamespaces only once.
The same bug is present in the other overloaded decodeTableItems(...)
methods.
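
To illustrate why the loop bound has to be read only once, here is a minimal
sketch. It is not the real Decoder.java: the class LoopBoundSketch and the
helpers readCount() and readItem() are hypothetical stand-ins that each
consume a single octet, which is a simplification of what the real decode
methods do.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

public class LoopBoundSketch {
    // Hypothetical stand-in for decodeNumberOfItemsOfSequence():
    // every call consumes one octet from the stream.
    static int readCount(ByteArrayInputStream in) {
        return in.read();
    }

    // Hypothetical stand-in for decoding one table item (one octet here).
    static int readItem(ByteArrayInputStream in) {
        return in.read();
    }

    // Buggy shape: the loop condition re-reads the count on every iteration,
    // so octets that belong to items are consumed as if they were counts.
    static List<Integer> decodeBuggy(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < readCount(in); i++) {
            items.add(readItem(in));
        }
        return items;
    }

    // Fixed shape: the count is read once, before the loop starts.
    static List<Integer> decodeFixed(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        List<Integer> items = new ArrayList<>();
        int count = readCount(in);
        for (int i = 0; i < count; i++) {
            items.add(readItem(in));
        }
        return items;
    }
}
```

With the input octets {3, 10, 20, 30}, decodeFixed returns [10, 20, 30],
while decodeBuggy returns [10, 30] because the octet 20 is consumed as a
count by the second evaluation of the loop condition.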

>
> Wrong:
>> ...
>> private int decodeNumberOfItemsOfSequence() throws IOException {
>> final int b = read();
>> if (b < 128) {
>> return b;
>> } else {
>> return ((b & 0x0F) << 16) | (read() << 8) | read();
>> }
>> }
>> ...
>> Correct -> (See point C.2.1 in
>> http://www.itu.int/rec/T-REC-X.891-200505-I)
>> ...
>> private int decodeNumberOfItemsOfSequence() throws IOException {
>> final int b = read();
>> if (b < 128) {
>> return b + 1;
>> } else {
>> return (((b & 0x0F) << 16) | (read() << 8) | read()) + 129;
>> }
>> }
>>
> Andrzej, could you please elaborate on which decoding algorithm you are
> referring to?
>
>
C.21 Encoding of the length of a sequence-of type
C.21.1 This subclause is invoked to encode the length of a sequence-of type
that is encoded with a length field
preceding the items of the sequence-of type.
NOTE – This encoding always starts on the first bit of an octet and ends on
the eighth bit of the same or another octet.
C.21.2 If the value is in the range 1 to 128, then the bit '0' is appended
to the bit stream, and the value, minus the
lower bound of the range, is encoded as an unsigned integer in a field of
seven bits and appended.

# To decode the length of a sequence in the range 1 to 128, we should do
the following:
# example: the value in the stream is x = 50 (read from the 7 bits taken
from a single call to read())
# it should be decoded as 50 + 1 (the lower bound of the range) = 51

C.21.3 If the value is in the range 129 to 2 power 20, the bit '1' and the
three bits '000' (padding) are appended to the bit
stream, and the value, minus the lower bound of the range, is encoded as an
unsigned integer in a field of twenty bits
and appended.

# To decode the length of a sequence in the range 129 to 2 power 20, we
should do the following:
# example: the value in the stream is x = 200 (read from the twenty bits
taken from three calls to read())
# it should be decoded as 200 + 129 (the lower bound of the range) = 329
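
The two worked examples above can be checked with a small standalone sketch.
It follows the corrected decodeNumberOfItemsOfSequence() from earlier in this
message, but the class name SequenceLengthDecodeSketch and the use of a
ByteArrayInputStream in place of the Decoder's own read() are assumptions
made only for illustration.

```java
import java.io.ByteArrayInputStream;

public class SequenceLengthDecodeSketch {
    // Decodes a sequence-of length as described in C.21.2 and C.21.3.
    // End-of-stream handling is left out to keep the sketch short.
    static int decodeLength(ByteArrayInputStream in) {
        final int b = in.read();
        if (b < 128) {
            // C.21.2: leading bit '0', then 7 bits holding (length - 1).
            return b + 1;
        }
        // C.21.3: bit '1', three padding bits, then 20 bits holding
        // (length - 129), spread over the low 4 bits of this octet
        // and the two octets that follow.
        return (((b & 0x0F) << 16) | (in.read() << 8) | in.read()) + 129;
    }

    // Convenience overload for decoding from a byte array.
    static int decodeLength(byte[] octets) {
        return decodeLength(new ByteArrayInputStream(octets));
    }
}
```

The single octet 0x32 (decimal 50) decodes to 51, and the three octets
0x80 0x00 0xC8 (a twenty-bit field holding 200) decode to 329, matching the
examples above.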

# Fixing these bugs in Decoder.java means that Encoder.java has to be fixed
as well.
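
For reference, the encoder side of C.21.2 and C.21.3 could look roughly like
the sketch below. This is not the current Encoder.java code: the class
SequenceLengthEncodeSketch and the method encodeLength() are hypothetical
names, and the output goes to a ByteArrayOutputStream only to keep the
example self-contained.

```java
import java.io.ByteArrayOutputStream;

public class SequenceLengthEncodeSketch {
    // Encodes a sequence-of length as described in C.21.2 and C.21.3.
    static byte[] encodeLength(int length) {
        if (length < 1 || length > (1 << 20)) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (length <= 128) {
            // C.21.2: bit '0' followed by (length - 1) in 7 bits.
            out.write(length - 1);
        } else {
            // C.21.3: bit '1', padding '000', then (length - 129) in 20 bits.
            final int v = length - 129;
            out.write(0x80 | ((v >> 16) & 0x0F));
            out.write((v >> 8) & 0xFF);
            out.write(v & 0xFF);
        }
        return out.toByteArray();
    }
}
```

Under these assumptions, encodeLength(51) produces the single octet 0x32 and
encodeLength(329) produces 0x80 0x00 0xC8, which are the inputs that the
corrected decoder turns back into 51 and 329.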