Re: Bugs in Decoder.java

From: Andrzej Gladkowski <agladkowski_at_gmail.com>
Date: Wed, 4 Nov 2009 17:03:58 +0000

On Wed, Nov 4, 2009 at 3:16 PM, Arman Djusupov <arman_at_noemax.com> wrote:

> Hi Andrzej,
>
> I have fixed decodeIntegerIndexOnSecondBit() in the following manner (note
> that this is C# code):
> /*
> * C.25
> */
> protected int decodeIntegerIndexOnSecondBit()
> {
> int b = read() | 0x80;
>
> switch (DecoderStateTables.ISTRING[b])
> {
> case DecoderStateTables.ISTRING_INDEX_SMALL:
> return b & EncodingConstants.INTEGER_2ND_BIT_SMALL_MASK;
> case DecoderStateTables.ISTRING_INDEX_MEDIUM:
> return (((b &
> EncodingConstants.INTEGER_2ND_BIT_MEDIUM_MASK) << 8) | read()) +
> EncodingConstants.INTEGER_2ND_BIT_SMALL_LIMIT;
> case DecoderStateTables.ISTRING_INDEX_LARGE:
> return (((b &
> EncodingConstants.INTEGER_2ND_BIT_LARGE_MASK) << 16) | (read() << 8) |
> read()) + EncodingConstants.INTEGER_2ND_BIT_MEDIUM_LIMIT;
> case DecoderStateTables.ISTRING_SMALL_LENGTH:
> case DecoderStateTables.ISTRING_MEDIUM_LENGTH:
> case DecoderStateTables.ISTRING_LARGE_LENGTH:
> default:
> throw new
> FastInfosetException(Strings.message_decodingIndexOnSecondBit);
> }
> }
>
> So now I can successfully read the Initial Vocabulary encoded at the
> beginning of your document. But your document seems to end with a single
> terminator right after the Initial Vocabulary encoding. So there is
> no root element there?
>
>

# No, there is no root element. This is a purely external vocabulary, kept in a
separate file (point 7.2.13 in the specification).
# That external file is used when decoding the actual document.
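
# For comparison with the C# fix quoted above, here is a minimal Java sketch of
# the same idea written without the state tables. It is only an illustration:
# the class name, the constants and the byte-array signature are hypothetical
# and are not part of Decoder.java. It assumes 0-based indexes (lower
# boundaries 64 and 8256) and masks out the first bit instead of forcing it to 1.

```java
/** Hypothetical helper for illustration; not part of the FastInfoset code base. */
final class SecondBitIndexDecoderSketch {
    // Lower boundaries of the medium and large ranges for 0-based indexes.
    static final int SMALL_LIMIT = 64;
    static final int MEDIUM_LIMIT = 8256;

    /** Decodes a C.25 integer that starts on the second bit of buf[offset]. */
    static int decode(byte[] buf, int offset) {
        // Drop the first bit so that padding '0' and discriminant '1' decode alike.
        int b = buf[offset] & 0x7F;
        if ((b & 0x40) == 0) {
            // 0xxxxxx: 6-bit value, range [0, 63]
            return b;
        }
        if ((b & 0x20) == 0) {
            // 10xxxxx plus one octet: 13-bit value, range [64, 8255]
            return (((b & 0x1F) << 8) | (buf[offset + 1] & 0xFF)) + SMALL_LIMIT;
        }
        if ((b & 0x10) == 0) {
            // 110xxxx plus two octets: 20-bit value, range starts at 8256
            return (((b & 0x0F) << 16)
                    | ((buf[offset + 1] & 0xFF) << 8)
                    | (buf[offset + 2] & 0xFF)) + MEDIUM_LIMIT;
        }
        // Assumption: any other bit pattern is not an integer index.
        throw new IllegalArgumentException("not an integer index on the second bit");
    }

    public static void main(String[] args) {
        // The same index encoded with first bit '0' and with first bit '1'.
        System.out.println(decode(new byte[] {0x41, 0x01}, 0));
        System.out.println(decode(new byte[] {(byte) 0xC1, 0x01}, 0));
    }
}
```

# Both calls in main() should give the same index, because only the first bit
# of the first octet differs.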

> With best regards,
> Arman
>
>
> Andrzej Gladkowski wrote:
>
>> On Tue, Nov 3, 2009 at 3:26 PM, Arman Djusupov <arman_at_noemax.com> wrote:
>>
>> Hello Andrzej,
>>>
>>> It seems that the decodeNumberOfItemsOfSequence() method indeed has a
>>> problem, since it doesn't add the lower boundary of the range after
>>> reading the value.
>>>
>>> But why do you think that C.25.2 is implemented in the wrong way?
>>>
>>> As far as I can see, the C.25.2 implementation is correct. It doesn't add +1
>>> when reading the 1-64 value range, because in the Java implementation the
>>> vocabulary tables are 0-based, so adding and subtracting 1 while
>>> reading/writing is not necessary. The same applies to the other ranges: it
>>> adds 64 instead of 65 as the lower boundary for medium-range values and 8256
>>> instead of 8257 for high-range values.
>>> With best regards,
>>> Arman
>>>
>>>
>>> #Yes, I agree that subtracting 1 while reading/writing is not necessary.
>> # No, I think the whole C.25 is implemented correctly, it's about
>> something
>> else. Please read further comments.
>>
>> # I think the confusion is caused by the following points:
>>
>> C.13.4 If the alternative string-index is present, then the bit '1'
>> (discriminant) is appended to the bit stream, and
>> the string-index is encoded as described in C.25
>> C.16.5 If the optional component prefix-string-index is present, then the
>> bit '0' (padding) is appended to the bit
>> stream, and the component is encoded as described in C.25.
>> C.16.6 If the optional component namespace-name-string-index is present,
>> then the bit '0' (padding) is appended
>> to the bit stream, and the component is encoded as described in C.25.
>> C.16.7 The bit '0' (padding) is appended to the bit stream, and the
>> component local-name-string-index is
>> encoded as described in C.25.
>>
>> # The function 'decodeNumberOfItemsOfSequence(..)' correctly implements
>> point *C.13.4*, when the octet starts with '1',
>> # but it fails if the octet starts with '0'!
>> # I have looked into Encoder.java and there are two separate encoding
>> methods:
>> - Encoder.encodeNonZeroIntegerOnSecondBitFirstBitZero(..)
>> - Encoder.encodeNonZeroIntegerOnSecondBitFirstBitOne(..)
>> # We could fix Decoder.decodeNumberOfItemsOfSequence(..) by adding another
>> method to handle octets starting with '0' or simply by ignoring the first
>> bit in the octet all the time.
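
# To make the first-bit point concrete, here is a minimal Java sketch of the
# encoding side. It assumes that the two encoder variants differ only in the
# first bit of the first octet, as C.13.4 and C.16.5-7 suggest. The class and
# method names are hypothetical and this is not the code from Encoder.java.
# Indexes are 0-based, so the lower boundaries are 64 and 8256.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper for illustration; not the code from Encoder.java. */
final class SecondBitIndexEncoderSketch {
    /** Encodes a 0-based index per C.25, with the first bit set or cleared. */
    static byte[] encode(int index, boolean firstBitOne) {
        // 0x80 for the discriminant '1' (C.13.4), 0x00 for padding '0' (C.16.5-7).
        int first = firstBitOne ? 0x80 : 0x00;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (index < 0 || index >= 1048576) {
            throw new IllegalArgumentException("index out of range: " + index);
        } else if (index < 64) {
            // 0xxxxxx: 6-bit value
            out.write(first | index);
        } else if (index < 8256) {
            // 10xxxxx plus one octet: 13-bit value, offset by 64
            int v = index - 64;
            out.write(first | 0x40 | (v >> 8));
            out.write(v & 0xFF);
        } else {
            // 110xxxx plus two octets: 20-bit value, offset by 8256
            int v = index - 8256;
            out.write(first | 0x60 | (v >> 16));
            out.write((v >> 8) & 0xFF);
            out.write(v & 0xFF);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Print the first octet of the same index for both first-bit settings.
        System.out.println(Integer.toHexString(encode(321, false)[0] & 0xFF));
        System.out.println(Integer.toHexString(encode(321, true)[0] & 0xFF));
    }
}
```

# A decoder that looks only at bits 2 to 8 of the first octet would read both
# outputs back as the same index.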
>>
>> # Here is the junit test that can be used to test both scenarios (by
>> uncommenting the right FIRST_BIT constant in the code):
>> =========================================================================
>> import java.io.IOException;
>> import org.jvnet.fastinfoset.FastInfosetException;
>> import com.sun.xml.fastinfoset.Decoder;
>> import junit.framework.TestCase;
>>
>> public class DecoderTest extends TestCase {
>> /* Uncomment the right section to test first bit in the octet '1' or
>> '0'
>> */
>> //private static final byte FIRST_BIT = (byte)0x80; // 1000 0000, C.13.4
>> private static final byte FIRST_BIT = (byte)0x00; // 0000 0000, C.16.5-7
>>
>> private TestDecoder decoder;
>> private byte[] buffer;
>> private class TestDecoder extends Decoder {
>> public TestDecoder(byte[] buffer) {
>> this._octetBufferOffset = 0;
>> this._octetBufferEnd = 15;
>> this._octetBuffer = buffer;
>> }
>> public int decodeIntegerIndexOnSecondBitTest() throws
>> FastInfosetException, IOException {
>> return decodeIntegerIndexOnSecondBit();
>> }
>> }
>>
>> protected void setUp() throws java.lang.Exception {
>> buffer = new byte[16];
>> decoder = new TestDecoder(buffer);
>> }
>> // integer in range [1, 64], ( [0, 63] ) 6 bits
>> public void testIntegerIndex0() throws IOException,
>> FastInfosetException
>> {
>> buffer[0] = 0x00 | FIRST_BIT;
>> final int result = decoder.decodeIntegerIndexOnSecondBitTest();
>>
>> assertEquals(0x00, result);
>> }
>> // integer in range [65, 8256], ( [64, 8255] ) 13 bits
>> public void testIntegerIndex321() throws IOException,
>> FastInfosetException {
>> buffer[0] = 0x41 | FIRST_BIT;//100 0001 - last five bits
>> buffer[1] = 0x01;// 0000 0001 - eight following bits
>> final int result = decoder.decodeIntegerIndexOnSecondBitTest();
>>
>> assertEquals(257 + 64, result);
>> }
>> // integer in range [8257, 1048576], ( [8256, 1048575] ) 20 bits
>> public void testIntegerIndex73793() throws IOException,
>> FastInfosetException {
>> buffer[0] = 0x61 | FIRST_BIT;//110 0001 - last four bits
>> buffer[1] = 0x00;// 0000 0000 - eight following bits
>> buffer[2] = 0x01;// 0000 0001 - eight following bits
>> final int result = decoder.decodeIntegerIndexOnSecondBitTest();
>>
>> assertEquals(65537 + 8256, result);
>> }
>> }
>> =========================================================================
>>
>> # Another small issue can be found in
>> Decoder.decodeTableItems(QualifiedNameArray array, boolean isAttribute).
>> # Wrong:
>>
>> String namespaceName = "";
>> int namespaceNameIndex = -1;
>> if ((b & EncodingConstants.NAME_SURROGATE_NAME_FLAG) > 0) {
>> namespaceNameIndex = decodeIntegerIndexOnSecondBit();
>> namespaceName = *_v.prefix.get(prefixIndex);*
>> }
>>
>> # Correct: _v.prefix.get(prefixIndex); changed to
>> _v.namespaceName.get(namespaceNameIndex);
>>
>> String namespaceName = "";
>> int namespaceNameIndex = -1;
>> if ((b & EncodingConstants.NAME_SURROGATE_NAME_FLAG) > 0) {
>> namespaceNameIndex = decodeIntegerIndexOnSecondBit();
>> namespaceName = *_v.namespaceName.get(namespaceNameIndex);*
>> }
>>
>>
>> Cheers,
>> ~Andrzej
>>
>>
>