Re: Surplus checking of terminator byte[] performed by StringDecoder.parseWithTerminatingSeq

From: Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM>
Date: Wed, 10 Mar 2010 11:05:40 +0100

Hi Ming Qin,

I guess the main issue you've found is that each time parsing process
starts from index 0 and doesn't count with fact, that prev. time we
might have parsed some chunk already.
I'm proposing to add just this:

Index: code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
StringDecoder.java
===================================================================
--- code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
StringDecoder.java (revision 4286)
+++ code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
StringDecoder.java (working copy)
         } else {
             offset = input.remaining() - terminationBytesLength;
             if (offset < 0) {
                 offset = 0;
             }

+ lengthAttribute.set(storage, offset);

             return TransformationResult.<Buffer,
String>createIncompletedResult(
                     input);
         }
This will let us store the current offset, where we stopped parsing
and next time postpone at the same point.

Parsing from the end to beginning also has its drawbacks, for example
when I've read a big chunk of data (say 16K), but the string I need to
parse is usually not longer than 16 symbols.

What do you think?

WBR,
Alexey.

On Mar 9, 2010, at 20:48 , ming qin wrote:

> Hi Oleksiy Stashok:
> Below is svn diff on StringDecoder.java version is Grizzly 2dot0
>
>
>
> Index: code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java
> ===================================================================
> --- code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java (revision 4286)
> +++ code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java (working copy)
> @@ -154,8 +154,9 @@
> protected TransformationResult<Buffer, String>
> parseWithTerminatingSeq(
> AttributeStorage storage, Buffer input) {
> final int terminationBytesLength =
> stringTerminateBytes.length;
> - int checkIndex = 0;
> -
> + int checkIndexBackWard = stringTerminateBytes.length-1;
> + int comparisonMarker =0;
> +
> int termIndex = -1;
>
> Integer offsetInt = lengthAttribute.get(storage);
> @@ -164,17 +165,22 @@
> offset = offsetInt;
> }
>
> - for(int i = input.position() + offset; i < input.limit(); i+
> +) {
> - if (input.get(i) == stringTerminateBytes[checkIndex]) {
> - checkIndex++;
> - if (checkIndex >= terminationBytesLength) {
> - termIndex = i - terminationBytesLength + 1;
> +
> + for( int i= input.limit()-1 ; i >=0; i--) {
> +
> +
> + if (input.get(i) ==
> stringTerminateBytes[checkIndexBackWard]) {
> + checkIndexBackWard--;
> + comparisonMarker++;
> + if (comparisonMarker == terminationBytesLength) {
> + termIndex = input.limit() -
> terminationBytesLength ;
> break;
> }
> - }
> + }else { break;}
> }
>
> TransformationResult<Buffer, String> result;
> +
> if (termIndex >= 0) {
> // Terminating sequence was found
> int tmpLimit = input.limit();
>
>
>
> Ming Qin
> Cell Phone 858-353-2839
>
> --- On Tue, 3/9/10, Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM> wrote:
>
> From: Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM>
> Subject: Re: Surplus checking of terminator byte[] performed by
> StringDecoder.parseWithTerminatingSeq
> To: dev_at_grizzly.dev.java.net
> Date: Tuesday, March 9, 2010, 9:04 AM
>
> Hi Ming Qin,
>
> can you pls. provide an SVN diff file? I will gladly apply your fix.
>
> Thank you very much!
>
> WBR,
> Alexey.
>
> On Mar 7, 2010, at 6:20 , ming qin wrote:
>
>>
>> Below is an implementation on method parseWithTerminatingSeq for
>> class StringDecoder.java by taking terminator backward validation.
>>
>>
>> Assumptions:
>> string being read from channel contains one terminator.
>>
>> How to test:
>> Setting up a small value for DEFAULT_READ_BUFFER_SIZE in
>> TCPNIOTransport
>> Here is my setting up:
>> private static final int DEFAULT_READ_BUFFER_SIZE = 2;
>>
>>
>>
>> protected TransformationResult<Buffer, String>
>> parseWithTerminatingSeq(
>> AttributeStorage storage, Buffer input) {
>> final int terminationBytesLength =
>> stringTerminateBytes.length;
>>
>> int checkIndexBackWard = stringTerminateBytes.length-1;
>>
>> int termIndex = -1;
>>
>> Integer offsetInt = lengthAttribute.get(storage);
>> int offset = 0;
>> if (offsetInt != null) {
>> offset = offsetInt;
>> }
>> int comparisonMarker =0;
>>
>>
>>
>> for( int i= input.limit()-1 ; i >=0; i--) {
>>
>>
>> if (input.get(i) ==
>> stringTerminateBytes[checkIndexBackWard]) {
>> checkIndexBackWard--;
>> comparisonMarker++;
>> if (comparisonMarker == terminationBytesLength) {
>> termIndex = input.limit() -
>> terminationBytesLength ;
>> break;
>> }
>> }else { break;}
>> }
>>
>>
>> //rest part of codes is remained as intact as original ones.
>>
>>
>> }
>>
>>
>>
>>
>>
>>
>> Ming Qin
>> Cell Phone 858-353-2839
>>
>> --- On Fri, 3/5/10, ming qin <mingqin1_at_yahoo.com> wrote:
>>
>> From: ming qin <mingqin1_at_yahoo.com>
>> Subject: Surplus checking of terminator byte[] performed by
>> StringDecoder.parseWithTerminatingSeq
>> To: dev_at_grizzly.dev.java.net
>> Date: Friday, March 5, 2010, 10:45 PM
>>
>>
>> Surplus checking of terminator byte[] performed by
>> StringDecoder.parseWithTerminatingSeq method
>>
>>
>> Symptom:
>>
>> if size of byte[] read from channel is bigger than value of
>> DefaultMemoryManager’s DEFAULT_MAX_BUFFER_SIZE,
>> parseWIthTerminatingSeq will be called more than 1 time to
>> determine TransformationRequit’s status as COMPLETE.
>> Unfortunately, <Buffer> input is not individual chunked byte[]
>> being read from polling iteration based on current SelectionKey’s
>> OP_READ, instead <Buffer>input is a composite buff[] which
>> comprises cumulative chucks of byte[] occurred from the 1st , 2nd…
>> to the current SelectionKey’s OP_READ.
>>
>> The consequence of this design leads low efficiency ( time
>> consuming) to determine ending of intended read-in string.
>>
>> For example : DEFAULT_MAX_BUFFER_SIZE=2, StringFilter’s
>> terminator is =’\r\n”, Inputting Sting is =”ABCD\r\n”
>> In an idea testing setting up, parseWithTerminatingSeq will
>> be called four times and each time consists of iterations of char
>> comparison against fixed terminator chars
>>
>> The 1st time, <Buffer> input contains “AB” , checking
>> terminator for the 1st time .
>> The 2nd time, <Buffer> input contains “ABC” checking terminator
>> for the 2nd time , but it run from char A to C , instead starting
>> from char C.
>> The 3rd time, <Buffer> input contains “ABCD\r” checking
>> terminator for the 3nd time , but it run from char A to \r ,
>> instead starting from char D.
>> The 4th time, <Buffer> input contains “ABCD\r\n” checking
>> terminator for the 4th time , it run from char A to \n ,
>> Eventually, it works since if it starts from “\n” ( newly added
>> char from SelectionKey OP_READ), it will go forever.
>>
>>
>>
>> Here is my proposed backward validating of terminator,
>>
>> Terminator sting =”\r\n”, <Buffer> input is a composite buffer[]
>>
>> The 1st time, <Buffer> input contains “AB”, checking
>> terminator backward for the 1st time .
>> Chr B is validated against “\n”, failure -> polling
>>
>> The 2nd time, <Buffer> input contains “ABC”, checking
>> terminator backward for the 1st time .
>> Chr C is validated against ‘\n”, failure->polling
>>
>>
>> The 3rd time, <Buffer> input contains “ABCD\r”, checking
>> terminator backward for the 1st time .
>>
>> Char \r is validated against ‘\n” failure-Polling
>>
>> The 4th time, <Buffer> input contains “ABCD\r\n”, checking
>> terminator backward for the 1st time .-> Stopping polling -> next
>> Filter.
>>
>>
>> Chr \n is validate against “\n”, success
>> Chr \r validated against “\r”, success.
>>
>>
>>
>> Ming Qin
>> Cell Phone 858-353-2839
>>
>