Re: Surplus checking of terminator byte[] performed by StringDecoder.parseWithTerminatingSeq

From: Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM>
Date: Thu, 11 Mar 2010 10:24:53 +0100

Hi Ming Qin,

I've commited the change we discussed [1].

Thank you!

WBR,
Alexey.

[1]

Author: oleksiys
Date: 2010-03-11 09:17:29+0000
New Revision: 4297

Modified:
   branches/2dot0/code/modules/grizzly/src/main/java/com/sun/grizzly/
utils/StringDecoder.java

Log:
improve StringDecode to store the parsing state to avoid parsing byte
buffer from the beginning

thanks to Ming Qin for finding this out.

Modified: branches/2dot0/code/modules/grizzly/src/main/java/com/sun/
grizzly/utils/StringDecoder.java
Url: https://grizzly.dev.java.net/source/browse/grizzly/branches/2dot0/code/modules/grizzly/src/main/java/com/sun/grizzly/utils/StringDecoder.java?view=diff&rev=4297&p1=branches/2dot0/code/modules/grizzly/src/main/java/com/sun/grizzly/utils/StringDecoder.java&p2=branches/2dot0/code/modules/grizzly/src/main/java/com/sun/grizzly/utils/StringDecoder.java&r1=4296&r2=4297
=
=
=
=
=
=
========================================================================
--- branches/2dot0/code/modules/grizzly/src/main/java/com/sun/grizzly/
utils/StringDecoder.java (original)
+++ branches/2dot0/code/modules/grizzly/src/main/java/com/sun/grizzly/
utils/StringDecoder.java 2010-03-11 09:17:29+0000
@@ -125,7 +125,8 @@
         Integer stringSize = lengthAttribute.get(storage);

         if (logger.isLoggable(Level.FINE)) {
- logger.log(Level.FINE, "StringDecoder decode stringSize="
+ stringSize + " buffer=" + input + " content=" +
input.toStringContent());
+ logger.log(Level.FINE, "StringDecoder decode stringSize="
+ stringSize +
+ " buffer=" + input + " content=" +
input.toStringContent());
         }

         if (stringSize == null) {
@@ -190,6 +191,7 @@
                 offset = 0;
             }

+ lengthAttribute.set(storage, offset);
             return TransformationResult.<Buffer,
String>createIncompletedResult(
                     input);
         }

On Mar 11, 2010, at 1:12 , ming qin wrote:

> Hi Oleksiy Stashok:
> Setting up offset as offset = input.remaining() - checkIndex;
> will not work for terminator contains duplicated chars sitting
> close next to each other such as "zz\r\n". I withdraw my code
> submitting.
> offset =input.remaining()-terminationBytesLength is good one.
>
>
> Ming Qin
> Cell Phone 858-353-2839
>
> --- On Wed, 3/10/10, ming qin <mingqin1_at_yahoo.com> wrote:
>
> From: ming qin <mingqin1_at_yahoo.com>
> Subject: Re: Surplus checking of terminator byte[] performed by
> StringDecoder.parseWithTerminatingSeq
> To: dev_at_grizzly.dev.java.net
> Date: Wednesday, March 10, 2010, 3:58 PM
>
> Hi Oleksiy Stashok::
> Kudos for adding magic line - lengthAttribute.set(storage, offset);
>
> For making terminator validation to start as closer as char last
> validating failure occurred,
> I modified one line offset = input.remaining() - checkIndex;
>
> I tested with terminator as "z\r\n" with TCPNIOTransport's
> DEFAULT_READ_BUFFER_SIZE = 3 or 2 or
>
>
>
>
> Index: StringDecoder.java
> ===================================================================
> --- StringDecoder.java (revision 4286)
> +++ StringDecoder.java (working copy)
> @@ -185,11 +185,11 @@
> return TransformationResult.<Buffer,
> String>createCompletedResult(
> stringMessage, input, false);
> } else {
> - offset = input.remaining() - terminationBytesLength;
> + offset = input.remaining() - checkIndex;
> if (offset < 0) {
> offset = 0;
> }
> -
> + lengthAttribute.set(storage, offset);
> return TransformationResult.<Buffer,
> String>createIncompletedResult(
> input);
> }
> Ming Qin
> Cell Phone 858-353-2839
>
> --- On Wed, 3/10/10, Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM> wrote:
>
> From: Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM>
> Subject: Re: Surplus checking of terminator byte[] performed by
> StringDecoder.parseWithTerminatingSeq
> To: dev_at_grizzly.dev.java.net
> Date: Wednesday, March 10, 2010, 2:05 AM
>
> Hi Ming Qin,
>
> I guess the main issue you've found is that each time parsing
> process starts from index 0 and doesn't count with fact, that prev.
> time we might have parsed some chunk already.
> I'm proposing to add just this:
>
> Index: code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java
> ===================================================================
> --- code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java (revision 4286)
> +++ code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java (working copy)
> } else {
> offset = input.remaining() - terminationBytesLength;
> if (offset < 0) {
> offset = 0;
> }
>
> + lengthAttribute.set(storage, offset);
>
> return TransformationResult.<Buffer,
> String>createIncompletedResult(
> input);
> }
> This will let us store the current offset, where we stopped parsing
> and next time postpone at the same point.
>
> Parsing from the end to beginning also has its drawbacks, for
> example when I've read a big chunk of data (say 16K), but the string
> I need to parse is usually not longer than 16 symbols.
>
> What do you think?
>
> WBR,
> Alexey.
>
> On Mar 9, 2010, at 20:48 , ming qin wrote:
>
> > Hi Oleksiy Stashok:
> > Below is svn diff on StringDecoder.java version is Grizzly 2dot0
> >
> >
> >
> > Index: code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java
> > ===================================================================
> > --- code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java (revision 4286)
> > +++ code/modules/grizzly/src/main/java/com/sun/grizzly/utils/
> StringDecoder.java (working copy)
> > @@ -154,8 +154,9 @@
> > protected TransformationResult<Buffer, String>
> parseWithTerminatingSeq(
> > AttributeStorage storage, Buffer input) {
> > final int terminationBytesLength =
> stringTerminateBytes.length;
> > - int checkIndex = 0;
> > -
> > + int checkIndexBackWard = stringTerminateBytes.length-1;
> > + int comparisonMarker =0;
> > +
> > int termIndex = -1;
> >
> > Integer offsetInt = lengthAttribute.get(storage);
> > @@ -164,17 +165,22 @@
> > offset = offsetInt;
> > }
> >
> > - for(int i = input.position() + offset; i < input.limit();
> i++) {
> > - if (input.get(i) == stringTerminateBytes[checkIndex]) {
> > - checkIndex++;
> > - if (checkIndex >= terminationBytesLength) {
> > - termIndex = i - terminationBytesLength + 1;
> > +
> > + for( int i= input.limit()-1 ; i >=0; i--) {
> > +
> > +
> > + if (input.get(i) ==
> stringTerminateBytes[checkIndexBackWard]) {
> > + checkIndexBackWard--;
> > + comparisonMarker++;
> > + if (comparisonMarker == terminationBytesLength) {
> > + termIndex = input.limit() -
> terminationBytesLength ;
> > break;
> > }
> > - }
> > + }else { break;}
> > }
> >
> > TransformationResult<Buffer, String> result;
> > +
> > if (termIndex >= 0) {
> > // Terminating sequence was found
> > int tmpLimit = input.limit();
> >
> >
> >
> > Ming Qin
> > Cell Phone 858-353-2839
> >
> > --- On Tue, 3/9/10, Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM> wrote:
> >
> > From: Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM>
> > Subject: Re: Surplus checking of terminator byte[] performed by
> StringDecoder.parseWithTerminatingSeq
> > To: dev_at_grizzly.dev.java.net
> > Date: Tuesday, March 9, 2010, 9:04 AM
> >
> > Hi Ming Qin,
> >
> > can you pls. provide an SVN diff file? I will gladly apply your fix.
> >
> > Thank you very much!
> >
> > WBR,
> > Alexey.
> >
> > On Mar 7, 2010, at 6:20 , ming qin wrote:
> >
> >>
> >> Below is an implementation on method parseWithTerminatingSeq for
> class StringDecoder.java by taking terminator backward validation.
> >>
> >>
> >> Assumptions:
> >> string being read from channel contains one terminator.
> >>
> >> How to test:
> >> Setting up a small value for DEFAULT_READ_BUFFER_SIZE in
> TCPNIOTransport
> >> Here is my setting up:
> >> private static final int DEFAULT_READ_BUFFER_SIZE = 2;
> >>
> >>
> >>
> >> protected TransformationResult<Buffer, String>
> parseWithTerminatingSeq(
> >> AttributeStorage storage, Buffer input) {
> >> final int terminationBytesLength =
> stringTerminateBytes.length;
> >>
> >> int checkIndexBackWard = stringTerminateBytes.length-1;
> >>
> >> int termIndex = -1;
> >>
> >> Integer offsetInt = lengthAttribute.get(storage);
> >> int offset = 0;
> >> if (offsetInt != null) {
> >> offset = offsetInt;
> >> }
> >> int comparisonMarker =0;
> >>
> >>
> >>
> >> for( int i= input.limit()-1 ; i >=0; i--) {
> >>
> >>
> >> if (input.get(i) ==
> stringTerminateBytes[checkIndexBackWard]) {
> >> checkIndexBackWard--;
> >> comparisonMarker++;
> >> if (comparisonMarker == terminationBytesLength) {
> >> termIndex = input.limit() -
> terminationBytesLength ;
> >> break;
> >> }
> >> }else { break;}
> >> }
> >>
> >>
> >> //rest part of codes is remained as intact as original ones.
> >>
> >>
> >> }
> >>
> >>
> >>
> >>
> >>
> >>
> >> Ming Qin
> >> Cell Phone 858-353-2839
> >>
> >> --- On Fri, 3/5/10, ming qin <mingqin1_at_yahoo.com> wrote:
> >>
> >> From: ming qin <mingqin1_at_yahoo.com>
> >> Subject: Surplus checking of terminator byte[] performed by
> StringDecoder.parseWithTerminatingSeq
> >> To: dev_at_grizzly.dev.java.net
> >> Date: Friday, March 5, 2010, 10:45 PM
> >>
> >>
> >> Surplus checking of terminator byte[] performed by
> StringDecoder.parseWithTerminatingSeq method
> >>
> >>
> >> Symptom:
> >>
> >> if size of byte[] read from channel is bigger than value
> of DefaultMemoryManager’s DEFAULT_MAX_BUFFER_SIZE,
> parseWIthTerminatingSeq will be called more than 1 time to determine
> TransformationRequit’s status as COMPLETE. Unfortunately, <Buffer>
> input is not individual chunked byte[] being read from polling
> iteration based on current SelectionKey’s OP_READ, instead
> <Buffer>input is a composite buff[] which comprises cumulative
> chucks of byte[] occurred from the 1st , 2nd… to the current
> SelectionKey’s OP_READ.
> >>
> >> The consequence of this design leads low efficiency ( time
> consuming) to determine ending of intended read-in string.
> >>
> >> For example : DEFAULT_MAX_BUFFER_SIZE=2, StringFilter’s
> terminator is =’\r\n”, Inputting Sting is =”ABCD\r\n”
> >> In an idea testing setting up, parseWithTerminatingSeq
> will be called four times and each time consists of iterations of
> char comparison against fixed terminator chars
> >>
> >> The 1st time, <Buffer> input contains “AB” , checking
> terminator for the 1st time .
> >> The 2nd time, <Buffer> input contains “ABC” checking
> terminator for the 2nd time , but it run from char A to C , instead
> starting from char C.
> >> The 3rd time, <Buffer> input contains “ABCD\r” checking
> terminator for the 3nd time , but it run from char A to \r , instead
> starting from char D.
> >> The 4th time, <Buffer> input contains “ABCD\r\n” checking
> terminator for the 4th time , it run from char A to \n , Eventually,
> it works since if it starts from “\n” ( newly added char from
> SelectionKey OP_READ), it will go forever.
> >>
> >>
> >>
> >> Here is my proposed backward validating of terminator,
> >>
> >> Terminator sting =”\r\n”, <Buffer> input is a composite buffer[]
> >>
> >> The 1st time, <Buffer> input contains “AB”, checking
> terminator backward for the 1st time .
> >> Chr B is validated against “\n”, failure -> polling
> >>
> >> The 2nd time, <Buffer> input contains “ABC”, checking
> terminator backward for the 1st time .
> >> Chr C is validated against ‘\n”, failure->polling
> >>
> >>
> >> The 3rd time, <Buffer> input contains “ABCD\r”, checking
> terminator backward for the 1st time .
> >>
> >> Char \r is validated against ‘\n” failure-Polling
> >>
> >> The 4th time, <Buffer> input contains “ABCD\r\n”, checking
> terminator backward for the 1st time .-> Stopping polling -> next
> Filter.
> >>
> >>
> >> Chr \n is validate against “\n”, success
> >> Chr \r validated against “\r”, success.
> >>
> >>
> >>
> >> Ming Qin
> >> Cell Phone 858-353-2839
> >>
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_grizzly.dev.java.net
> For additional commands, e-mail: dev-help_at_grizzly.dev.java.net
>