dev@fi.java.net

Re: FI parser buffer

From: Brian Pontarelli <brian_at_pontarelli.com>
Date: Sat, 10 Sep 2005 10:16:58 -0500

>
> But since you are using a stateful protocol you can get further
> advantage from FI if you share the vocabulary over multiple messages to
> reduce message size and increase serializing and parsing performance.
>
> Note that creating a new parser/serializer per message is expensive;
> we have found it is best to share a parser/serializer per thread.
>
> I am making certain guesses as to your system since I do not know what
> requirements you have. If you could explain a bit more about your
> system (sending a private email is OK if you do not want to discuss
> publicly) I may be able to help you get further optimization.

This will all be open and available shortly, so I don't mind discussing
details here. Re-using the serializers and handlers seems like a great
idea, and since I have transactional context during a session, I could
attach those objects to that class. I imagine it should be really
straightforward, since my protocol uses a request/response paradigm: the
client won't write a second request to the server until the server has
completely returned the response to the first request.

Looks like this:

client               server
   | ----------------->|
   | <-----------------|
   | ----------------->|
   | <-----------------|
   | ----------------->|
   | <-----------------|

Eventually I'd like to add in talk back from the server, but I'm
actually thinking of using an additional connection for that, so this
paradigm would still be used, just with the request originating from the
server.
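The per-thread reuse you suggest could look roughly like this. This is
only a sketch: MessageParser is a hypothetical stand-in for the real FI
parser class, and reset() is an assumed hook for clearing per-message
state.

```java
// Sketch of per-thread parser reuse, assuming the parser is expensive to
// construct and offers some way to reset state between messages.
// "MessageParser" is a hypothetical stand-in, not the real FI class.
public class ParserPool {
    static class MessageParser {
        // hypothetical stand-in for an expensive-to-create FI parser
        void reset() { /* clear per-message state */ }
    }

    // One parser per thread: created lazily, then reused for every
    // message that thread handles.
    private static final ThreadLocal<MessageParser> PARSER =
        new ThreadLocal<MessageParser>() {
            @Override protected MessageParser initialValue() {
                return new MessageParser();
            }
        };

    public static MessageParser acquire() {
        MessageParser p = PARSER.get();
        p.reset(); // make sure no state leaks between messages
        return p;
    }
}
```

Since my protocol is strict request/response, one thread handles one
conversation at a time, so a per-thread parser is never shared between
in-flight messages.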

> Private methods will be inlined, if they get called enough, because
> they are not virtual. Thus the Decoder.read method can be inlined.
> Note that a lot of the java.io.InputStream methods are implemented as
> synchronized, adding a further cost for just reading one byte.

That's good to know. Do you know of any resources with more information
about the HotSpot compilation process? As for InputStream, I've
overridden nearly everything in there for that reason. I can get away
with a single volatile variable and a blocking queue instead: only when
the volatile ByteBuffer is null or empty do I access the blocking queue.
That reduces the overhead quite a bit. Bulk reads are actually the best
(as you've mentioned) since they only access the volatile ByteBuffer once.
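A minimal sketch of that scheme, with illustrative names rather than my
real code: a volatile current buffer on the fast path, falling back to a
blocking queue only when it runs dry, with an empty sentinel buffer
signalling end of stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a volatile "current" buffer that fast-path reads hit directly,
// falling back to a blocking queue only when the buffer is exhausted.
public class BufferedNioInputStream extends InputStream {
    // Sentinel the selector thread enqueues to signal end of stream.
    private static final ByteBuffer EOS = ByteBuffer.allocate(0);

    private final BlockingQueue<ByteBuffer> queue =
        new LinkedBlockingQueue<ByteBuffer>();
    private volatile ByteBuffer current;
    private boolean ended; // touched only by the reader thread

    // Called by the NIO selector thread after reading from the Channel.
    public void push(ByteBuffer buf) {
        queue.add(buf);
    }

    public void pushEndOfStream() {
        queue.add(EOS);
    }

    @Override
    public int read() throws IOException {
        if (!ensureCurrent()) {
            return -1; // end of stream
        }
        return current.get() & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (!ensureCurrent()) return -1;
        // Bulk read: touch the volatile field once, copy what's available.
        ByteBuffer buf = current;
        int n = Math.min(len, buf.remaining());
        buf.get(b, off, n);
        return n;
    }

    // Refill 'current' if null or empty; returns false at end of stream.
    private boolean ensureCurrent() throws IOException {
        if (ended) return false;
        ByteBuffer buf = current;
        while (buf == null || !buf.hasRemaining()) {
            try {
                // Blocks the reader thread, never the selector thread.
                buf = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for data");
            }
            if (buf == EOS) {
                ended = true;
                return false;
            }
        }
        current = buf;
        return true;
    }
}
```

Note that read(byte[], int, int) returns partial data rather than looping
to fill the array, which matches the InputStream contract of returning
at least one byte (or -1) rather than blocking for the full length.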

> Profiling showed that depending on InputStream.read was an issue. I
> got quite a performance boost when I merged buffering and parsing into
> the Decoder. It is quite a common technique for improving performance.
>
> The FI parser will tend to read a couple of individual bytes for
> structure and then read a sequence of bytes for content (tags or text
> content or attribute values). There is at least one read call per
> element, attribute, tag, text content, and attribute value. That can
> add up to a lot.

Good to know. I'll make sure I account for that in the future whenever
I'm handling I/O that can block or synchronize. Do you know if the same
holds true for other non-synchronized, non-blocking methods? I would
imagine it's not as much of a performance drain, aside from the obvious
need to construct and tear down a stack frame for each method invocation.

>> I read in using NIO and then buffer that data into my own InputStream
>> and allow reads from that in blocks or byte by byte. I'm not
>> convinced that lots of method invocations to read byte by byte really
>> slow it down that much.
>
>
> Try measuring the performance :-) using ByteArrayInputStream.read and
> a private method to read.

I guess my thought was: if all a method does is read from a volatile
variable, how much worse is calling that method lots of times than
calling a single method that reads a chunk from the volatile variable?
I've gotten away from most of the InputStream semantics for
synchronization and such, so I guess I'll have to profile this and see
what the volatile-variable case looks like.
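A rough way to measure it, along the lines you suggest: sum a buffer via
ByteArrayInputStream.read() versus a trivial private per-byte read. The
numbers are indicative only; warm up HotSpot and take the best of
several runs for anything serious.

```java
import java.io.ByteArrayInputStream;

// Rough micro-measurement of per-byte read cost: a virtual
// ByteArrayInputStream.read() call vs. a private method HotSpot can inline.
public class ReadCostSketch {
    private final byte[] data;
    private int pos;

    ReadCostSketch(byte[] data) { this.data = data; }

    // The "private method" variant that HotSpot can inline.
    private int readByte() {
        return pos < data.length ? data[pos++] & 0xFF : -1;
    }

    static long sumViaStream(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        long sum = 0;
        int b;
        while ((b = in.read()) != -1) sum += b;
        return sum;
    }

    static long sumViaPrivate(byte[] data) {
        ReadCostSketch r = new ReadCostSketch(data);
        long sum = 0;
        int b;
        while ((b = r.readByte()) != -1) sum += b;
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        // Warm up so HotSpot compiles both paths before timing.
        for (int i = 0; i < 10; i++) { sumViaStream(data); sumViaPrivate(data); }

        long t0 = System.nanoTime();
        long s1 = sumViaStream(data);
        long t1 = System.nanoTime();
        long s2 = sumViaPrivate(data);
        long t2 = System.nanoTime();

        System.out.println("stream:  " + (t1 - t0) / 1000 + " us, sum=" + s1);
        System.out.println("private: " + (t2 - t1) / 1000 + " us, sum=" + s2);
    }
}
```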

>> My only performance concern is my use of volatile variable for the
>> current buffer I read from NIO.
>
>
> But i presume you are also using FI for performance reasons?

Oh yeah! Base64 encoding hurts, badly: I saw roughly a 500x performance
drop for encoding/decoding a 14k image. The protocol originally used
JAXB and XML, but I ran some tests, and since the server had to encode
the image and the client had to decode it, the performance was awful.
So I switched to FI and I never have to encode anything! I love it!
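For anyone curious about the size side of it as well (using the JDK's
java.util.Base64 purely for illustration): every 3 raw bytes become 4
text characters, a 33% inflation, before you even pay the CPU cost of
encoding and decoding.

```java
import java.util.Base64;

// Illustrates the size overhead of Base64: 3 raw bytes -> 4 text bytes.
public class Base64Overhead {
    public static void main(String[] args) {
        byte[] image = new byte[14 * 1024]; // a 14k "image"
        String encoded = Base64.getEncoder().encodeToString(image);
        System.out.println("raw bytes:     " + image.length);
        System.out.println("encoded chars: " + encoded.length());
        // 14336 raw bytes -> ceil(14336 / 3) * 4 = 19116 encoded characters
    }
}
```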

> Are you using the non-blocking features of NIO? When using such
> streaming functionality one thing we need to watch out for is the read
> being blocked while waiting for data that is not part of the current
> document.
>
> For example there could be the edge case where the Decoder.read will
> be called for the termination of the document and this results in a
> complete read of the buffer, which could block. Note that
> Decoder.read does not check for the case where zero bytes are read.
> The semantics of InputStream.read say that at least one byte must be
> returned.
>
> In general it seems that it should be possible to retain a reasonable
> buffer size for efficient parsing assuming that the InputStream.read
> returns partial data. i.e. it should not be necessary to modify the
> buffer size.

Yeah, I'm a HUGE advocate of non-blocking I/O. I don't mind other APIs
that block, since the NIO selector I've implemented won't ever block. I
push the bytes read from the Channel into my custom NIO InputStream.
Other threads working on that stream can be blocked waiting for more
data, but as long as they can handle reads as small as 1 byte and also
handle the end-of-stream read (i.e. -1), they should be fine. I can
send you snippets of code to look over to see if you can find any
performance sinks. Let me know.


> Errors in the encoding. How do you recover from an error in the
> stream? It is tricky or impossible, thus further requests or responses
> that have already been written may be lost. This is especially
> important when proxies are involved.
>
> For a private protocol I reckon it is OK, although I would tend to
> avoid that approach myself because a good transport protocol offers
> other advantages as well.
>
> For a public protocol it is not the common practice, although an
> exception to this is Jabber which AFAIK creates an open stream for a
> continuous XML document. I do not know if the design decision behind
> Jabber was in response to lack of keep alive support in HTTP servers
> or for other reasons.

Great question, because this was a huge problem for me. I assume that if
FI encounters a bad encoding, it will throw an exception, and I think in
most cases this is true. In that case, I catch the exception and tear
down the entire conversation with the client. I assume that the client
will understand it when the server says, "dude, you just sent some bad
stuff my way and all I can do is cut you off without any response."

However, even if that is not the case and FI for whatever reason can't
determine that what it has received is bad, there are a few possibilities:

1. The client sends more bytes across. In this case I keep feeding FI
until it either throws an exception or blocks again waiting for more bytes.
2. The client doesn't send anything more. Since my NIO selector is truly
non-blocking, in this case I set a timeout: I've read in some data, the
client hasn't sent any more in X seconds, and the parser couldn't figure
out how to handle what the client sent, so I conclude that the message
is corrupt and tear down the conversation with the client.

So, no matter the case, I eventually stop talking to the client, and the
client will see this as an end of stream on its next read or its next
write. Either way, the client can handle it however it wants.
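The timeout in case 2 above could be sketched like this (illustrative
names only, not my actual code): a per-client idle timer that is
re-armed every time bytes arrive and, if it ever fires, tears the
conversation down, e.g. by closing the Channel.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch of an idle watchdog: re-armed on every read; if it fires, the
// teardown action runs (e.g. closes the client's SocketChannel).
public class IdleWatchdog {
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();
    private final long timeoutMillis;
    private final Runnable teardown; // e.g. close the channel
    private ScheduledFuture<?> pending;

    public IdleWatchdog(long timeoutMillis, Runnable teardown) {
        this.timeoutMillis = timeoutMillis;
        this.teardown = teardown;
    }

    // Call from the selector thread each time data arrives for this client.
    public synchronized void touch() {
        if (pending != null) {
            pending.cancel(false);
        }
        pending = timer.schedule(teardown, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Call when the conversation ends normally.
    public synchronized void shutdown() {
        if (pending != null) {
            pending.cancel(false);
        }
        timer.shutdown();
    }
}
```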

The only case I'm not certain this covers is proxies. I think that the
proxy will still see the end of stream and handle it fine. This all
really comes down to the request/response paradigm I'm using. If the
protocol were truly bi-directional, then yeah, all the other in-flight
messages would be lost. Luckily, I can guarantee that no more messages
have been sent yet, since the server hasn't finished processing the
current one and hasn't yet sent a response.

Again, let me know if you see holes in this. Also, if you would like to
participate in the RFC for the protocol specification (nothing through a
standards body; we are just going to publish it to our website and open
up an email address or mailing list for comments), let me know and I'll
email you when it comes out (hopefully in the next week or two).

-bp