dev@fi.java.net

Re: FI parser buffer

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Thu, 08 Sep 2005 10:47:17 +0200

Brian Pontarelli wrote:
>
>>
>> OK. That is interesting. Are you ensuring that the vocabularies of the
>> parsers and serializers between the peers are also shared? i.e. there
>> is no need to reset vocabularies per parse and serialization. The
>> parsers/serializers were designed to support such functionality, but
>> it is not so obvious how to enable it. I can send a further email
>> explaining this if you like.
>>
>> This has the advantage that message size is reduced and
>> parsing/serializing should be faster. However, I would not recommend
>> this for the case where the vocabulary can vary quite a lot. It is
>> good for small messages whose vocabulary is similar across multiple
>> requests/responses.
>
>
> I think you are asking whether I'm using the same serializer and parser
> for the entire communication. If that is the question, I'm not doing
> that. I have a thread pool from which I grab a thread to handle the
> request/response. This thread constructs a new parser/serializer.
>

But since you are using a stateful protocol you can gain a further
advantage from FI if you share the vocabulary across multiple messages,
reducing message size and improving serializing and parsing performance.

Note that creating a new parser/serializer per message is expensive; we
have found it is best to share one parser/serializer per thread.
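
For example, a minimal sketch of per-thread reuse (the stream and
handler are placeholders; this assumes the SAXDocumentParser class from
this project):

  import org.xml.sax.ContentHandler;
  import org.xml.sax.InputSource;
  import com.sun.xml.fastinfoset.sax.SAXDocumentParser;

  public class ParserPerThread {
      // One parser per worker thread, reused for every message that
      // thread handles, instead of a new parser per message.
      private static final ThreadLocal<SAXDocumentParser> PARSER =
              new ThreadLocal<SAXDocumentParser>() {
                  protected SAXDocumentParser initialValue() {
                      return new SAXDocumentParser();
                  }
              };

      public static void parse(java.io.InputStream in, ContentHandler handler)
              throws Exception {
          SAXDocumentParser parser = PARSER.get();
          parser.setContentHandler(handler);
          parser.parse(new InputSource(in));  // standard XMLReader parse
      }
  }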

I am making certain guesses about your system since I do not know what
requirements you have. If you could explain a bit more about your system
(sending a private email is OK if you do not want to discuss it publicly)
I may be able to help you get further optimization.


>> Yes, buffering: if the parser returns a string of X characters then
>> there needs to be some buffer holding the encoding of those X
>> characters. This is for performance reasons. With a buffered input
>> stream, method calls can be expensive when reading individual bytes,
>> and there are optimizations that can be achieved when reading
>> length-prefixed data by combining parsing and buffering. For example,
>> methods get compiled by HotSpot and inlined when using such techniques.
>
>
> Hmmmm... I'm interested in a few things about this statement, neither of
> which has to do with FI. How do you know that the methods are inlined
> by HotSpot, and which methods are you talking about: the
> BufferedInputStream methods or the FI parser methods?
>

Private methods will be inlined, if they get called enough, because they
are not virtual. Thus the Decoder.read method can be inlined. Note also
that many of the java.io.InputStream method implementations are
synchronized, adding a further cost for reading just one byte.
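
To illustrate with a hypothetical sketch (not the actual Decoder code):
a private read over an internal buffer is non-virtual and takes no lock,
so HotSpot can inline it at a hot call site, whereas, say,
BufferedInputStream.read is both virtual and synchronized:

  import java.io.IOException;
  import java.io.InputStream;

  class BufferedDecoder {
      private InputStream in;
      private byte[] buffer = new byte[1024];
      private int pos, limit;

      // Private, therefore non-virtual: HotSpot can inline this once it
      // is hot. No lock is taken; the stream is hit only on refills.
      private int read() throws IOException {
          if (pos == limit) {
              limit = in.read(buffer, 0, buffer.length);  // one bulk read
              pos = 0;
              if (limit < 0) throw new IOException("unexpected end of stream");
          }
          return buffer[pos++] & 0xFF;
      }
  }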


>> Using the system property in the following manner should work:
>>
>> -Dcom.sun.xml.fastinfoset.parser.buffer-size=16
>>
>> However, the buffer could be getting resized, as there might be encoded
>> strings greater than 16 bytes, e.g. namespace names.
>
>
> I think it is being resized for larger parts of the document like the
> XML document declaration, but I found a way to use the FI parser without
> changing this flag.
>

Great.


>> The implementation of the decoder is very much dependent on reuse of
>> the buffer for the decoding of strings. It relies on the property that
>> the input stream is self-contained and contains one or more fast
>> infoset documents.
>>
>> I wonder if other XML parsers do similar things. IIRC Sun's JAXB
>> depends on there being one XML document per input stream, at least
>> when using the SAX parser.
>
>
> Oh yeah. JAXB was quite a trick. We had a complete implementation for
> JAXB, and JAXB was quite picky, so I had to write a quick XML stream
> reader that knew when the end of the XML doc was hit and could handle
> the end of the stream and all that magic properly. I think most parsers
> and tools not only rely on there being a single document terminated by
> the end of the stream,

Exactly.


> but they also do bad things like close streams,
> especially during error cases.

Yes, that is bad.


> One of the HUGE reasons I picked this FI
> implementation rather than writing my own was that it doesn't manhandle
> the InputStream but can reasonably parse multiple FI documents on the
> same stream. Great work in that area by the way.
>

Thanks :-)


>> The tricky thing is how to implement this without losing the current
>> efficiencies. I am not sure it can be done. Avoiding as many reads of
>> the underlying stream as possible is important for performance.
>
>
> Really? I'm not totally convinced that this is true for all
> implementations.

Profiling showed that depending on InputStream.read was an issue. I got
quite a performance boost when I merged buffering and parsing into the
Decoder. It is quite a common technique for improving performance.

The FI parser will tend to read a couple of individual bytes for
structure and then read a sequence of bytes for content (tags or text
content or attribute values). There is at least one read call per
element, attribute, tag, text content, and attribute value. That can add
up to a lot.
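
As a rough sketch of what that merging buys (hypothetical code, not the
Decoder's actual layout): structure octets come out of the shared buffer
one at a time, and length-prefixed content comes out as a single bulk
copy, with no per-byte call to the underlying stream:

  import java.io.IOException;
  import java.io.InputStream;

  class MergedDecoder {
      private InputStream in;
      private byte[] buffer = new byte[1024];
      private int pos, limit;

      // One octet for the length prefix, then one bulk copy out of the
      // shared buffer for the content.
      String readPrefixedString() throws IOException {
          int length = readOctet();
          ensureBytes(length);                 // (buffer resizing omitted)
          String s = new String(buffer, pos, length, "UTF-8");
          pos += length;
          return s;
      }

      private int readOctet() throws IOException {
          if (pos == limit) ensureBytes(1);
          return buffer[pos++] & 0xFF;
      }

      // Shift unread bytes to the front, then fill until n are buffered.
      private void ensureBytes(int n) throws IOException {
          System.arraycopy(buffer, pos, buffer, 0, limit - pos);
          limit -= pos;
          pos = 0;
          while (limit < n) {
              int c = in.read(buffer, limit, buffer.length - limit);
              if (c < 0) throw new IOException("unexpected end of stream");
              limit += c;
          }
      }
  }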


> I read in using NIO and then buffer that data into my
> own InputStream and allow reads from that in blocks or byte by byte. I'm
> not convinced that lots of method invocations to read byte by byte
> really slow it down that much.

Try measuring the performance :-) comparing ByteArrayInputStream.read
with a private method that reads from an internal buffer.
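
For example, a crude timing loop along these lines (hypothetical names,
not a rigorous benchmark):

  import java.io.ByteArrayInputStream;

  public class ReadBench {
      private static final byte[] DATA = new byte[8 * 1024 * 1024];
      private static int pos;

      // The private, non-virtual alternative to InputStream.read().
      private static int read() {
          return (pos < DATA.length) ? DATA[pos++] & 0xFF : -1;
      }

      public static void main(String[] args) throws Exception {
          // ByteArrayInputStream.read() is virtual and synchronized.
          ByteArrayInputStream in = new ByteArrayInputStream(DATA);
          long t0 = System.currentTimeMillis();
          while (in.read() != -1) { /* consume */ }
          long t1 = System.currentTimeMillis();

          // Same bytes through the private method.
          while (read() != -1) { /* consume */ }
          long t2 = System.currentTimeMillis();

          System.out.println("stream:  " + (t1 - t0) + " ms");
          System.out.println("private: " + (t2 - t1) + " ms");
      }
  }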


> My only performance concern is my use of
> a volatile variable for the current buffer I read from NIO.

But I presume you are also using FI for performance reasons?


> This access
> might be slowed down, but once the NIO thread is finished and the parser
> is working, there is no contention. If the NIO thread and the parsing
> thread are working concurrently, then it should perform better under
> heavier load. This will even out the access between the thread
> performing the NIO and each of my "execute threads" parsing the FI
> documents via my InputStream implementation, since at most there are
> only two threads contending for the volatile variable of each
> InputStream and most likely the NIO thread will be working on an
> InputStream for a thread not being put on the CPU next.
>

OK.
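
For what it is worth, here is how I picture the hand-off you describe,
as a minimal sketch with hypothetical names (wrap-around, buffer growth
and proper wake-up signalling are all omitted):

  import java.io.IOException;
  import java.io.InputStream;

  // Single producer (the NIO thread) and single consumer (the parsing
  // thread). The volatile write to 'limit' publishes the bytes copied
  // into 'buffer'; 'pos' is touched only by the consumer.
  class HandOffInputStream extends InputStream {
      private final byte[] buffer = new byte[64 * 1024];
      private volatile int limit;    // bytes made available by the NIO thread
      private volatile boolean eof;
      private int pos;               // consumer-only state

      // Called by the NIO thread after draining its ByteBuffer.
      void append(byte[] src, int off, int len) {
          System.arraycopy(src, off, buffer, limit, len);
          limit += len;              // volatile write publishes the bytes
      }

      void endOfInput() { eof = true; }

      public int read() throws IOException {
          while (pos == limit) {
              // re-check limit after seeing eof: the volatile read of
              // 'eof' makes any final append visible too
              if (eof && pos == limit) return -1;
              Thread.yield();        // real code would park and be woken
          }
          return buffer[pos++] & 0xFF;
      }
  }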


>>
>>> One possible solution I've thought of is to make my own InputStream
>>> that can determine when the end of the FI document is encountered and
>>> can then pretend as though it has reached the end of stream and
>>> return -1 from the read operations. This seems slightly orthogonal to
>>> what the FI parser is doing for the most part (besides obviously
>>> calling the ContentHandler). It really seems as though the FI
>>> implementation should either provide this type of InputStream, or be
>>> able to provide a good non-blocking solution to this type of scenario.
>>>
>>
>> I would prefer the latter if possible. I think it will be more
>> efficient and provide a clean layering.
>>
>> This is very much related to what type of transport is used to
>> communicate FI messages. Using an open TCP/IP connection directly for
>> communicating multiple messages is generally not recommended unless a
>> transport is layered on top that provides a framing mechanism (i.e. an
>> InputStream of the content). HTTP and BEEP [1] transports provide such
>> a mechanism.
>>
>> Microsoft proposed using DIME [2] as a lightweight framing mechanism
>> alternative to HTTP for the transportation of SOAP messages using
>> TCP/IP or UDP. You may want to copy this approach if the HTTP protocol
>> or BEEP is too heavy for you. Note that DIME is not a standard.
>>
>> It says in [1] that BEEP adds "about 60 octets per exchange, and is
>> designed to be simple to parse". I am not sure what the average for
>> HTTP would be, but the basic header fields alone could add up to
>> about 45 bytes, so BEEP is probably more efficient in this respect.
>>
>
> I've actually gotten it working since the message I sent, but it
> randomly fails because of the "TODO" at line 1245 of the Decoder:
>
> // TODO keep reading until require bytes have been obtained
>
> The way I managed to get this to work was using the NIO and custom
> InputStream I mentioned above and allowing the FI parser to read in
> blocks as it wants. I don't return all the bytes that the parser
> requests unless I know for certain I have that many bytes. This works
> well since the FI parser handles the majority of these cases well (sans
> the TODO I mentioned).
>
> Therefore, most of what you've mentioned is unnecessary, considering
> that the FI parser handles most cases of block reads that return fewer
> bytes than the buffer length and the fact that the parser does a good job of
> determining the end of the document and stopping parsing and not closing
> the stream or doing anything else to the stream. Once the TODO is
> finished, I should be able to stream documents in both directions down a
> TCP/IP connection without any special protocol around the FI documents.
>

Ah, I understand! I originally thought it would be more complex, but
then I recalled that it is only Decoder.setInputStream that resets the
buffer, so it is possible to parse multiple times from the same input
stream. Doh! Call it back-from-vacation slowness: I now remember making
sure such streaming functionality was possible when implementing.

I will implement the TODO. Could you log an issue in the issue tracker:

http://fi.dev.java.net/servlets/ProjectIssues
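
For reference, the fix for that TODO is essentially a loop of the
following shape (a sketch of the intent, not the code that will be
committed):

  // Keep reading until 'length' bytes are available in 'buffer' at
  // 'offset', instead of assuming a single read returns everything.
  private static void readFully(java.io.InputStream in, byte[] buffer,
                                int offset, int length)
          throws java.io.IOException {
      int total = 0;
      while (total < length) {
          int c = in.read(buffer, offset + total, length - total);
          if (c < 0) {
              throw new java.io.EOFException("unexpected end of stream");
          }
          total += c;
      }
  }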


Are you using the non-blocking features of NIO? When using such
streaming functionality, one thing we need to watch out for is a read
blocking while waiting for data that is not part of the current
document.

For example, there could be an edge case where Decoder.read is called
for the termination of the document and this results in a complete read
of the buffer, which could block. Note that Decoder.read does not check
for the case where zero bytes are read; the semantics of
InputStream.read say that at least one byte must be returned (or -1 at
the end of the stream).

In general it seems that it should be possible to retain a reasonable
buffer size for efficient parsing, assuming that InputStream.read may
return partial data; i.e. it should not be necessary to modify the
buffer size.


> One question: why isn't it recommended to have an open connection that
> streams multiple FI documents in both directions without additional
> protocol semantics such as HTTP? I honestly can't see a reason unless
> the end of the FI document was non-deterministic, which it isn't (to my
> knowledge). The FI document actually forms the protocol itself in that
> it defines a fully encapsulated array of bytes that define a single
> message. The streams look like this in my case:
>
> [request4][request3][request2][request1]
> --------------------------------------->
> client                            server
> <---------------------------------------
> [response4][response3][response2][response1]
>
> Each message is stacked directly after the last byte of the previous
> message regardless of direction.
>

Errors in the encoding: how do you recover from an error in the stream?
It is tricky, perhaps impossible, so further requests or responses that
have already been written may be lost. This is especially important when
proxies are involved.

For a private protocol I reckon it is OK, although I would tend to avoid
that approach myself because a good transport protocol offers other
advantages as well.

For a public protocol it is not the common practice, although an
exception to this is Jabber, which AFAIK creates an open stream for a
continuous XML document. I do not know if the design decision behind
Jabber was in response to a lack of keep-alive support in HTTP servers
or for other reasons.


> Anyway, your thoughts and experience on this would be really helpful,
> because we are putting a large investment into the FI protocol I've
> described and I want to ensure that it will work well and makes good
> sense.
>

I really appreciate you putting time into using Fast Infoset and
reporting the problems you find.

Paul.

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109