Hi Robert,
Some responses embedded below.
charlie ...
Robert Greig wrote:
> On 24/05/07, Jeanfrancois Arcand <Jeanfrancois.Arcand_at_sun.com> wrote:
>
>> The slides were just posted from this Java One session claiming Grizzly
>> blows MINA away performance-wise, and I'm just curious as to people's
>> views
>> on it. They present some interesting ideas about optimizing selector
>> threading and ByteBuffer use.
>
> Hi,
>
> This topic is one I am very interested in. Apologies in advance for
> this rather long post.
>
> I am one of the original developers of what is now Apache Qpid, the
> message broker that implements the AMQP protocol.
>
> When I started building Qpid, which was around two years ago Grizzly
> was not available (at least I don't think it was) and I picked MINA
> really as a way of simplifying development in the early stages. Once
> it developed beyond a prototype, performance became more important
> (particularly since we originally targetted high performance transient
> pub/sub messaging) and we did quite a lot of work to analyse the
> performance and tune it. A lot of that tuning was not surprisingly in
> MINA.
It makes perfect sense to have chosen MINA when you started Qpid
development. As I have said many times (including in our Grizzly
presentations), MINA is a good framework. Its main emphasis is on
being a "general purpose framework". In contrast, Grizzly's main
emphasis is performance. However, we believe we can achieve "easy to
use APIs" without sacrificing performance. Note, too, that "easy to
use APIs" does not necessarily imply a "general purpose framework".
>
> When I heard about grizzly while on a visit to Burlington I intended
> to do some comparative tests between it and MINA but my involvement in
> Qpid has reduced significantly and I have no longer had much time to
> devote to that task for various reasons. Having said that, I am very
> interested in the Grizzly team's opinion on the MINA-based
> architecture we use in Qpid. The MINA architecture has always slightly
> intrigued me since it does not match anything I have read in books or
> papers (although that could simply be my ignorance).
>
> I should state that the benchmarking I did with Qpid showed that it
> had very good performance compared with other message brokers. We did
> have to tweak MINA because of several characteristics we found
> undesirable (which we did submit to the MINA team although I believe
> they chose not to implement them all so the current MINA release
> probably does not match what is used by Qpid).
It's unfortunate MINA did not choose to accept all of the performance
enhancements you provided for them.
>
> This page:
> http://cwiki.apache.org/confluence/display/qpid/Qpid+Design+-+Threading
> shows the Qpid threading model which also includes MINA. It is
> slightly out of date as I describe below but is accurate in the main
> aspects.
>
> The key bit that I haven't seen other I/O frameworks adopting is the
> SocketIOProcessor. In MINA, each connection is assigned a socket io
> processor (on a round robin basis, typically we have one socket io
> processor per CPU core), which is responsible for polling each socket
> and reading data up to the configured size of the read buffer, before
> handing this off to another thread pool. In Qpid, we changed this so
> that each socket io processor is split into a read thread and a write
> thread so that we can do concurrent reads and writes on a given socket
> (not shown in the diagram). This approach is to me very different from
> the leader-follower model described in so many texts.
The leader-follower model is a very common model. We have found that
leader-follower may not be the best-performing model.
There's nothing in Grizzly that necessarily prevents anyone from doing
concurrent reads / writes on a connection. But, since the framework
grew from an implementation of an HTTP connector, it initially dealt
with reading an incoming request and then responding to that request on
the same thread. I know that with Grizzly 1.5 there's nothing that would
prevent you from reading and writing concurrently on the same
connection. In fact, the GlassFish Corba team (which I work with very
closely too) has been wanting to use Grizzly for its transport layer
under IIOP. They will want to be able to read and write to the same
connection concurrently, since multiple ORB clients will be multiplexed
over the same connection. So, we know it is something we cannot prevent
anyone from doing with Grizzly.
>
> Does anyone have any insights into how this might compare with other
> approaches?
An approach that I find has been working very well on the reading side
is the following general approach: upon receiving a read event
notification, read as much data as can be read into a ByteBuffer. Then,
ask a message parser to parse the data just read into messages. As
messages are parsed, give those messages to a protocol processor. If you
are left with a partial message as the last message in your ByteBuffer,
you continue to try to read more data. This is a condition I call
"expecting more data". As long as you are "expecting more data", you
use a temporary Selector to wait for more data. When you are no longer
"expecting more data", you can consider the overall read event done.
There are some additional variations one can incorporate too. For
example, distributed network applications tend to be bursty, so you
might consider adding to the definition of "expecting more data" the
notion of waiting a few more seconds before considering the overall
read event done.
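
The read-side approach described above can be sketched roughly as
follows. This is only an illustration, not Grizzly's implementation:
the class and method names are made up, the framing (a 4-byte length
prefix followed by the payload) is an assumption, and a real framework
would likely pool temporary Selectors rather than open one per event.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.List;

// Hypothetical sketch; names and framing are assumptions for illustration.
final class ReadLoopSketch {

    /**
     * Moves every complete message from buf (left in write mode) into out.
     * Returns true if a partial message remains ("expecting more data").
     * Assumed framing: 4-byte big-endian length, then that many bytes.
     * Length validation is omitted for brevity.
     */
    public static boolean drainMessages(ByteBuffer buf, List<byte[]> out) {
        buf.flip();                      // read back what was received
        while (buf.remaining() >= 4) {
            buf.mark();
            int length = buf.getInt();
            if (buf.remaining() < length) {
                buf.reset();             // payload incomplete: rewind to prefix
                break;
            }
            byte[] payload = new byte[length];
            buf.get(payload);
            out.add(payload);            // stand-in for the protocol processor
        }
        boolean expectingMoreData = buf.hasRemaining();
        buf.compact();                   // keep the partial message, resume writing
        return expectingMoreData;
    }

    /**
     * Handles one read event on a non-blocking channel. While a partial
     * message is pending, waits on a temporary Selector for up to
     * timeoutMillis (must be > 0) before giving up on this event.
     */
    public static <C extends SelectableChannel & ReadableByteChannel> void handleReadEvent(
            C channel, ByteBuffer buf, List<byte[]> out, long timeoutMillis)
            throws IOException {
        Selector temp = null;
        try {
            while (true) {
                int n = channel.read(buf);
                boolean expectingMoreData = drainMessages(buf, out);
                if (n < 0) return;                 // peer closed the connection
                if (n > 0) continue;               // got data, try reading again
                if (!expectingMoreData) return;    // nothing pending: event done
                if (!buf.hasRemaining()) {
                    throw new IOException("message larger than read buffer");
                }
                if (temp == null) {
                    temp = Selector.open();
                    channel.register(temp, SelectionKey.OP_READ);
                }
                if (temp.select(timeoutMillis) == 0) return;  // waited long enough
                temp.selectedKeys().clear();
            }
        } finally {
            if (temp != null) temp.close();        // also cancels the temporary key
        }
    }
}
```

If the wait times out, the partial message simply stays in the buffer
and is completed on the next read event for that connection.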
The writing side is a little more interesting. One could consider
putting outbound messages into a queue structure and having a writing
thread wait for data on the queue, doing gathering writes when more
than one entry is on the queue at a given time. This approach has its
advantages and challenges, as you well know. There's also the approach
where a thread that has formulated a response or constructed an
outbound message simply invokes the connection write itself.
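
A minimal sketch of the queue-based variant, assuming a blocking
channel and a single writer thread per connection. The class and method
names are hypothetical. Writing several queued buffers in one call is
what java.nio calls a gathering write (GatheringByteChannel.write with
a ByteBuffer array).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; not taken from Grizzly, MINA or Qpid.
final class WriteQueueSketch {
    private final BlockingQueue<ByteBuffer> outbound = new LinkedBlockingQueue<>();

    /** Called by any thread that has built an outbound message. */
    public void enqueue(ByteBuffer message) {
        outbound.add(message);
    }

    /** Blocks for the first message, then takes whatever else is already queued. */
    public ByteBuffer[] nextBatch() throws InterruptedException {
        List<ByteBuffer> batch = new ArrayList<>();
        batch.add(outbound.take());
        outbound.drainTo(batch);
        return batch.toArray(new ByteBuffer[0]);
    }

    /** Total number of bytes still to be written in the batch. */
    public static long remaining(ByteBuffer[] batch) {
        long total = 0;
        for (ByteBuffer b : batch) {
            total += b.remaining();
        }
        return total;
    }

    /** Body of the dedicated writer thread; assumes a blocking channel. */
    public void writerLoop(GatheringByteChannel channel)
            throws IOException, InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            ByteBuffer[] batch = nextBatch();
            while (remaining(batch) > 0) {
                channel.write(batch);   // one call covers all queued buffers
            }
        }
    }
}
```

With a non-blocking channel the inner loop would spin when the socket
buffer is full, so that case would need write-interest registration
instead.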
>
> Are there any further details on the performance tests that were run?
I'm assuming you're asking about the MINA versus Grizzly performance
tests?
On slide 17 of the JavaOne presentation (which you can download at
http://developers.sun.com/learning/javaoneonline/j1sessn.jsp?sessn=TS-2992&yr=2007&track=5),
you'll see a description of what and how we tested. Those tests
were performed against AsyncWeb: one run with AsyncWeb built on MINA
and the other with AsyncWeb built on Grizzly. Faban is simply a load
generator. There's nothing magical about it. It doesn't have logic in
it that asks the server if it's running Grizzly and auto-magically
enables a "run really fast switch" :-D
Surprisingly, think time is key to measuring HTTP performance, and
getting it wrong can often lead observers to improper conclusions. For
example, to expect a web server to have thousands of connected users
with 0 think time is unrealistic. :-)
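
The effect of think time can be made concrete with the interactive
response time law from queueing theory, X = N / (R + Z), where N is the
number of users, R the average response time and Z the average think
time. The sketch below is only an illustration; the user count,
response time and think time are made-up numbers, not figures from our
tests.

```java
// Hypothetical illustration; all input values are assumptions.
final class ThinkTimeSketch {

    /**
     * Interactive response time law: X = N / (R + Z).
     * users = N, responseSeconds = R, thinkSeconds = Z.
     */
    public static double requestsPerSecond(int users, double responseSeconds,
                                           double thinkSeconds) {
        return users / (responseSeconds + thinkSeconds);
    }

    public static void main(String[] args) {
        int users = 1000;          // assumed number of connected users
        double response = 0.05;    // assumed average response time, seconds
        System.out.printf("think time 0 s: %.0f req/s%n",
                requestsPerSecond(users, response, 0.0));
        System.out.printf("think time 5 s: %.0f req/s%n",
                requestsPerSecond(users, response, 5.0));
    }
}
```

With these assumed numbers the same 1000 users generate roughly 20,000
requests per second at zero think time but only about 198 per second
with a 5 second think time, so the two configurations stress the
server in very different ways.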
Performance testing message brokers is a little different ;-)
Throughput and scalability are both equally crucial.
Jeanfrancois can give you more details about AsyncWeb.
> A message broker has certainly got very different characteristics to
> an HTTP server (e.g. far fewer but much longer connections) plus AMQP
> is designed so you always know exactly how many bytes you need to read
> off the wire to complete a "command".
>
> Are there any message brokers implementations on top of Grizzly?
At the moment I don't know of any message brokers built on top of
Grizzly. The Sun message broker product team was involved in some of
our Project Grizzly meetings before we open sourced Project Grizzly. I
know there is interest in using Grizzly. I also know they have had a
full plate with other commitments, so migrating to Grizzly has taken a
lower priority.
>
> Thanks for reading,
Thank you for asking!
hths,
charlie ....
>
> Robert
>
--
Charlie Hunt
Java Performance Engineer
630.285.7708 x47708 (Internal)
<http://java.sun.com/docs/performance/>