Feel free to jump in :-)
FYI, scatter writes are something I want to come back to someday :-)
I've talked with Scott Oaks about it too and he said the same thing as
you said. I think Harsha did some micro-benchmarking with it too and
observed the same thing.
But when I look at the source code for scatter writes on Solaris or
Linux, it should perform better. The key, though, is to use direct
ByteBuffers. Heap ByteBuffer scatter write performance will be
terrible; I can tell that just by looking at the implementation of
scatter writes.
If indeed scatter writes with direct ByteBuffers are worse than doing
multiple writes with the same number of direct ByteBuffers, then it's a
bug and I'll make sure we get it addressed in Java SE. :-)
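
To make the comparison concrete, here's a minimal sketch of such a
write, gathering from several direct ByteBuffers in one call (the
socketChannel and the buffer sizes below are just placeholders, not
code from any of the projects discussed):

    ByteBuffer header = ByteBuffer.allocateDirect(64);
    ByteBuffer body   = ByteBuffer.allocateDirect(8 * 1024);
    // ... fill header and body, then flip both for writing ...
    header.flip();
    body.flip();

    ByteBuffer[] srcs = { header, body };
    while (header.hasRemaining() || body.hasRemaining()) {
        long written = socketChannel.write(srcs);  // GatheringByteChannel.write(ByteBuffer[])
        if (written == 0) {
            // kernel send buffer is full: wait for OP_WRITE (or a temporary Selector) and retry
            break;
        }
    }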
charlie ...
Jeanfrancois Arcand wrote:
> Jumping into the discussion :-)
>
> Robert Greig wrote:
>> On 25/05/07, charlie hunt <charlie.hunt_at_sun.com> wrote:
>>
>>> An approach that I find has been working very well on the reading side
>>> is a general approach where, upon receiving a read event notification,
>>> you read as much data as can be read into a ByteBuffer. Then, ask a
>>> message parser that knows how to parse the data just read into messages.
>>> As messages are parsed, give those messages to a protocol processor. If
>>> you are left with a partial message as the last message in your
>>> ByteBuffer, you continue to try to read more data. This is a condition I
>>> call "expecting more data".
>>
>> This is essentially what we do in Qpid; in MINA terminology this is
>> the "CumulativeProtocolDecoder".
>>
>> In our broker we have certain cases where messages coming in can go
>> straight out to one or more consumers, so we take care in that decoder
>> not to compact() the ByteBuffer in the event you get a number of
>> unprocessed bytes at the end. Avoiding the compact means we can just
>> take a slice() of the data when handing it off to the next stage.
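>>
>> Roughly (a sketch, not the actual Qpid decoder; frameLength and
>> nextStage are placeholders):
>>
>>     buffer.flip();
>>     ByteBuffer frame = buffer.slice();      // zero-copy view over the readable bytes
>>     frame.limit(frameLength);               // frameLength determined by the decoder
>>     nextStage.handle(frame);                // downstream consumers share the backing array
>>     buffer.position(buffer.position() + frameLength);   // step past the handed-off frame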
>>
>>> As long as you are "expecting more data", you
>>> use a temporary Selector to wait for more data. When you are no longer
>>> "expecting more data", you can then consider being done with the
>>> overall
>>> read event.
>>
>> What is the motivation for that? In Qpid, the SocketIOProcessor is
>> constantly reading off the socket(s) it is responsible for.
>> What is the cost of creating a selector and tearing it down?
>
> In Grizzly, we initialize at startup a pool of temporary Selectors.
> Those Selectors are available for read/write. As an example, I
> frequently use temporary Selectors when handling the HTTP protocol.
> With HTTP, it is difficult to detect the end of the stream. One way
> to handle the problem is to parse the message, get the
> content-length, and try to read bytes until you reach the
> content-length value. The usual approach (which is what MINA does... at
> least the last time I looked) is to register the SelectionKey back with
> the main Selector and cache/persist the ByteBuffer until all the bytes
> are read. Since most of the time the Selector is not running on the
> same thread (WorkerThread) as the class that parses the bytes, the cost
> of switching threads can be expensive. Hence, instead of this approach,
> you stay on the WorkerThread, and when no more bytes are available
> (socketChannel.read returns 0), you register the channel with a
> temporary Selector and try to read from it. Most of the time you will
> be able to read the extra bytes. See [1] for more info.
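>
> Roughly, the pattern looks like this (a simplified sketch, not the
> actual Grizzly code; the Selector pooling, error handling, and the
> readTimeout value are assumed/elided):
>
>     int count = socketChannel.read(byteBuffer);
>     if (count == 0) {                                    // no bytes yet, stay on the WorkerThread
>         Selector tmpSelector = Selector.open();          // in Grizzly this comes from the pool
>         SelectionKey tmpKey = socketChannel.register(tmpSelector, SelectionKey.OP_READ);
>         try {
>             if (tmpSelector.select(readTimeout) > 0) {   // short timeout (see below)
>                 count = socketChannel.read(byteBuffer);  // the missing bytes are usually here
>             }
>         } finally {
>             tmpKey.cancel();
>             tmpSelector.selectNow();                     // flush the cancelled key
>             tmpSelector.close();                         // or hand the Selector back to the pool
>         }
>     }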
>
>
>>
>>> There are some additional variations one can incorporate too, such
>>> as: distributed network applications tend to be bursty, so for that
>>> reason you might consider adding to the definition of "expecting more
>>> data" the notion of waiting a few more seconds before considering the
>>> overall read event done.
>>
>> Do you build anything in to ensure fairness/avoid starvation?
>
> The timeout you set on the temporary Selector.select() is quite
> important. Why? Because if all WorkerThreads are blocked because no
> bytes are available, then you are open to a DOS-style attack. In
> Grizzly HTTP, the timeout is low (you can always fall back to the main
> Selector in case the temporary Selector fails to read bytes). If no
> bytes are read from a temporary Selector, then we just close the
> connection (for HTTP, as this is a denial-of-service attack).
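>
> In other words, something along these lines (the timeout value is just
> a placeholder):
>
>     if (tmpSelector.select(readTimeout) == 0) {   // low timeout, e.g. tens of milliseconds
>         socketChannel.close();                    // for HTTP, treat it as a denial-of-service attempt
>     }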
>
>
>>
>>> The writing side is a little more interesting. One could consider
>>> putting outbound messages into a queue structure and having a writing
>>> thread wait for data on the queue to be written, doing scatter
>>> writes when more than one entry is on the queue at a given time. This
>>> approach has its advantages and challenges, as you well know.
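>>>
>>> A bare-bones sketch of such a queue-draining writer thread (the queue
>>> type, the channel, and the OP_WRITE handling are assumed/omitted):
>>>
>>>     BlockingQueue<ByteBuffer> outbound = new LinkedBlockingQueue<ByteBuffer>();
>>>
>>>     // writer thread: drain whatever is queued and write it in one gathering call
>>>     List<ByteBuffer> pending = new ArrayList<ByteBuffer>();
>>>     pending.add(outbound.take());              // block until at least one message is queued
>>>     outbound.drainTo(pending);                 // pick up anything that arrived in the meantime
>>>     long written = socketChannel.write(pending.toArray(new ByteBuffer[0]));
>>>     // if written == 0 the kernel buffer is full: wait for OP_WRITE before retrying (omitted)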
>>
>> Yes, the queue approach is what we do in Qpid.
>>
>> I actually spent a reasonable amount of time trying to get a
>> measurable improvement using scattering writes (and gathering reads
>> too) but I was unable to get any measurable improvement.
>
> Same here. For HTTP, I've tried scattering writes and it proved to be
> slower.
>
>>
>> We found some interesting characteristics of the ConcurrentLinkedQueue
>> when looking at this. It didn't seem to perform too well when the
>> queue was often empty, so in the end I believe we actually just used a
>> standard LinkedList with synchronized blocks around the accessors.
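>>
>> I.e. something as plain as this, roughly (not the actual Qpid code):
>>
>>     private final LinkedList<ByteBuffer> queue = new LinkedList<ByteBuffer>();
>>
>>     public void enqueue(ByteBuffer buf) {
>>         synchronized (queue) {
>>             queue.addLast(buf);
>>         }
>>     }
>>
>>     public ByteBuffer poll() {
>>         synchronized (queue) {
>>             return queue.isEmpty() ? null : queue.removeFirst();
>>         }
>>     }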
>
> Interesting... our default thread pool in Grizzly (the one that
> performs the best) is based on a LinkedList. I've tried many
> java.util.concurrent approaches but they were always slower.
>
>>
>>> And there's also the approach where a thread that has formulated a
>>> response or constructed an outbound message simply invokes the
>>> connection write directly.
>>
>> Presumably it would then have to deal with issues like the kernel
>> buffer being full etc.
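>>
>> For example, roughly (a sketch, not code from either project):
>>
>>     socketChannel.write(byteBuffer);
>>     if (byteBuffer.hasRemaining()) {
>>         // the kernel buffer filled up: register interest in OP_WRITE and finish the write later
>>         selectionKey.interestOps(selectionKey.interestOps() | SelectionKey.OP_WRITE);
>>         selectionKey.selector().wakeup();
>>     }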
>>
>>> Performance testing message brokers is a little different ;-)
>>> Throughput and scalability are both equally crucial.
>>
>> Yes, and for some applications latency can be an issue and getting
>> that right can be a challenge. Other message brokers we tested came
>> close to Qpid in some throughput tests but had horrific latency.
>>
>> One other thing we found with Qpid was that direct buffers were
>> significantly slower than heap buffers and that pooling buffers
>> (something that MINA can do) was counterproductive if you use heap
>> buffers. Do you use heap or direct buffers?
>
> We use views (slices) of a large heap ByteBuffer. We have reported the
> issue to the JDK team, and 1.7 will fix the problem you (and we) are
> seeing with direct byte buffers.
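>
> Roughly like this (a simplified sketch, not the actual Grizzly code;
> nextFree and the sizes are just placeholders):
>
>     ByteBuffer master = ByteBuffer.allocate(512 * 1024);   // one large heap buffer
>     master.position(nextFree);                             // nextFree is tracked elsewhere (assumed)
>     master.limit(nextFree + 16 * 1024);                    // a 16 KB window for this connection
>     ByteBuffer view = master.slice();                      // shares the backing array, no copy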
>
> BTW, the cometd sub-project [2] is a kind of message bus. I've already
> looked at Qpid to see if I can use it under the hood. I'm swamped
> these days, but this is something I would like to explore :-)
>
> -- Jeanfrancois
>
> [1]
> http://weblogs.java.net/blog/jfarcand/archive/2006/07/tricks_and_tips_4.html
>
> [2]
> http://weblogs.java.net/blog/jfarcand/archive/2007/02/gcometd_introdu_1.html
>
>
>
>>
>> RG
>>
>
>
--
Charlie Hunt
Java Performance Engineer
630.285.7708 x47708 (Internal)
<http://java.sun.com/docs/performance/>