Re: Read/write concurrency questions

From: charlie hunt <charlie.hunt_at_sun.com>
Date: Fri, 06 Jul 2007 07:41:53 -0500

Jeanfrancois Arcand wrote:
> Hi,
>
> D. J. Hagberg (Sun) wrote:
>> So I think I have my asynch. write setup working correctly once I
>> started trying to ensure that only a single task was able to do SSL
>> writes to a single SelectionKey at any one point in time. I've
>> stepped through the SelectorHandler/ProtocolChain/SSLReadFilter code
>> enough now that I'm convinced it's only possible for it to do a
>> single read on a SelectionKey at any one point in time, too.
>>
>> But now I'm puzzling over a few places in the Grizzly code that are
>> starting to look strange to me. The main one is this:
>>
>> In the main loop on Controller.doSelect is this chunk:
>>
>> if (key.isValid()) {
>> if ((key.readyOps() & SelectionKey.OP_ACCEPT)
>> == SelectionKey.OP_ACCEPT){
>> // ... accept-handling stuff
>> } else if ((key.readyOps() & SelectionKey.OP_READ)
>> == SelectionKey.OP_READ) {
>> delegateToWorkerThread = selectorHandler.
>> onReadInterest(key,serverCtx);
>> } else if ((key.readyOps() & SelectionKey.OP_WRITE)
>> == SelectionKey.OP_WRITE) {
>> delegateToWorkerThread = selectorHandler.
>> onWriteInterest(key,serverCtx);
>> } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
>> == SelectionKey.OP_CONNECT) {
>> delegateToWorkerThread = selectorHandler.
>> onConnectInterest(key,serverCtx);
>> }
>>
>> if (delegateToWorkerThread){
>> Context context = pollContext(key);
>> context.setProtocol(selectorHandler.protocol());
>> context.setPipeline(selectorHandler.pipeline());
>> context.execute();
>> }
>> } else {
>> selectionKeyHandler.cancel(key);
>> }
>>
>> It looks to me like the (probably unlikely) case where a SelectionKey
>> has *both* a read-ready and a write-ready state is not handled here.
>> It looks like OP_READ will *always* win and
>> SelectorHandler.onWriteInterest will not be called until 1000ms later
>> or the next time the Selector is being woken up and, only then, if
>> there is not an outstanding OP_READ on the socket.
>>
>> I noticed this because when I put my application under significant
>> load I started building up a significant queue of outstanding write's.
>>
>> Is this a correct analysis?
>
> I'm not sure it is a bug as if you spawn two threads (let say there is
> no if/else in the code above), it means the OP_READ and OP_WRITE might
> be concurrently proceeded. Can it cause problems? I've no object to
> change the Controller code, but I just want to make sure it will not
> cause more problems.

If we consider the (basic) two configurations of handle OP events,
handled by a worker thread or by a selector thread ....

If you have a no if / else when delegating to a worker thread for read /
write OP events, the multiple OP events would be handled (almost)
concurrently since the OP_READ will be delegated to a worker thread
immediately and within a few cpu instructions the OP_WRITE will be
delegated to a different worker thread. This likely means that you may
be reading and writing to the same SocketChannel at the same time. For
non-SSL communications this is ok. But, I don't know if that will
present any issues when using SSL. It shouldn't, but I thought it is
worth asking.

Btw, the delegating OP_READ and OP_WRITE to worker threads is the
default configuration.

If the configuration is such that OP_READ / OP_WRITE are not delegated,
having a no if / else means that the OP_READ will finish first and then
the OP_WRITE will execute. Although I can't think of a situation where
one would run such a configuration, it is possible to configure.

At any rate, with the current if / else structure, the OP_WRITE will
either have to wait for the next cycle of Selector.select() which will
fall through right away since OP_WRITE hasn't been disabled. But, with
just an if structure the OP_WRITE would be executed without having to
incur the additional Selector.select() cycle. From a performance
perspective, if there's no additional OP events on that next
Selector.select() cycle, we could be spending some precious system /
kernel cpu time in that Selector.select() that could be avoided with the
non if / else structure.

If you make the change to using the non if / else structure, you might
wanna do a quick Grizzly / Faban performance test, but I think more
importantly think about any potential issues of using that structure
given the alternative "delegate to worker thread" versus "execute in the
Selector thread" configuration for each of the OP events.

At the moment, I'm thinking the non if / else structure may be a more
"fair" approach and may be slightly better performing. But, I'm also
thinking I might have overlooked something in the "delegate to worker
thread" versus "execute in the Selector thread" configuration using that
structure.

charlie ...

>
> Thanks
>
> -- Jeanfrancois
>
>
>>
>> Thanks,
>>
>> -=- D. J.
>>
>> PS, For details, I implemented my async write handling by having my
>> own class that overrides TCPSelectorHandler with its own:
>>
>> public boolean onWriteInterest(SelectionKey key, Context ctx) {
>> // thread-safe call to set up a latch that prevents calls to
>> // SelectorHandler.register(key,OP_WRITE) until write task
>> // has completed.
>> myTaskHandler.flagWriteReady(key);
>>
>> // disable OP_WRITE on key before submitting to task queue
>> key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));
>>
>> // submit write task to its own Pipeline
>> myTaskHandler.submitWriteTask(key);
>> }
>>
>> Then in any background tasks I have that post messages to the write
>> queue, I:
>>
>> - make a synchronized call into the task handler that sets a
>> "needsWrite" flag. That call will ONLY be successful if there is
>> currently a task waiting to start (based on the call to
>> flagWriteReady above) or if a write task is currently running.
>>
>> - if the call to set needsWrite was unsuccessful, it makes a call to
>> SelectorHandler.register(key, OP_WRITE) to make sure the Selector is
>> woken up and tries to look for OP_WRITE on the key.
>>
>> - an async write task runs in its own pipeline that does
>> message-to-bytes encoding and then bytes-to-SSL encoding. At the end
>> of that task in a finally block, it does an atomic check/reset of the
>> needsWrite flag and, if so, calls back to
>> SelectorHandler.register(key, OP_WRITE) to start the process over again.
>>
>> I haven't yet been able to streamline the logic more than this. It
>> seems a bit convoluted, but does test correctly now. The unfortunate
>> side-effect is that it occasionally means that an async write task is
>> created and started even though the last one actually emptied the
>> write queue... It's a "safe" operation, but adds unnecessary overhead.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
>> For additional commands, e-mail: users-help_at_grizzly.dev.java.net
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
> For additional commands, e-mail: users-help_at_grizzly.dev.java.net
>

-- 
Charlie Hunt
Java Performance Engineer
630.285.7708 x47708 (Internal)
<http://java.sun.com/docs/performance/>