users@grizzly.java.net

Read/write concurrency questions

From: D. J. Hagberg (Sun)
Date: Thu, 05 Jul 2007 16:43:20 -0600

So I think I have my asynchronous write setup working correctly, now
that I ensure that only a single task can do SSL writes to a given
SelectionKey at any one point in time. I've also stepped through the
SelectorHandler/ProtocolChain/SSLReadFilter code enough to convince
myself that only a single read can happen on a SelectionKey at any one
point in time, too.

But now I'm puzzling over a few places in the Grizzly code that are
starting to look strange to me. The main one is this:

In the main loop of Controller.doSelect is this chunk:

     if (key.isValid()) {
         if ((key.readyOps() & SelectionKey.OP_ACCEPT)
                 == SelectionKey.OP_ACCEPT){
             // ... accept-handling stuff
         } else if ((key.readyOps() & SelectionKey.OP_READ)
                 == SelectionKey.OP_READ) {
             delegateToWorkerThread = selectorHandler.
                     onReadInterest(key,serverCtx);
         } else if ((key.readyOps() & SelectionKey.OP_WRITE)
                 == SelectionKey.OP_WRITE) {
             delegateToWorkerThread = selectorHandler.
                     onWriteInterest(key,serverCtx);
         } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
                 == SelectionKey.OP_CONNECT) {
             delegateToWorkerThread = selectorHandler.
                     onConnectInterest(key,serverCtx);
         }

         if (delegateToWorkerThread){
             Context context = pollContext(key);
             context.setProtocol(selectorHandler.protocol());
             context.setPipeline(selectorHandler.pipeline());
             context.execute();
         }
     } else {
         selectionKeyHandler.cancel(key);
     }

It looks to me like the (probably unlikely) case where a SelectionKey
is *both* read-ready and write-ready is not handled here. Because of
the else-if chain, OP_READ will *always* win, and
SelectorHandler.onWriteInterest will not be called until the next
select pass, up to 1000 ms later or whenever the Selector is next woken
up, and even then only if there is no outstanding OP_READ on the socket.
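
For illustration, here is a minimal sketch (my own rewrite, not actual
Grizzly code; accept/connect branches omitted) of how the ready bits
could be tested independently, so that a key that is simultaneously
read- and write-ready triggers both handlers in the same select pass:

     if (key.isValid()) {
         final int readyOps = key.readyOps();
         // Test each ready bit on its own instead of chaining else-if,
         // so a key that is both read- and write-ready is fully handled.
         if ((readyOps & SelectionKey.OP_READ) != 0) {
             delegateToWorkerThread |= selectorHandler.
                     onReadInterest(key, serverCtx);
         }
         if ((readyOps & SelectionKey.OP_WRITE) != 0) {
             delegateToWorkerThread |= selectorHandler.
                     onWriteInterest(key, serverCtx);
         }
     } else {
         selectionKeyHandler.cancel(key);
     }

The |= is there so that neither handler's delegation decision is lost
when both bits are set.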

I noticed this because, when I put my application under heavy load, I
started building up a significant queue of outstanding writes.

Is this a correct analysis?

Thanks,

                        -=- D. J.

PS: For details, I implemented my async write handling with my own
class that extends TCPSelectorHandler and overrides:

     public boolean onWriteInterest(SelectionKey key, Context ctx) {
         // thread-safe call to set up a latch that prevents calls to
         // SelectorHandler.register(key, OP_WRITE) until the write task
         // has completed.
         myTaskHandler.flagWriteReady(key);

         // disable OP_WRITE on the key before submitting to the task queue
         key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));

         // submit the write task to its own Pipeline
         myTaskHandler.submitWriteTask(key);

         // the write task runs in its own Pipeline, so the Controller
         // does not need to delegate this key to a worker thread
         return false;
     }

Then in any background task that posts messages to the write queue, I
do the following (sketched in code after this list):

- make a synchronized call into the task handler that sets a
"needsWrite" flag. That call will ONLY be successful if there is
currently a task waiting to start (based on the call to flagWriteReady
above) or if a write task is currently running.

- if the call to set needsWrite was unsuccessful, make a call to
SelectorHandler.register(key, OP_WRITE) to make sure the Selector is
woken up and looks for OP_WRITE on the key.

- an async write task runs in its own Pipeline, doing message-to-bytes
encoding and then bytes-to-SSL encoding. At the end of that task, in a
finally block, it does an atomic check-and-reset of the needsWrite flag
and, if the flag was set, calls back to SelectorHandler.register(key,
OP_WRITE) to start the process over again.
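
Purely to make that latch protocol concrete, here is a hypothetical
sketch of the state my task handler keeps per key (class and member
names are mine, invented for illustration; they are not Grizzly APIs):

     // Hypothetical per-key latch implementing the protocol above.
     final class WriteLatch {
         private boolean writePending;  // a write task is queued or running
         private boolean needsWrite;    // more data was posted meanwhile

         // Called from onWriteInterest before submitting the write task.
         synchronized void flagWriteReady() {
             writePending = true;
         }

         // Called by producers posting to the write queue. Returns true
         // if a task is queued/running and will pick the new data up;
         // false means the caller must register(key, OP_WRITE) itself.
         synchronized boolean trySetNeedsWrite() {
             if (writePending) {
                 needsWrite = true;
                 return true;
             }
             return false;
         }

         // Called in the write task's finally block: atomically resets
         // the latch and reports whether OP_WRITE must be re-registered.
         synchronized boolean finishAndCheckNeedsWrite() {
             writePending = false;
             boolean again = needsWrite;
             needsWrite = false;
             return again;
         }
     }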

I haven't yet been able to streamline the logic more than this. It
seems a bit convoluted, but it does test correctly now. The unfortunate
side-effect is that an async write task is occasionally created and
started even though the previous one already emptied the write
queue... It's a "safe" operation, but it adds unnecessary overhead.