Thoughts on the Connection Caching tutorial

From: Ken Cavanaugh <Ken.Cavanaugh_at_Sun.COM>
Date: Wed, 17 Oct 2007 13:31:19 -0700

Charlie,

Here is part of what I think is missing from the Tutorial. I have added parenthetical
remarks <<<like this>>>
wherever I know other people need to add comments or help with
the details. I know exactly how the connection caches work, but I am not so
certain about the integration with Grizzly.

The tutorial also needs to add calls to some of the connection cache methods
that are currently missing.

Connection Caching in Grizzly

Grizzly provides connection caching for those protocols that require it.
Connection caching is available for both client-side/connector use (the
outbound connection cache), and server-side/acceptor use (the inbound
connection cache). Using the connection cache requires some attention to
the interaction between the protocol and the Grizzly transport, because
the transport layer does not have enough information to completely
control the connection cache. We'll examine the issues for the inbound and
outbound caches separately. First, we need to look at some common
characteristics of protocols supported by the connection cache.

Protocol Issues

Many protocols have a number of similar requirements:

A single request may require sending several separate message fragments on a connection. This is to allow more efficient buffering when sending large requests.
A connection may be shared by several simultaneous requests. In the most complex case, the protocol may allow interleaving fragments from several messages.
A protocol may expect that a response for a request uses the same connection that the request used. This is certainly not always the case: message oriented middleware generally just sends a message, and there may not even be a direct response.

These requirements have a strong impact on the design of the cache. In particular, IIOP
has all 3 requirements.

The Outbound Connection Cache

The outbound connection cache needs to provide a means to obtain a connection
to a transport endpoint (internally referred to as a ContactInfo), and also
needs to manage the connection cache to avoid holding onto too many open
connections. But closing connections can be dangerous: the cache cannot close
a connection that is still in use. This leads to the following basic API:

Connection get( ContactInfo cinfo )
void release( Connection conn, int numResponseExpected )
void responseReceived( Connection conn )

The get method is used to obtain a connection. It may reuse an existing connection,
or create a new one, according to the following algorithm:

If there is an idle connection for cinfo (that is, one which has been released as many times as it has been returned from get), return it.
Otherwise, create a new connection, if there are not already too many connections for the ContactInfo (this is controlled by configuration parameters).
Otherwise, return a busy connection (one that is not idle).

In addition, get will always return a valid connection (even if this causes the configuration
parameters to be temporarily violated), UNLESS it cannot open a connection for the
ContactInfo, in which case an IOException is thrown.
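
The selection order used by get can be sketched roughly as follows. This is only an illustration of the three rules above, not the actual cache code: SketchConnection, SketchOutboundCache, the maxPerContactInfo limit, and the use of a String in place of ContactInfo are all hypothetical, and the choice of the least busy connection in the last step is an arbitrary assumption.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a cached connection; not a Grizzly class.
final class SketchConnection {
    final String contactInfo;
    int busyCount = 0;   // times returned from get() minus times released

    SketchConnection(String contactInfo) { this.contactInfo = contactInfo; }

    boolean isIdle() { return busyCount == 0; }
}

// Hypothetical cache showing only the selection order used by get().
final class SketchOutboundCache {
    private final int maxPerContactInfo;   // assumed configuration parameter
    private final Map<String, List<SketchConnection>> conns = new HashMap<>();

    SketchOutboundCache(int maxPerContactInfo) { this.maxPerContactInfo = maxPerContactInfo; }

    // Stands in for opening a socket; an empty endpoint simulates a failure.
    private SketchConnection open(String contactInfo) throws IOException {
        if (contactInfo == null || contactInfo.isEmpty()) {
            throw new IOException("cannot open a connection for: " + contactInfo);
        }
        return new SketchConnection(contactInfo);
    }

    synchronized SketchConnection get(String contactInfo) throws IOException {
        List<SketchConnection> list = conns.computeIfAbsent(contactInfo, k -> new ArrayList<>());
        SketchConnection chosen = null;
        // 1. Prefer an idle connection.
        for (SketchConnection c : list) {
            if (c.isIdle()) { chosen = c; break; }
        }
        // 2. Otherwise open a new one if under the limit. An empty list always
        //    opens one, so get() never returns null even with a limit of 0.
        if (chosen == null && (list.isEmpty() || list.size() < maxPerContactInfo)) {
            chosen = open(contactInfo);
            list.add(chosen);
        }
        // 3. Otherwise share a busy connection (least busy is an arbitrary choice).
        if (chosen == null) {
            chosen = list.get(0);
            for (SketchConnection c : list) {
                if (c.busyCount < chosen.busyCount) chosen = c;
            }
        }
        chosen.busyCount++;
        return chosen;
    }

    // Simplified: the numResponseExpected argument is left out of this sketch.
    synchronized void release(SketchConnection conn) {
        conn.busyCount--;
    }
}
```

With a limit of 2, the first two calls to get for the same endpoint open two connections, a third call shares one of them, and a call made after one of them is released returns the released (idle) connection.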

The basic per-request use of the cache is as follows:

Call get to obtain the Connection. In Grizzly, this is done in CacheableConnectorHandler.connect.
Send messages for the request.
Call release on the connection, specifying the number of expected responses (usually 0 or 1).
After all responses have been received, call responseReceived.

The calls to release and responseReceived MUST be handled in the Grizzly client, because
Grizzly knows nothing about the protocol details. Failure to call these methods properly
will result either in connections accumulating in the cache or in premature release of
connections.
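
The per-request steps above can be sketched as follows. Only the three method names (get, release, responseReceived) come from the API listed earlier; Conn, ClientSideCache, RecordingCache, and RequestSender are hypothetical names for illustration, a String stands in for ContactInfo, and error handling is omitted. RecordingCache merely records calls so that the ordering is visible; it does not manage real connections.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical connection type; not a Grizzly class.
interface Conn {
    void write(String fragment) throws IOException;
}

// Hypothetical interface mirroring the three methods listed earlier.
interface ClientSideCache {
    Conn get(String contactInfo) throws IOException;
    void release(Conn conn, int numResponseExpected);
    void responseReceived(Conn conn);
}

// Records calls instead of managing sockets, so the call order can be inspected.
final class RecordingCache implements ClientSideCache {
    final List<String> log = new ArrayList<>();
    int busy = 0;               // get() calls not yet matched by release()
    int pendingResponses = 0;   // responses still expected
    private final Conn conn = fragment -> log.add("write:" + fragment);

    public Conn get(String contactInfo) { busy++; log.add("get"); return conn; }

    public void release(Conn c, int numResponseExpected) {
        busy--;
        pendingResponses += numResponseExpected;
        log.add("release:" + numResponseExpected);
    }

    public void responseReceived(Conn c) { pendingResponses--; log.add("responseReceived"); }

    // In this sketch, a connection could be closed only when nothing is outstanding.
    boolean reclaimable() { return busy == 0 && pendingResponses == 0; }
}

final class RequestSender {
    // Steps 1 to 3: get the connection, send every fragment, then release.
    static Conn send(ClientSideCache cache, String contactInfo,
                     List<String> fragments, int expectedResponses) throws IOException {
        Conn conn = cache.get(contactInfo);
        for (String fragment : fragments) {
            conn.write(fragment);
        }
        cache.release(conn, expectedResponses);
        return conn;
    }

    // Step 4: called by the protocol code once a response has arrived.
    static void onResponse(ClientSideCache cache, Conn conn) {
        cache.responseReceived(conn);
    }
}
```

In this sketch, skipping release or responseReceived leaves the busy or pending count above zero, which corresponds to connections accumulating in the cache.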

In addition, it is possible for multiple threads in the client to share the same connection.
In this case, it is the client's responsibility to make sure that two threads do not attempt
to call write simultaneously. However, the release and responseReceived methods are
guaranteed to be thread safe.
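
If the transport does not already serialize writes, one way a client could do so is to guard each shared connection with a lock. The sketch below is only one possible approach: SharedConnectionWriter is a hypothetical wrapper, a list stands in for the socket, and it assumes that all fragments of a message must be written without interleaving. A protocol that permits interleaved fragments could take the lock per fragment instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper around a shared connection; not part of Grizzly.
final class SharedConnectionWriter {
    private final ReentrantLock writeLock = new ReentrantLock();
    private final List<String> wire = new ArrayList<>();   // stands in for the socket

    // Writes all fragments of one message while holding the lock, so that
    // fragments written by different threads are never interleaved.
    void writeMessage(List<String> fragments) {
        writeLock.lock();
        try {
            for (String fragment : fragments) {
                wire.add(fragment);
            }
        } finally {
            writeLock.unlock();
        }
    }

    // Returns a copy of everything written so far.
    List<String> snapshot() {
        writeLock.lock();
        try {
            return new ArrayList<>(wire);
        } finally {
            writeLock.unlock();
        }
    }
}
```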

<<<I'm having a lot of trouble figuring out what's going on in the code here: does Grizzly
already prevent this? This will be an issue for CORBA integration of the connection
handler>>>

<<<The tutorial code examples need to be modified to include the release
and responseReceived method calls>>>

<<<I am deliberately leaving out some details at this point: the ConnectionFinder and the
details of the connection cache configuration>>>

The Inbound Connection Cache

The inbound connection cache is similar but somewhat simpler, because server-side
connections are passively accepted, rather than being created at the client's request.
This results in a slightly different API, still with 3 methods:

void requestReceived( Connection conn )
void requestProcessed( Connection conn, int numResponseExpected )
void responseSent( Connection conn )

When a request is received, the requestReceived method must be called to
inform the cache about the connection (which may or may not already be in the
cache).

<<<this happens in CacheableSelectionKeyHandler.process, but I don't know
exactly how this gets called from the Grizzly user's perspective>>>

Once the user's code finishes reading the message from the connection,
the user needs to call requestProcessed, indicating the number of
responses we expect to send (again, usually 0 or 1).
Once the user's code finishes sending the responses, it must call responseSent.
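
The server-side call order described above can be sketched as follows. The three method names come from the API listed earlier; ServerSideCache, RecordingServerCache, and ServerSideHandler are hypothetical names, a String stands in for Connection, and the whole request is handled synchronously in one method purely to keep the example short. Real protocol code may read fragments and send the response at different times and on different threads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface mirroring the three server-side methods.
interface ServerSideCache<C> {
    void requestReceived(C conn);
    void requestProcessed(C conn, int numResponseExpected);
    void responseSent(C conn);
}

// Records calls so the ordering can be inspected; a String stands in for a connection.
final class RecordingServerCache implements ServerSideCache<String> {
    final List<String> calls = new ArrayList<>();

    public void requestReceived(String conn) { calls.add("requestReceived"); }

    public void requestProcessed(String conn, int numResponseExpected) {
        calls.add("requestProcessed:" + numResponseExpected);
    }

    public void responseSent(String conn) { calls.add("responseSent"); }
}

final class ServerSideHandler {
    private final ServerSideCache<String> cache;

    ServerSideHandler(ServerSideCache<String> cache) { this.cache = cache; }

    // Handles one request. Returns the response, or null for a one-way request.
    String handle(String conn, List<String> fragments, boolean oneway) {
        // Tell the cache about the connection as soon as a request arrives.
        cache.requestReceived(conn);

        // Read every fragment of the message before reporting it as processed.
        StringBuilder message = new StringBuilder();
        for (String fragment : fragments) {
            message.append(fragment);
        }
        cache.requestProcessed(conn, oneway ? 0 : 1);

        if (oneway) {
            return null;   // no response expected, so responseSent is not called
        }
        String response = "echo:" + message;
        // Real code would write the response to the connection here.
        cache.responseSent(conn);
        return response;
    }
}
```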

<<<I see that both requestProcessed and responseSent are called from
CacheableSelectionKeyHandler.postProcess. Is that actually correct?
In the CORBA case, we don't want to call requestProcessed until we
have read all of the message fragments, and we can't call responseSent
until we have asynchronously processed the request and sent the response,
so I can't see how this can be called from postProcess.>>>

<<<I am not sure that we actually need both requestProcessed and responseSent.
We may be able to eliminate one method here.>>>

That's about all I have for now. Let's discuss this by
email. Harsha will need to look at this in detail after he finishes
getting the tests running in the current version, and integrating with SSL (which
may take some time).

Ken.