Re: Asynchronous request processing for ... RMI

From: Ken Cavanaugh <kencavanaugh_at_mac.com>
Date: Wed, 07 Jul 2010 00:45:39 -0700

On Jul 6, 2010, at 2:22 AM, Leo Romanoff wrote:

>
>>>> It would certainly be possible to implement async RMI-IIOP directly at the
>>>> level of
>>>> the ORB, using syntax similar to that supported by the async EJB
>>>> feature.
>>>> A good implementation should avoid blocking threads while waiting for a
>>>> response
>>>> (even an internal implementation thread is probably too much in some
>>>> cases).
>>>> The basic need here is to modify the dynamic stub generator slightly,
>>>> introducing a new
>>>> API for the async case.
>>>
>>>> The hard part is modifying the client ORB code to save and restore the
>>>> per-request state.
>>>
>>> Can you elaborate a bit more on this? I'm not so deep into RMI
>>> implementations (yet ;-) to understand the problem that you describe
>>> here.
>
>> Actually I was thinking about the client side above, which is interesting,
>> but
>> the server side is much more important for scalability considerations.
>> But I'm not sure how to do this: what is your long-running server method
>> doing?
>> Is it computing something expensive, in which case a thread is required
>> anyway?
>> Or is it making further remote requests (perhaps expensive database
>> queries?)
>
> In my case, it mostly makes further remote requests.
> But, IMHO, it does not and should not matter. Even if it would compute
> something expensive, I'd like to do it on a different thread pool which is
> under my control. So, yes, the thread will still be occupied, but it would be
> my thread behaving according to my policy, which in my case is a big
> advantage over e.g. (EJB) container managed threads. What I want to stress
> is that it is not only pure optimization. It is also about who controls
> multi-threading and processing.

OK, but in that case no optimization is possible. Basically, what happens in
RMI-IIOP is that there is a dispatch operation to the skeleton (we use a reflective
implementation), and the skeleton dispatches to the actual method. It is likely
possible to do something like the following (this is a VERY rough sketch):

Determine whether the method (or perhaps the entire interface) is async. If so:
  - Obtain a dispatch thread through some extensible mechanism, such as a callback.
    This needs to be user-definable. Typically this is done through a POA policy.
    Other mechanisms are possible.
  - Save the important state of the request (mainly some thread locals, and the
    connection used for the request).
  - Create a dispatch object representing this state, which supports continuing
    the dispatch.
  - Hand the dispatch object to the thread, and start it running the dispatch method.
  - Save the object in the ORB server code, for monitoring and management, and also
    for handling the response.
When the thread completes execution:
  - Invoke a completion method on the dispatch object, which will:
      - Find the dispatch object in the server's registry and update it.
      - Get the reply (exception or normal) from the dispatch object.
      - Handle various pieces of state bookkeeping (POA/POAImpl state management,
        portable interceptors).
      - Create a response message, and send it back on the same connection on which
        the request was received.
      - The connection can be released at this point (which allows for the case
        where multiple pending requests exist on the same connection).
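
For illustration only, the steps above might look roughly like this in Java. This is a minimal sketch and not ORB code: the names AsyncDispatchSketch, Connection, PendingDispatch and Server are hypothetical, a plain java.util.concurrent.Executor stands in for the user-definable thread mechanism, a Callable stands in for the skeleton's call to the actual method, and saving and restoring thread locals is left out.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicLong;

public class AsyncDispatchSketch {

    /** Hypothetical stand-in for the connection the request arrived on. */
    interface Connection {
        void sendReply(long requestId, Object result, Throwable error);
    }

    /** Hypothetical dispatch object: holds per-request state and continues the dispatch. */
    static final class PendingDispatch implements Runnable {
        private final long requestId;
        private final Connection connection;    // saved so the reply uses the same connection
        private final Callable<Object> target;  // stands in for the skeleton's method call
        private final Map<Long, PendingDispatch> registry;

        PendingDispatch(long requestId, Connection connection,
                        Callable<Object> target, Map<Long, PendingDispatch> registry) {
            this.requestId = requestId;
            this.connection = connection;
            this.target = target;
            this.registry = registry;
        }

        /** Runs on whatever thread the user-supplied Executor chooses. */
        @Override
        public void run() {
            Object result = null;
            Throwable error = null;
            try {
                result = target.call();
            } catch (Throwable t) {
                error = t;  // an exceptional reply is still a reply
            }
            complete(result, error);
        }

        /** Completion step: update the registry, then reply on the saved connection. */
        private void complete(Object result, Throwable error) {
            registry.remove(requestId);
            // A real ORB would do its POA and portable interceptor bookkeeping here.
            connection.sendReply(requestId, result, error);
        }
    }

    /** Hypothetical server-side entry point that tracks pending dispatches. */
    static final class Server {
        final Map<Long, PendingDispatch> pending = new ConcurrentHashMap<>();
        private final AtomicLong nextId = new AtomicLong();
        private final Executor dispatchThreads;

        Server(Executor dispatchThreads) {
            this.dispatchThreads = dispatchThreads;
        }

        long dispatchAsync(Connection connection, Callable<Object> target) {
            long id = nextId.incrementAndGet();
            PendingDispatch dispatch = new PendingDispatch(id, connection, target, pending);
            pending.put(id, dispatch);          // registered before it starts running
            dispatchThreads.execute(dispatch);  // thread policy is entirely the user's
            return id;
        }
    }

    public static void main(String[] args) {
        // Runnable::run executes on the calling thread, which keeps the demo deterministic.
        Server server = new Server(Runnable::run);
        server.dispatchAsync(
            (id, result, error) -> System.out.println("reply " + id + ": " + result),
            () -> "pong");
    }
}
```

Passing a different Executor (for example a fixed-size pool) changes which threads run the dispatch without touching the rest of the sketch.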

This then would allow you to control how dispatch threads are managed. You could also do things like refuse to
issue a thread, in which case some sort of exception should be sent back to the client, perhaps indicating to
try again later, or some other policy.
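
As a rough illustration of the "refuse to issue a thread" case, the sketch below uses a standard ThreadPoolExecutor with one thread and no queue, so a second request is rejected while the first is still running. The RefusalPolicySketch class, the Outcome enum and the helper methods are hypothetical; a real ORB would presumably marshal some system exception back to the client rather than return an enum value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RefusalPolicySketch {

    /** Hypothetical outcome of asking the user's policy for a dispatch thread. */
    enum Outcome { ACCEPTED, RETRY_LATER }

    /** One worker thread and no queue: a busy pool rejects new work immediately. */
    static ThreadPoolExecutor singleThreadNoQueue() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS, new SynchronousQueue<>());
    }

    /** Tries to hand the dispatch to the pool, mapping a rejection to RETRY_LATER. */
    static Outcome tryDispatch(ThreadPoolExecutor pool, Runnable dispatch) {
        try {
            pool.execute(dispatch);
            return Outcome.ACCEPTED;
        } catch (RejectedExecutionException e) {
            // This is where the server would send a "try again later" reply.
            return Outcome.RETRY_LATER;
        }
    }

    /** Blocks until the latch opens, standing in for a long-running server method. */
    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = singleThreadNoQueue();
        CountDownLatch release = new CountDownLatch(1);
        Outcome first = tryDispatch(pool, () -> awaitQuietly(release));
        Outcome second = tryDispatch(pool, () -> { });
        System.out.println(first + " " + second);  // prints "ACCEPTED RETRY_LATER"
        release.countDown();
        pool.shutdown();
    }
}
```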

Note that this is an ORB level solution, which would still support things like failover and other enterprise features.
It is NOT an EJB-level solution. I can't say what this would look like in the EJB case.

>> I've seen some of this, but I think they generally require extensive
>> bytecode
>> transformation across the entire execution path, which may be impractical,
>
> Well, those transformations are usually done just once (sometimes even
> statically) and then re-used. So, I'd assume it is somewhat comparable to
> run-time stub generation. But of course some performance testing would be
> useful here.

Neither performance nor the complexity of the bytecode transformation concerns me here.
I've done quite a bit of bytecode manipulation in the ORB and its libraries for various reasons.
I am concerned about either finding all of the code that requires transformation at build
time (GF 3.x is based on OSGi bundles, and many of the bundles come from entirely
different workspaces and builds) or integrating another ClassLoader or ClassTransformer into GF.

>
>> There is also Javaflow, which could be good enough eventually.
>
> Javaflow is a pure java library. It does all its byte-code instrumentation
> at run-time, more precisely at class loading time, AFAIK. Its performance
> is probably not the best for very compute-intensive tasks, but for remote
> invocations it could eventually be OK. Plus, it is not optimized at all yet
> (i.e. it instruments too many call sites, even if it is not necessary). One
> day, I tried to optimize it just for fun, and by the end of the day, the
> overhead was reduced by at least 50%. And I'm sure it can be improved even
> further.
>
> Kilim uses static instrumentation at the moment. But it does very good
> analysis of control and data flow and really optimizes the amount of data to
> be stored for a full continuation. It also delivers quite impressive
> performance.
>

Interesting. You've spent more time looking at continuation support in Java than I have.
But the bottom line on continuations is basically the complexity of adding the support to something
like GF. It might be more doable at just the ORB level.

Thanks,

Ken.