users@jersey.java.net

[Jersey] Re: Async example misleading?

From: Marek Potociar <marek.potociar_at_oracle.com>
Date: Fri, 12 Sep 2014 21:43:41 +0200

You mean something like this: https://github.com/jersey/jersey/tree/master/examples/rx-client-webapp ?

The code is still steaming-fresh, and I'm sure that Michal, who designed it, will appreciate any feedback... ;-)

Marek


On 07 Sep 2014, at 17:29, Mikael Ståldal <mikael.staldal_at_appearnetworks.com> wrote:

> Making each and every operation async is similar to what's called Reactive programming (http://www.reactivemanifesto.org/).
>
> That would be facilitated by an easy way to connect an async method in Jersey-server with an async outbound call with Jersey-client.
>
> I guess what's needed is a way to automatically resume an async server method with the completion of a Future. Like what you can do with Play Framework.
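A minimal sketch of the idea above (resuming a suspended server method when a Future completes), using only the JDK. The `Resumable` interface is a hypothetical stand-in for JAX-RS `AsyncResponse`, and `CompletableFuture` is swapped in for a plain `Future` because a plain `Future` offers no completion callback. This is not the Jersey API, just an illustration of the wiring.

```java
import java.util.concurrent.CompletableFuture;

public class ResumeOnCompletion {

    /** Hypothetical stand-in for a suspended response such as JAX-RS AsyncResponse. */
    public interface Resumable {
        void resume(Object result);
        void resume(Throwable error);
    }

    /**
     * Registers a callback that resumes the suspended response as soon as
     * the future completes. No thread blocks while waiting for the result.
     */
    public static <T> void resumeWhenDone(CompletableFuture<T> future, Resumable response) {
        future.whenComplete((value, error) -> {
            if (error != null) {
                response.resume(error);   // propagate the failure to the client
            } else {
                response.resume(value);   // send the computed result
            }
        });
    }
}
```

In a real resource method the future would typically come from an asynchronous outbound call, and the method would return right after registering the callback.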
>
> On Fri, Sep 5, 2014 at 8:16 PM, Kishore Senji <ksenji_at_gmail.com> wrote:
> Thank you Marek.
>
> I agree with all your points, and I did not say there is no advantage to async. I'm only pointing out that the example might lead users to think that throughput increases just by moving the processing to a different thread. Throughput would only increase when the veryExpensiveOperation() method is actually doing async IO. If it is only CPU bound, then yes, the container threads can take more requests (and they can also serve other resource methods), but the requests to this resource method will still be queued up and the worker threads will all be busy working (spiking CPU), which will impact the overall system performance. [Typically we have a few methods related to a domain deployed to a pool. To the client they can all appear under one endpoint, internally routed to the appropriate pool via an ESB.] Even if veryExpensiveOperation() is IO bound, the worker threads are blocked waiting for the IO. This will queue up the tasks, and the worker threads cannot do any more work while they are blocked. This pool of workers then cannot be used for other resource methods (say they are also async but have a different profile of relatively short CPU-bound tasks or quick IO), so those may have to be configured to use a different thread pool, etc.
>
> In short, only when each and every operation in the call stack is async (the servlet needs to be async capable, and the database driver needs to support async or the outbound service call needs to use async IO) can we get throughput benefits (and support the same volume of traffic with fewer VMs). Otherwise, having async at one layer (Jersey/servlet) will not help when the actual database/service call is blocking.
>
> Thanks,
> Kishore.
>
>
> On Fri, Sep 5, 2014 at 9:30 AM, Marek Potociar <marek.potociar_at_oracle.com> wrote:
>
> On 04 Sep 2014, at 21:27, Kishore Senji <ksenji_at_gmail.com> wrote:
>
>> Hi All,
>>
>> The Async example is given at https://jersey.java.net/documentation/latest/async.html
>>
>> "However, in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used. In this model, the association between a request processing thread and client connection is broken. I/O container that handles incoming request may no longer assume that a client connection can be safely closed when a request processing thread returns. Instead a facility for explicitly suspending, resuming and closing client connections needs to be exposed. Note that the use of server-side asynchronous processing model will not improve the request processing time perceived by the client. It will however increase the throughput of the server, by releasing the initial request processing thread back to the I/O container while the request may still be waiting in a queue for processing or the processing may still be running on another dedicated thread. The released I/O container thread can be used to accept and process new incoming request connections."
>>
>> If veryExpensiveOperation() is expensive and takes a long time, how would running it in a different thread and releasing the request processing thread back to the I/O container improve the throughput?
>
> You are off-loading the I/O container threads, which are typically taken from a limited thread pool. If an I/O processing thread is blocked waiting, it cannot process new connections.
>
>>
>> If that is the case, we could just as well increase the number of request processing threads of the I/O container by the number of worker threads that the example would use, and not worry about async at all.
>
> Please note that different resource methods may have different requirements. You typically want to configure your I/O thread pool size to match the number of CPU cores (or sometimes CPU cores + c, where c is a constant smaller than the number of cores). And then you want to make sure that only short computations are performed on these threads, so typically anything that may involve an I/O operation (disk, db, network) is better coded as async, where the thread context switch cost is offset by the overall operation cost (see also here). These operations also tend to have specific execution characteristics, so a dedicated thread pool with a separately tuned pool size is required to fine-tune the performance of the system.
>
> So the advantage of using the async API is that it gives you much more fine-grained control over when an operation is delegated to a different thread pool, as well as over which thread pool it is delegated to. This is in contrast with your "one size fits all" approach, which does nothing other than introduce a high probability of L1, L2 and L3 cache misses with every new request.
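The per-operation choice of thread pool described above might look roughly like this in plain JDK code. The pool names, the `4 * CORES` multiplier and the method names are assumptions made for illustration only; real sizes would have to come from measurement.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class DedicatedPools {

    /** Creates a fixed-size pool whose daemon threads carry a recognizable name. */
    static ExecutorService namedPool(String name, int size) {
        AtomicInteger counter = new AtomicInteger();
        return Executors.newFixedThreadPool(size, runnable -> {
            Thread t = new Thread(runnable, name + "-" + counter.incrementAndGet());
            t.setDaemon(true); // do not keep the JVM alive for this sketch
            return t;
        });
    }

    static final int CORES = Runtime.getRuntime().availableProcessors();

    // Short CPU-bound work: roughly one thread per core.
    static final ExecutorService CPU_POOL = namedPool("cpu", CORES);

    // Blocking database calls: a separately tuned, larger pool (assumed multiplier).
    static final ExecutorService DB_POOL = namedPool("db", 4 * CORES);

    /** Runs on the CPU pool and reports which thread executed it. */
    public static CompletableFuture<String> compute() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), CPU_POOL);
    }

    /** Runs on the database pool and reports which thread executed it. */
    public static CompletableFuture<String> queryDb() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), DB_POOL);
    }
}
```

Each operation names its pool explicitly, so a slow class of work can be throttled or resized without touching the others.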
>
>> We can take more and more connections and have them queue up (or end up creating many worker threads), but that would not necessarily increase throughput. It would increase throughput if veryExpensiveOperation() is doing I/O over a socket and we use async IO for that operation; then we can use minimal request threads and a very small worker thread pool to do async handling of the IO (or combine logic across multiple service calls doing non-blocking IO, similar to Akka futures). This will improve the throughput, as more work is done. But without non-blocking IO, if veryExpensiveOperation() is either CPU bound or using blocking IO, then the worker thread would in fact be blocked for that time, and we would end up with a huge thread pool or a big queue of waiting tasks. A huge thread pool would not scale, and a big queue would also reduce the throughput.
>
> If you have an application where the only service is the veryExpensiveOperation() resource method, then the use of async is not likely to help. But frankly, how typical is that case? Often you have other services that would starve unnecessarily if you did not off-load the veryExpensiveOperation() to another thread pool.
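The starvation argument can be illustrated with a small JDK-only simulation. A single-thread executor stands in for the I/O container's pool and a latch stands in for the expensive work; both are assumptions of this sketch, not how Jersey is implemented.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class OffloadDemo {

    public static String run() {
        ExecutorService container = Executors.newFixedThreadPool(1); // stand-in for I/O container threads
        ExecutorService workers = Executors.newFixedThreadPool(2);   // dedicated pool for expensive work
        CountDownLatch release = new CountDownLatch(1);              // keeps the expensive work "running"
        try {
            CompletableFuture<String> slow = new CompletableFuture<>();

            // Expensive request: the container thread only hands the work off and returns.
            container.execute(() -> workers.execute(() -> {
                try {
                    release.await();
                    slow.complete("slow done");
                } catch (InterruptedException e) {
                    slow.completeExceptionally(e);
                }
            }));

            // Cheap request on the same single container thread: it is not starved.
            Future<String> fast = container.submit(() -> "fast done");
            String fastResult = fast.get(5, TimeUnit.SECONDS);
            boolean slowStillPending = !slow.isDone();

            release.countDown(); // let the expensive work finish
            return fastResult + " | slow still pending: " + slowStillPending
                    + " | " + slow.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            container.shutdownNow();
            workers.shutdownNow();
        }
    }
}
```

If the expensive work ran directly on the container thread instead, the cheap request would have to wait until the latch was released.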
>
>>
>> Nevertheless, we definitely need a way to move the processing to a different thread so that the container thread can be returned quickly. But is my understanding correct that whether throughput actually improves depends on what veryExpensiveOperation() does (blocking or non-blocking IO, or purely CPU-bound computation, etc.)?
>
> See above. I would say it does not depend on it. Obviously, in some cases (I/O) you would probably see better results than in others (CPU-intensive computation), and again it also depends on the overall context - other resources you need to serve, etc.
>
> Marek
>
> P.S. Interestingly, I've just been involved in a discussion where the problem is that in some complex distributed systems you may start seeing cycles in the call graph. If such a system is implemented using synchronous APIs, a high system load can lead to thread pool exhaustion, which then leads to an inevitable system deadlock. This is another reason why, especially with any remote IO, the use of async code is your best bet.
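The exhaustion scenario from the P.S. can be reproduced in miniature. In this sketch a one-thread pool stands in for an exhausted service pool and a nested submit stands in for the cyclic call; the class name and the 300 ms timeout are illustrative assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolExhaustion {

    /** Synchronous style: the outer task blocks on an inner task queued on the same pool. */
    public static boolean blockingCallCompletes() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> inner = () -> "inner";
            Callable<String> outer = () -> pool.submit(inner).get(); // blocks the only thread
            Future<String> result = pool.submit(outer);
            result.get(300, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the inner task can never be scheduled: deadlock
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Asynchronous style: the outer step chains the inner one instead of waiting for it. */
    public static String asyncCallCompletes() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            CompletableFuture<String> result = CompletableFuture
                    .supplyAsync(() -> "outer", pool)
                    .thenCompose(o -> CompletableFuture.supplyAsync(() -> o + "+inner", pool));
            return result.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The blocking variant never finishes, because the pool's only thread waits for work that needs that same thread. The chained variant releases the thread between the two steps and completes.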
>
>>
>> Thanks,
>> Kishore.
>
>
>
>
>
> --
> Mikael Ståldal
> Chief Software Architect
> Appear
> Phone: +46 8 545 91 572
> Email: mikael.staldal_at_appearnetworks.com