[jsr339-experts] Re: [jax-rs-spec users] Async API and Threads again

From: Marek Potociar <marek.potociar_at_oracle.com>
Date: Mon, 15 Oct 2012 04:06:54 +0200

On Oct 13, 2012, at 11:03 PM, Jan Algermissen <jan.algermissen_at_nordsc.com> wrote:

> Hi,
>
> bear with me - I am still trying to wrap my head around async API issues.
>
> Having thought about it for a while, I now think that using the Async API only makes sense if the async response is parked for a while. The important aspect is that you eventually have more responses than threads dealing with them.
>
> In other words, it IMHO makes no sense to have an async response be handled by another thread (e.g. via @Asynchronous) right away because that would also consume one thread per response.
>
> I fail to see the benefit of using more 'backend' threads in order to have fewer HTTP handling threads. One could equally well just increase the HTTP pool max size.
>
> Or, in yet other words: if one does not park more than one response in a single thread, the number of threads in use will simply grow as O(n) with the number of requests (preventing which is the very reason for the Async API, AFAIU).
>
> Or is there anything in any of the containers around that makes an Http handling thread any different from another thread?

Yes, there is a quite significant difference.

Whenever you use thread pools (I'm talking about bounded thread pools, as unbounded ones are very rarely used in production for stability reasons), you want to make sure that all of your tasks take approximately the same time to finish. Suppose a few long-running tasks take a lot longer to finish than the rest, and you use the same shared thread pool for them as well. As client requests arrive more frequently, you'll soon see that even if only a small portion of the requests hit the long-running tasks, most of the threads become occupied by those tasks, and the throughput of the system plummets as if all of your tasks were long-running.
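
To make the starvation effect concrete, here is a minimal sketch using only java.util.concurrent, with no JAX-RS container involved. The class name, the method name and the pool sizes are hypothetical and chosen for illustration; the "long-running" tasks are simulated by blocking on a latch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SharedPoolStarvationDemo {

    /**
     * Submits longTasks blocking tasks and then one quick task to the SAME
     * bounded pool. Returns true if the quick task finished within waitMillis.
     */
    static boolean quickTaskFinishes(int poolSize, int longTasks, long waitMillis) {
        ExecutorService shared = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Simulated long-running tasks: each holds a pool thread until released.
            for (int i = 0; i < longTasks; i++) {
                shared.submit(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The quick task is queued behind the long ones.
            Future<String> quick = shared.submit(() -> "done");
            try {
                quick.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // no free thread picked it up in time
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        } finally {
            release.countDown();  // let the blocked tasks finish
            shared.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool fully occupied: " + quickTaskFinishes(2, 2, 200));
        System.out.println("one thread free:     " + quickTaskFinishes(2, 1, 2000));
    }
}
```

With two threads and two blocked tasks the quick task never gets a thread, even though it would take almost no time by itself; with one thread left free it completes immediately.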

Typically, when tuning for best performance, it is a good strategy to group tasks by the time they take to finish, by how CPU-intensive they are and by how much I/O they perform, and then to assign a separate thread pool to each group so that you can:

1. make sure that performance of tasks in group A does not impact performance of group B
2. fine-tune the thread pools based on the specific aspects (amount of I/O or other blocking, CPU intensity, ...) of the tasks running in the thread pool.
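
The two points above can be sketched as follows, again with plain java.util.concurrent. This is only an illustration under stated assumptions: the class name, the method name and the pool sizes are hypothetical, and the quick pool is sized to the core count merely as a common starting point. In a JAX-RS resource, the slow task would typically end by resuming the suspended async response.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PoolPerGroupDemo {

    /**
     * Saturates a pool reserved for slow tasks, then runs one quick task on
     * its own pool. Returns true if the quick task finished within waitMillis.
     */
    static boolean quickTaskFinishes(int slowPoolSize, int slowTasks, long waitMillis) {
        // Group A: short tasks. Sizing by core count is an assumption, tune per workload.
        ExecutorService quickPool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        // Group B: long, blocking tasks, isolated in their own bounded pool.
        ExecutorService slowPool = Executors.newFixedThreadPool(slowPoolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Overload group B: more blocking tasks than it has threads.
            for (int i = 0; i < slowTasks; i++) {
                slowPool.submit(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Group A is unaffected by the backlog in group B.
            Future<String> quick = quickPool.submit(() -> "done");
            try {
                quick.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        } finally {
            release.countDown();
            quickPool.shutdownNow();
            slowPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("quick task finished: " + quickTaskFinishes(2, 10, 2000));
    }
}
```

Even with ten blocked tasks queued on a two-thread slow pool, the quick task still completes, because it never competes with them for a thread.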

If you want to learn more, I can recommend the excellent book "Java Concurrency in Practice". See Chapter 6 for the discussion related to this particular case.

>
> Bottom line of all this being: I think the mandatory advice on using the Async API would be: store more than one response in a single thread until response processing can be started. We should see a collection-type variable holding async responses. Otherwise the effect of the Async API would be zero at best (given its own overhead).

The advice doesn't seem to be backed by any solid reasoning. What you should really worry about is whether the tasks run by the same thread pool take about the same amount of time. Thread context switching is typically the last thing you should be worried about these days. Also, I'm not convinced that "batching responses" will have any positive impact on the client-perceived throughput or the average request-response round-trip time. I even suspect that the outcome might be exactly the opposite of what you wanted.

>
> Agreed?

Disagreed :)

Marek
>
>
> Jan