[Jersey] Re: Async example misleading?

From: Robert DiFalco <robert.difalco_at_gmail.com>
Date: Thu, 4 Sep 2014 12:52:06 -0700

I tend not to use the async stuff at all for expensive operations. I want
to spin up as many REST server processes as I need to for scalability. So
the pattern I use is to return a 303 See Other redirect to another URL that
can be polled until a result shows up. The steps are basically like this:

   1. REST request comes in for a long-running operation.
   2. Create a UUID for the operation's result.
   3. Give the client a URL + '/' + UUID to poll for the result.
   4. Server dispatches a job for the operation, so the request returns to
   the client immediately.
   5. When the job is done the result is written to Redis, with the key to
   the result being the UUID we gave to the client.
   6. The polling routine simply polls until either a timeout is reached or
   Redis has a value at the UUID.

In this way there is no shared state in memory and you can spin up as many
REST servers as you need to. The async stuff in Jersey is not very valuable
to me because I don't want to tie up a connection nor require the result to
be at the same server as the request.
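
As a rough illustration of the steps above, here is a minimal Java sketch.
It is not Jersey code and not my production code: the class and method
names are made up, an in-memory map stands in for Redis, and a small
thread pool stands in for the job dispatcher. In a real deployment the
store would be external so that any REST server can answer the poll.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Sketch of the submit-then-poll pattern; every name here is hypothetical. */
class PollingSketch {
    // ASSUMPTION: stand-in for Redis. A real setup would use an external
    // store shared by all REST servers, not a map inside one process.
    static final Map<String, String> resultStore = new ConcurrentHashMap<>();

    // ASSUMPTION: stand-in for the job dispatcher. Daemon threads so the
    // JVM can exit without an explicit shutdown.
    static final ExecutorService jobs = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    /** Steps 1-4: create a UUID, dispatch the job, return the poll URL at once. */
    static String submit(String baseUrl, Supplier<String> operation) {
        String id = UUID.randomUUID().toString();
        // Step 5 runs inside the job: store the result under the UUID.
        jobs.execute(() -> resultStore.put(id, operation.get()));
        return baseUrl + "/" + id;
    }

    /** One poll: what a GET on URL/UUID would look up. Empty means "not ready". */
    static Optional<String> poll(String id) {
        return Optional.ofNullable(resultStore.get(id));
    }

    /** Step 6: poll until a value shows up or the timeout is reached. */
    static Optional<String> pollUntil(String id, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Optional<String> result = poll(id);
            if (result.isPresent() || System.nanoTime() >= deadline) {
                return result;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return Optional.empty();
            }
        }
    }
}
```

The request handler would call submit() and send the returned URL in the
redirect; the handler behind that URL would call poll() (or pollUntil()
with a short timeout) and answer with the result or a "not ready yet"
status. Nothing in either handler depends on which server took the
original request.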

Just FWIW. Your requirements may be different. The above is just a simple
pattern I follow for all long-lived operations (longer than about 1s).

On Thu, Sep 4, 2014 at 12:27 PM, Kishore Senji <ksenji_at_gmail.com> wrote:

> Hi All,
>
> The Async example is given at
> https://jersey.java.net/documentation/latest/async.html
>
> "However, in cases where a resource method execution is known to take a
> long time to compute the result, server-side asynchronous processing model
> should be used. In this model, the association between a request processing
> thread and client connection is broken. I/O container that handles incoming
> request may no longer assume that a client connection can be safely closed
> when a request processing thread returns. Instead a facility for explicitly
> suspending, resuming and closing client connections needs to be exposed.
> Note that the use of server-side asynchronous processing model will not
> improve the request processing time perceived by the client. *It will
> however increase the throughput of the server, by releasing the initial
> request processing thread back to the I/O container while the request may
> still be waiting in a queue for processing or the processing may still be
> running on another dedicated thread*. The released I/O container thread
> can be used to accept and process new incoming request connections."
>
> If veryExpensiveOperation() is expensive and takes a long time, how
> would running it in a different thread and releasing the request
> processing thread back to the I/O container improve the throughput?
>
> If that is the case we can as well increase the number of request
> processing threads of the I/O container by the number of worker threads
> that we would use in the case of the example and not worry about Async at
> all.
>
> We can take more and more connections and have them queue up (or would end
> up with creating many worker threads), but it would not necessarily
> increase throughput. It would increase throughput if the
> veryExpensiveOperation() is doing I/O over a Socket and if we use Async IO
> for that operation, then we can use minimal request threads and very small
> worker thread pool to do Async handling of the IO (or combine logic across
> multiple Service calls doing non-blocking IO, similar to Akka futures).
> This will improve the throughput as more work is done. But without
> non-blocking IO, if veryExpensiveOperation() is either CPU bound or
> using blocking IO, then the worker thread would in fact be blocked for
> that time and we would end up with a huge thread pool or a big queue of
> tasks waiting. A huge thread pool would not scale, and a big queue would
> also reduce the throughput.
>
> Nevertheless we definitely need to hand the processing to a different
> thread so that the container thread can be returned quickly. But is my
> understanding correct that whether the throughput actually improves
> depends on what veryExpensiveOperation() does (blocking or non-blocking
> IO, or totally CPU bound computation, etc.)?
>
> Thanks,
> Kishore.
>