users@jersey.java.net

[Jersey] Re: Async example misleading?

From: Robert DiFalco <robert.difalco_at_gmail.com>
Date: Fri, 5 Sep 2014 09:18:44 -0700

Kishore,

I use this pattern for a client app, not web pages. Note that I can have
literally millions of clients. You may not have this requirement.

R.


On Thu, Sep 4, 2014 at 2:48 PM, Kishore Senji <ksenji_at_gmail.com> wrote:

> Thank you for your opinions.
>
> The pattern that Robert mentioned is good, but it seems better suited to
> end-user web pages like flight search etc. For REST services, isn't the
> service supposed to be a stateless operation that gives an output for a
> given input? That is, there is a contract. The solution seems to break that
> principle, because it introduces an intermediate step for the client to
> interact with.
>
> Because the client has to wait for the expensive operation to complete
> anyway (whether async or polling), it shouldn't matter if the
> connection/socket is held. Also, we typically have persistent connections
> between servers (a front-end app server talking to a REST service pool will
> have persistent connections).
>
> I was referring to a use case where a REST service makes multiple service
> calls (to async-IO-enabled back-end systems), aggregates/applies some logic
> on top of those calls, and then gives a response back to its clients. In
> this case having async capability definitely helps, as the interface for the
> client does not change (the client makes a REST call and gets a typed
> response back), so the programming model does not change. For the server,
> using async and interacting with other systems over non-blocking IO will
> definitely improve throughput. (So we can have fewer server instances
> handling the same volume of requests than we would need without async.)
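>
> (For concreteness, a rough sketch of that aggregation case using the JAX-RS
> 2.0 async client; the resource name and back-end URLs are made up, and how
> non-blocking the back-end calls really are depends on the client connector
> being used:)
>
> import java.util.concurrent.CompletableFuture;
> import javax.ws.rs.GET;
> import javax.ws.rs.Path;
> import javax.ws.rs.client.Client;
> import javax.ws.rs.client.ClientBuilder;
> import javax.ws.rs.client.InvocationCallback;
> import javax.ws.rs.container.AsyncResponse;
> import javax.ws.rs.container.Suspended;
>
> @Path("aggregate")
> public class AggregateResource {
>
>     private final Client client = ClientBuilder.newClient();
>
>     @GET
>     public void aggregate(@Suspended final AsyncResponse response) {
>         // Fire both back-end calls; the request thread is released as soon
>         // as this method returns.
>         CompletableFuture<String> a = call("http://backend-a.example.com/data");
>         CompletableFuture<String> b = call("http://backend-b.example.com/data");
>
>         // Apply the aggregation logic once both complete, then resume the
>         // suspended response; no thread sits waiting on the results.
>         a.thenCombine(b, (x, y) -> x + "|" + y)
>          .whenComplete((result, error) -> {
>              if (error != null) {
>                  response.resume(error);   // mapped to an error response
>              } else {
>                  response.resume(result);  // becomes the entity of a 200
>              }
>          });
>     }
>
>     private CompletableFuture<String> call(String uri) {
>         final CompletableFuture<String> future = new CompletableFuture<>();
>         client.target(uri).request().async().get(new InvocationCallback<String>() {
>             public void completed(String value) { future.complete(value); }
>             public void failed(Throwable t) { future.completeExceptionally(t); }
>         });
>         return future;
>     }
> }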
>
> It would be good if the documentation could be corrected; otherwise users
> may assume that simply moving the processing to another thread will improve
> overall throughput.
>
> Thanks,
> Kishore.
>
>
>
> On Thu, Sep 4, 2014 at 12:58 PM, cowwoc <cowwoc_at_bbs.darktech.org> wrote:
>
>> +1
>>
>> Sounds like a good way to go.
>>
>> Gili
>>
>>
>> On 04/09/2014 3:52 PM, Robert DiFalco wrote:
>>
>> I tend not to use the async stuff at all for expensive operations. I want
>> to spin up as many REST server processes as I need to for scalability. So
>> the pattern I use is to return a SEE OTHER (303) redirect to another URL
>> that can be polled until a result shows up. The steps are basically like this:
>>
>> 1. REST request comes in for a long running operation.
>> 2. Create a UUID for the operation's result.
>> 3. Give the client a URL + '/' + UUID to poll for the result.
>>    4. Server dispatches a job for the operation, letting the client
>>    return immediately.
>> 5. When the job is done the result is written to REDIS with the key
>> to the result being the UUID we gave to the client.
>> 6. The polling routine simply polls until either a TIMEOUT is reached
>> or REDIS has a value at the UUID.
>>
>> In this way there is no shared state in memory and you can spin up as
>> many REST servers as you need to. The async stuff in Jersey is not very
>> valuable to me because I don't want to tie up a connection nor require the
>> result to be at the same server as the request.
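>>
>> (A rough sketch of the shape this takes for me, assuming Jedis as the Redis
>> client; the class names, the "redis-host" address, and the /reports path are
>> just placeholders:)
>>
>> import java.net.URI;
>> import java.util.UUID;
>> import java.util.concurrent.ExecutorService;
>> import java.util.concurrent.Executors;
>> import javax.ws.rs.GET;
>> import javax.ws.rs.POST;
>> import javax.ws.rs.Path;
>> import javax.ws.rs.PathParam;
>> import javax.ws.rs.core.Response;
>> import redis.clients.jedis.Jedis;
>>
>> @Path("reports")
>> public class ReportResource {
>>
>>     private static final ExecutorService jobs = Executors.newFixedThreadPool(8);
>>
>>     // Steps 1-4: accept the request, dispatch the job, and hand back a
>>     // poll URL keyed by a UUID so the client can return immediately.
>>     @POST
>>     public Response submit() {
>>         final String id = UUID.randomUUID().toString();
>>         jobs.submit(() -> {
>>             String result = veryExpensiveOperation();
>>             // Step 5: the result lives in Redis, not in this process's
>>             // memory, so any REST server can serve it later.
>>             try (Jedis redis = new Jedis("redis-host")) {
>>                 redis.setex("result:" + id, 3600, result);
>>             }
>>         });
>>         return Response.seeOther(URI.create("/reports/" + id)).build();
>>     }
>>
>>     // Step 6: the client polls this URL until the value shows up or its
>>     // own timeout is reached.
>>     @GET
>>     @Path("{id}")
>>     public Response poll(@PathParam("id") String id) {
>>         try (Jedis redis = new Jedis("redis-host")) {
>>             String result = redis.get("result:" + id);
>>             return result == null
>>                     ? Response.status(Response.Status.ACCEPTED).build()
>>                     : Response.ok(result).build();
>>         }
>>     }
>>
>>     private String veryExpensiveOperation() {
>>         return "done"; // stand-in for the real long-running work
>>     }
>> }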
>>
>> Just FWIW. Your requirements may be different. The above is just a simple
>> pattern I follow for all long-lived operations (roughly >1s).
>>
>>
>>
>>
>>
>> On Thu, Sep 4, 2014 at 12:27 PM, Kishore Senji <ksenji_at_gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> The Async example is given at
>>> https://jersey.java.net/documentation/latest/async.html
>>>
>>> "However, in cases where a resource method execution is known to take
>>> a long time to compute the result, server-side asynchronous processing
>>> model should be used. In this model, the association between a request
>>> processing thread and client connection is broken. I/O container that
>>> handles incoming request may no longer assume that a client connection can
>>> be safely closed when a request processing thread returns. Instead a
>>> facility for explicitly suspending, resuming and closing client connections
>>> needs to be exposed. Note that the use of server-side asynchronous
>>> processing model will not improve the request processing time perceived by
>>> the client. *It will however increase the throughput of the server, by
>>> releasing the initial request processing thread back to the I/O container
>>> while the request may still be waiting in a queue for processing or the
>>> processing may still be running on another dedicated thread*. The
>>> released I/O container thread can be used to accept and process new
>>> incoming request connections."
>>>
>>> If veryExpensiveOperation() is expensive and takes a long time, how does
>>> running it in a different thread and releasing the request processing
>>> thread back to the I/O container improve the throughput?
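>>>
>>> (For reference, the example in question is roughly of this shape, paraphrased
>>> rather than copied verbatim from the docs; the expensive work is simply
>>> handed to a new thread:)
>>>
>>> import javax.ws.rs.GET;
>>> import javax.ws.rs.Path;
>>> import javax.ws.rs.container.AsyncResponse;
>>> import javax.ws.rs.container.Suspended;
>>>
>>> @Path("resource")
>>> public class AsyncResource {
>>>
>>>     @GET
>>>     public void asyncGet(@Suspended final AsyncResponse asyncResponse) {
>>>         new Thread(new Runnable() {
>>>             @Override
>>>             public void run() {
>>>                 // The container thread has already been released; this
>>>                 // new thread is now tied up for the whole duration.
>>>                 String result = veryExpensiveOperation();
>>>                 asyncResponse.resume(result);
>>>             }
>>>         }).start();
>>>     }
>>>
>>>     private String veryExpensiveOperation() {
>>>         return "result"; // stand-in for the long-running work
>>>     }
>>> }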
>>>
>>> If that were enough, we could just as well increase the number of request
>>> processing threads in the I/O container by the number of worker threads
>>> the example would use, and not worry about async at all.
>>>
>>> We can accept more and more connections and have them queue up (or end up
>>> creating many worker threads), but that would not necessarily increase
>>> throughput. It would increase throughput if veryExpensiveOperation() does
>>> I/O over a socket and we use async IO for that operation; then we can use
>>> minimal request threads and a very small worker thread pool to handle the
>>> IO asynchronously (or combine logic across multiple service calls doing
>>> non-blocking IO, similar to Akka futures). That improves throughput because
>>> more work actually gets done. But without non-blocking IO, if
>>> veryExpensiveOperation() is either CPU bound or uses blocking IO, then the
>>> worker thread is in fact blocked for that time and we end up with a huge
>>> thread pool or a big queue of waiting tasks. A huge thread pool does not
>>> scale, and a big queue also reduces throughput.
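>>>
>>> (In other words, offloading a blocking call to a worker pool only moves the
>>> bottleneck. In a sketch like the one below, with a made-up fixed-size pool,
>>> one worker thread is still parked per in-flight request for the full
>>> duration of the call:)
>>>
>>> import java.util.concurrent.ExecutorService;
>>> import java.util.concurrent.Executors;
>>> import javax.ws.rs.GET;
>>> import javax.ws.rs.Path;
>>> import javax.ws.rs.container.AsyncResponse;
>>> import javax.ws.rs.container.Suspended;
>>>
>>> @Path("blocking")
>>> public class BlockingOffloadResource {
>>>
>>>     // A bounded pool instead of new Thread(), but concurrency is now
>>>     // limited by this pool size rather than the container pool size.
>>>     private static final ExecutorService workers = Executors.newFixedThreadPool(50);
>>>
>>>     @GET
>>>     public void asyncGet(@Suspended final AsyncResponse asyncResponse) {
>>>         // The container thread is released, but one worker thread blocks
>>>         // here until veryExpensiveOperation() finishes.
>>>         workers.submit(() -> asyncResponse.resume(veryExpensiveOperation()));
>>>     }
>>>
>>>     private String veryExpensiveOperation() {
>>>         return "result"; // blocking IO or CPU-bound work in reality
>>>     }
>>> }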
>>>
>>> Nevertheless, we do need to take the processing to a different thread so
>>> that the container thread can be returned quickly. But is my understanding
>>> correct that whether the throughput actually improves depends on what
>>> veryExpensiveOperation() does (blocking or non-blocking IO, or purely CPU
>>> bound computation, etc.)?
>>>
>>> Thanks,
>>> Kishore.
>>>
>>
>>
>>
>