[Jersey] Re: Async example misleading?

From: cowwoc <cowwoc_at_bbs.darktech.org>
Date: Thu, 04 Sep 2014 19:17:02 -0400

Yes and no.

Consider the fact that multiple requests could resolve to the same
(ongoing) operation. After all, you don't want to rerun computationally
expensive operations (e.g. tripadvisor.com must get many duplicate
searches which they can reroute to the same URI).
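
A minimal sketch of that idea, assuming an in-memory registry on a single server. OperationRegistry, idFor and the key normalization are hypothetical names for illustration, not part of Jersey:

```java
import java.util.Locale;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry: maps a normalized query to the ID of the operation already handling it.
final class OperationRegistry {
    private final ConcurrentMap<String, UUID> ongoing = new ConcurrentHashMap<>();

    // Assumed normalization rule: ignore surrounding whitespace and letter case.
    private static String normalize(String query) {
        return query.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the ID of the ongoing operation for this query, allocating a new ID only
    // if none exists. A real service would also dispatch the work when it allocates one.
    UUID idFor(String query) {
        return ongoing.computeIfAbsent(normalize(query), key -> UUID.randomUUID());
    }

    // Called when the operation finishes, so that a later request triggers a fresh run.
    void complete(String query) {
        ongoing.remove(normalize(query));
    }
}
```

Every duplicate request gets the same ID back, so all of those clients can be sent to the same URI.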

Another point is that the returned URI could return a progress report
every time you poll it. I once implemented a REST protocol for a system
that scanned the network for a specific kind of device (an operation
that took from 30 seconds to 5 minutes depending on your network). If
only a single client made the request, I could display a progress bar.
When multiple clients made the request, they could piggyback on someone
else's request. It was great :)
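
Roughly what the shared progress state could look like, assuming the scan can be split into a known number of steps. ScanProgress and its methods are made-up names, not the code from that system:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical shared state for one ongoing scan; every polling client reads the same instance.
final class ScanProgress {
    private final int totalSteps;
    private final AtomicInteger completedSteps = new AtomicInteger();

    ScanProgress(int totalSteps) {
        if (totalSteps <= 0) {
            throw new IllegalArgumentException("totalSteps must be positive");
        }
        this.totalSteps = totalSteps;
    }

    // Called by the worker after each unit of work (for example, each address probed).
    void stepDone() {
        completedSteps.updateAndGet(done -> Math.min(done + 1, totalSteps));
    }

    // What a GET on the polled URI could report: an integer percentage from 0 to 100.
    int percentComplete() {
        return (int) (100L * completedSteps.get() / totalSteps);
    }

    boolean isDone() {
        return completedSteps.get() == totalSteps;
    }
}
```

Because the state belongs to the operation rather than to a request, a client that arrives late sees the same percentage as the one that started the scan.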

I agree the documentation should be improved. I suggest filing a bug
report; otherwise this will get lost in the mailing list (be sure to
share the bug report link with us).

Gili

On 04/09/2014 5:48 PM, Kishore Senji wrote:
> Thank you for your opinions.
>
> The pattern that Robert mentioned is good, but it may be better
> suited to end-user web pages such as flight search. For REST
> services, isn't the service a stateless operation that returns the
> output for a given input? That is the contract. The solution seems
> to break that principle, because it introduces an intermediate
> step for the client to interact with.
>
> Because the client has to wait anyway for the expensive operation to
> complete (with the async or the polling option), it shouldn't matter
> if the connection/socket is held. Also, we typically have persistent
> connections between servers (a front-end app server talking to a REST
> service pool will have persistent connections).
>
> I was referring to a use case where a REST service makes multiple
> service calls (or calls Async IO enabled back-end systems),
> aggregates or applies some logic on top of those calls, and then
> returns a response to its clients. In this case Async capability
> definitely helps: the interface for the client does not change (the
> client makes a REST call and gets a typed response back), so the
> programming model does not change. For the server, using Async and
> interacting with other systems through non-blocking IO will
> definitely improve throughput. (So we can have fewer server
> instances handling the volume of requests than we would need
> without Async.)
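
A small sketch of the aggregation case described above, using only java.util.concurrent. The Aggregator class and the two fetch methods are hypothetical, and supplyAsync merely stands in for a client that completes its future from a non-blocking I/O callback:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical aggregator; the two lookups stand in for non-blocking back-end calls.
final class Aggregator {
    // Placeholder back-end call. A real client would return a future that is
    // completed by an async I/O callback instead of a pool thread.
    static CompletableFuture<Integer> fetchPrice(String item) {
        return CompletableFuture.supplyAsync(() -> item.length() * 10);
    }

    // Second placeholder back-end call, same assumption as above.
    static CompletableFuture<Integer> fetchStock(String item) {
        return CompletableFuture.supplyAsync(() -> item.length());
    }

    // Starts both calls and combines their results once both have completed,
    // without blocking the calling thread while either call is in flight.
    static CompletableFuture<String> summary(String item) {
        return fetchPrice(item).thenCombine(fetchStock(item),
                (price, stock) -> item + ": price=" + price + ", stock=" + stock);
    }
}
```

The caller still sees a single typed result. Only the way the server waits for its back ends changes.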
>
> It would be good if the documentation could be corrected; otherwise
> users may assume that moving the processing to another thread will,
> by itself, improve overall throughput.
>
> Thanks,
> Kishore.
>
>
> On Thu, Sep 4, 2014 at 12:58 PM, cowwoc <cowwoc_at_bbs.darktech.org
> <mailto:cowwoc_at_bbs.darktech.org>> wrote:
>
> +1
>
> Sounds like a good way to go.
>
> Gili
>
>
> On 04/09/2014 3:52 PM, Robert DiFalco wrote:
>> I tend not to use the async stuff at all for expensive
>> operations. I want to spin up as many REST server processes as I
>> need for scalability. So the pattern I use is to return a SEE
>> OTHER (303) redirect to another URL that can be polled until a
>> result shows up. The steps are basically like this:
>>
>> 1. REST request comes in for a long running operation.
>> 2. Create a UUID for the operation's result.
>> 3. Give the client a URL + '/' + UUID to poll for the result.
>> 4. Server dispatches a job for the operation, letting the client
>>    return immediately.
>> 5. When the job is done, the result is written to REDIS, with the
>>    key to the result being the UUID we gave to the client.
>> 6. The polling routine simply polls until either a TIMEOUT is
>>    reached or REDIS has a value at the UUID.
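
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Robert's code: JobService and its methods are hypothetical, and a ConcurrentHashMap stands in for REDIS so the example runs without external services (which also means it does not demonstrate the cross-server part).

```java
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;

// Hypothetical sketch of the submit-then-poll pattern; the map is a stand-in for REDIS.
final class JobService {
    private final ConcurrentMap<UUID, String> resultStore = new ConcurrentHashMap<>();
    private final ExecutorService workers;

    JobService(ExecutorService workers) {
        this.workers = workers;
    }

    // Steps 2 to 5: allocate a UUID, dispatch the job, and return at once with the ID to poll.
    UUID submit(Supplier<String> operation) {
        UUID id = UUID.randomUUID();
        workers.execute(() -> resultStore.put(id, operation.get()));
        return id;
    }

    // One poll of URL + '/' + UUID: empty until the job has written its result.
    Optional<String> poll(UUID id) {
        return Optional.ofNullable(resultStore.get(id));
    }

    // Step 6: poll until a value appears or the timeout elapses.
    Optional<String> pollUntil(UUID id, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() - deadline < 0) {
            Optional<String> result = poll(id);
            if (result.isPresent()) {
                return result;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return poll(id);
    }
}
```

With a shared store in place of the map, any server instance could answer the poll, which is the point of the pattern.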
>>
>> In this way there is no shared state in memory and you can spin
>> up as many REST servers as you need to. The async stuff in Jersey
>> is not very valuable to me because I don't want to tie up a
>> connection nor require the result to be at the same server as the
>> request.
>>
>> Just FWIW. Your requirements may be different. The above is
>> just a simple pattern I follow for all long-lived operations
>> (roughly >1s).
>>
>> On Thu, Sep 4, 2014 at 12:27 PM, Kishore Senji <ksenji_at_gmail.com
>> <mailto:ksenji_at_gmail.com>> wrote:
>>
>> Hi All,
>>
>> The Async example is given at
>> https://jersey.java.net/documentation/latest/async.html
>>
>> "However, in cases where a resource method execution is known
>> to take a long time to compute the result, server-side
>> asynchronous processing model should be used. In this model,
>> the association between a request processing thread and
>> client connection is broken. I/O container that handles
>> incoming request may no longer assume that a client
>> connection can be safely closed when a request processing
>> thread returns. Instead a facility for explicitly suspending,
>> resuming and closing client connections needs to be exposed.
>> Note that the use of server-side asynchronous processing
>> model will not improve the request processing time perceived
>> by the client. *It will however increase the throughput of
>> the server, by releasing the initial request processing
>> thread back to the I/O container while the request may still
>> be waiting in a queue for processing or the processing may
>> still be running on another dedicated thread*. The released
>> I/O container thread can be used to accept and process new
>> incoming request connections."
>>
>> If veryExpensiveOperation() is expensive and takes a long
>> time, how would running it in a different thread and releasing
>> the request processing thread back to the I/O container
>> improve the throughput?
>>
>> If that is the case, we could just as well increase the number of
>> request processing threads of the I/O container by the number
>> of worker threads that the example would use, and not worry
>> about Async at all.
>>
>> We can take more and more connections and have them queue up
>> (or end up creating many worker threads), but that would not
>> necessarily increase throughput. It would increase throughput
>> if veryExpensiveOperation() is doing I/O over a socket and we
>> use Async IO for that operation; then we can use minimal
>> request threads and a very small worker thread pool to do
>> Async handling of the IO (or combine logic across multiple
>> service calls doing non-blocking IO, similar to Akka futures).
>> This will improve the throughput as more work is done. But
>> without non-blocking IO, if veryExpensiveOperation() is either
>> CPU bound or using blocking IO, then the worker thread would
>> in fact be blocked for that time, and we would end up with a
>> huge thread pool or a big queue of tasks waiting. A huge
>> thread pool would not scale, and a big queue would also
>> reduce the throughput.
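
A back-of-the-envelope check of the blocking case, assuming every request holds one worker thread for its whole duration. BlockingCapacity and its methods are hypothetical helpers; for CPU-bound work the number of cores, rather than threads, would be the effective limit.

```java
// Hypothetical helper for rough capacity estimates with blocking workers.
final class BlockingCapacity {
    // Upper bound on requests completed per second when each request holds one worker
    // thread for its whole duration. Moving the work off the container thread does not change it.
    static double maxRequestsPerSecond(int workerThreads, double secondsPerRequest) {
        return workerThreads / secondsPerRequest;
    }

    // Rate at which the task queue grows once arrivals exceed that bound (0 if they do not).
    static double backlogGrowthPerSecond(double arrivalsPerSecond, int workerThreads,
                                         double secondsPerRequest) {
        double capacity = maxRequestsPerSecond(workerThreads, secondsPerRequest);
        return Math.max(0.0, arrivalsPerSecond - capacity);
    }
}
```

For example, 50 blocked workers at 2 seconds per request complete at most 25 requests per second, so 40 arrivals per second grow the queue by 15 tasks every second.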
>>
>> Nevertheless, we definitely need to hand the processing to a
>> different thread so that the container thread can be returned
>> quickly. But is my understanding correct that whether this
>> actually improves the throughput depends on what
>> veryExpensiveOperation() does (blocking or non-blocking IO,
>> totally CPU-bound computation, etc.)?
>>
>> Thanks,
>> Kishore.
>>
>>
>
>