users@jersey.java.net

[Jersey] Re: What is the proper way of cleaning up asynchronous requests before servlet redeploy (Jersey, Tomcat)?

From: Bela Bujk <tersyxus_at_gmail.com>
Date: Fri, 1 Jul 2016 16:26:18 +0200

Hi,

thanks for the input. Tomcat would have been a good guess, but in the
meantime I've managed to reproduce the connection leak issue *on Jetty
9.3.9 as well*. So the async cleanup problem doesn't seem to be related to
the servlet container but to the JAX-RS container.

The simple setup I used was the following. I tried it on both Jetty and
Tomcat, and both leak connections at redeploy.

*A simple JAX-RS resource responding to async requests after 30 seconds.*

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.container.AsyncResponse;
import javax.ws.rs.container.Suspended;
import javax.ws.rs.core.MediaType;

import org.glassfish.jersey.server.ManagedAsync;

@Path("/asyncresource")
public class AsyncLeakResource {

    private static ScheduledExecutorService scheduledExecutorService =
            Executors.newScheduledThreadPool(10);

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    @ManagedAsync
    public void asyncGet(@Suspended final AsyncResponse asyncResponse) {
        final String result = "This is an async response";
        scheduledExecutorService.schedule(new Runnable() {

            @Override
            public void run() {
                asyncResponse.resume(result);
            }
        }, 30, TimeUnit.SECONDS);
    }
}


*A simple long-polling client that starts 5 instances. Each instance keeps
polling the API even after it detects a client-side read timeout (40 seconds).*

import java.net.SocketTimeoutException;
import java.util.concurrent.ExecutionException;

import javax.ws.rs.client.AsyncInvoker;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.Invocation.Builder;
import javax.ws.rs.client.InvocationCallback;
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

import org.glassfish.jersey.client.ClientConfig;
import org.glassfish.jersey.client.ClientProperties;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AsyncConnectionLeakClient {

    private static final Logger logger =
            LoggerFactory.getLogger(AsyncConnectionLeakClient.class);

    private static final String webappUrl = "http://localhost:8080/asyncleaktest/";
    private static final String resourcePath = "asyncresource/";

    private Client client;
    private WebTarget target;

    @Before
    public void setup() {
        ClientConfig clientConfig = new ClientConfig();
        clientConfig.property(ClientProperties.READ_TIMEOUT, 40 * 1000);
        client = ClientBuilder.newClient(clientConfig);
        target = client.target(webappUrl).path(resourcePath);
    }

    @Test
    public void test() throws InterruptedException, ExecutionException {
        final Builder requestBuilder = target.request(MediaType.APPLICATION_JSON);
        final AsyncInvoker asyncInvoker = requestBuilder.async();
        for (int i = 1; i <= 5; i++) {
            LongPollingClient longPollingClient = new LongPollingClient(asyncInvoker, i);
            longPollingClient.start();
        }
        Thread.sleep(1000 * 60 * 10);
    }

    private class LongPollingClient {

        private final AsyncInvoker asyncInvoker;
        private final int index;

        public LongPollingClient(AsyncInvoker asyncInvoker, int index) {
            this.asyncInvoker = asyncInvoker;
            this.index = index;
        }

        public void start() {
            sendRequest();
        }

        private void sendRequest() {
            logger.info("#{} Sending request", index);
            asyncInvoker.get(new InvocationCallback<Response>() {

                @Override
                public void failed(Throwable throwable) {
                    if (throwable.getCause() instanceof SocketTimeoutException) {
                        logger.warn("#{} Timed out. Requesting again", index);
                        sendRequest();
                    } else {
                        logger.error("#{} Error: {}", index, throwable);
                    }
                }

                @Override
                public void completed(Response response) {
                    logger.info("#{} Response received: {}", index, response);
                    if (response.getStatus() == Response.Status.OK.getStatusCode()) {
                        sendRequest();
                    }
                }
            });
        }
    }
}


When the client starts, the number of active HTTP connections reaches a
certain level (container-dependent). JMX metrics to watch (a small reader
sketch follows the list):

   - Tomcat: ThreadPool, HTTP-bio connectionCount
   - Jetty: ConnectorStatistics (must be enabled explicitly), connectionsOpen
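
For reference, this is roughly how I read the Tomcat counter from a separate
JVM while the test client runs. It is only a sketch: it assumes remote JMX is
enabled on port 9010 and the default HTTP-BIO connector on port 8080, so the
JMX URL and the ObjectName will almost certainly need adjusting (check the
actual names in JConsole first).

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ConnectionCountReader {

    public static void main(String[] args) throws Exception {
        // Assumes the container JVM was started with remote JMX on port 9010.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            // Tomcat 7, default BIO connector on port 8080; adjust for your connector name.
            ObjectName threadPool =
                    new ObjectName("Catalina:type=ThreadPool,name=\"http-bio-8080\"");
            Object connectionCount = connection.getAttribute(threadPool, "connectionCount");
            System.out.println("connectionCount = " + connectionCount);
        } finally {
            connector.close();
        }
    }
}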

After each redeploy (i.e. touching the WAR file manually) the number of
connections increases by 5.

Another hint: the connection metrics in JMX correlate with the numbers
shown by netstat for the webapp port. The number of connections in CLOSE_WAIT
state increases by 5 at every redeploy.
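
The netstat check itself is just counting CLOSE_WAIT lines for the webapp
port. I used a throwaway helper along these lines (assuming a Unix-like
netstat on the PATH and the webapp on port 8080; both are assumptions about
my own environment, adjust as needed):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class CloseWaitCounter {

    public static void main(String[] args) throws Exception {
        // Runs "netstat -ant" and counts CLOSE_WAIT lines mentioning port 8080.
        Process process = new ProcessBuilder("netstat", "-ant").start();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()));
        int closeWait = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains(":8080") && line.contains("CLOSE_WAIT")) {
                closeWait++;
            }
        }
        reader.close();
        System.out.println("CLOSE_WAIT connections on port 8080: " + closeWait);
    }
}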

So should I submit a bug report on the Jersey JIRA?

Cheers,
Bela


On 29 June 2016 at 15:22, Marek Potociar <marek.potociar_at_oracle.com> wrote:

> Hi Bela,
>
> It seems to me that Tomcat should be responsible for releasing any
> lingering connections after an application is undeployed, but please, feel
> free to file a bug against Jersey; we can try to have a look and see if
> there is something more that Jersey can do in this case.
>
> Cheers,
> Marek
>
> On 27 Jun 2016, at 13:45, Bela Bujk <tersyxus_at_gmail.com> wrote:
>
> Hi,
>
> I've submitted a Jersey-related question
> <http://stackoverflow.com/questions/37934558/what-is-the-proper-way-of-cleaning-up-asynchronous-requests-before-servlet-redep>
> on Stack Overflow. I'm posting it here hoping it will get answered by some
> of the experts on this mailing list. :)
>
> Will appreciate any hint on this.
>
> ---
>
> I have an asynchronous *JAX-RS* API for long-polling clients put together
> in *Jersey Container Servlet 2.22* and hosted on *Tomcat 7*.
>
> It looks similar to the snippet shown below. It works well in production.
>
> On average, 150 long-polling requests are being executed at the same time.
> This results in almost the *same number of live Tomcat HTTP connections*
> (according to JMX metrics). For this low-traffic scenario the plain old
> *HTTP-BIO* connector has been used without problems. No runtime connection
> leak can be detected, provided you use only managed threads :)
>
> @POST
> @Path("/liveEvents")
> @ManagedAsync
> public void getResult(@Suspended final AsyncResponse asyncResponse, RequestPayload payload) {
>     asyncResponse.setTimeout(longPollTimeoutMs, TimeUnit.MILLISECONDS);
>     asyncResponse.setTimeoutHandler(new TimeoutHandler() {
>         @Override
>         public void handleTimeout(AsyncResponse asyncResponseArg) {
>             try {
>                 asyncResponseArg.cancel();
>             } finally {
>                 cleanupResources();
>             }
>         }
>     });
>     startListeningForExternalEventsAndReturn(payload);
> }
>
> private void startListeningForExternalEventsAndReturn(RequestPayload payload) {
>     externalResource.register(new Listener() {
>         @Override
>         public void onEvent(Event event) {
>             respond(event);
>         }
>     });
> }
>
> private void respond(Event event) {
>     try {
>         asyncResponse.resume(event);
>     } catch (RuntimeException exception) {
>         asyncResponse.resume(exception);
>     } finally {
>         cleanupResources();
>     }
> }
>
> The problem I'm facing is that after a successful *Tomcat* redeploy the
> number of live connections apparently increases to about 300, then to 450,
> and after some further redeploys it hits the maxConnections limit
> configured for the container.
>
> The clients of the API handle the redeploy by waiting for a client-side
> timeout (which is of course bigger than the one set on the servlet side)
> and then start polling the API again. But each of them is guaranteed to
> send only one request at a time.
>
> The shape of the connection count monitoring graph gives a hint. The
> connection count remains constant after undeployment (the connections are
> not released even by the TimeoutHandler) and starts to increase (new
> connections are allocated) as clients start long-polling again. *In fact,
> ongoing (suspended) async requests started in the previous context are
> never released until JVM termination!*
>
> After some digging around, it's not difficult to find out by analyzing heap
> dumps made after a few redeployments that unreleased, suspended
> AsyncResponse (AsyncResponder) instances remain in memory from previous
> web application contexts (easily filterable by JQL queries grouped by
> Classloader instance). It's also very suspicious that the same number of
> outdated org.apache.coyote.Request instances are present in memory from
> previous contexts.
>
> I started to look around the undeployment-related source code of the *Jersey
> Container*, hoping that some graceful shutdown process is implemented for
> async requests, with cleanup actions executed at @PreDestroy time or in the
> close() or dispose() methods of providers.
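>
> Just to illustrate what I mean by a cleanup hook, this is the kind of thing
> I was hoping for (or was prepared to write myself). It is only a sketch on
> top of plain Servlet and JAX-RS APIs; the AsyncResponseRegistry class and
> the idea of registering every suspended AsyncResponse in it from the
> resource method are my own invention, not something Jersey provides.
>
> import java.util.Collections;
> import java.util.Set;
> import java.util.concurrent.ConcurrentHashMap;
>
> import javax.servlet.ServletContextEvent;
> import javax.servlet.ServletContextListener;
> import javax.servlet.annotation.WebListener;
> import javax.ws.rs.container.AsyncResponse;
>
> // Hypothetical registry: resource methods call register() right after being
> // suspended and unregister() once they resume the response themselves.
> @WebListener
> public class AsyncResponseRegistry implements ServletContextListener {
>
>     private static final Set<AsyncResponse> SUSPENDED =
>             Collections.newSetFromMap(new ConcurrentHashMap<AsyncResponse, Boolean>());
>
>     public static void register(AsyncResponse asyncResponse) {
>         SUSPENDED.add(asyncResponse);
>     }
>
>     public static void unregister(AsyncResponse asyncResponse) {
>         SUSPENDED.remove(asyncResponse);
>     }
>
>     @Override
>     public void contextInitialized(ServletContextEvent sce) {
>         // nothing to do on startup
>     }
>
>     @Override
>     public void contextDestroyed(ServletContextEvent sce) {
>         // Cancel everything that is still suspended so the container gets a
>         // chance to close the corresponding connections before the old
>         // context (and its classloader) goes away.
>         for (AsyncResponse asyncResponse : SUSPENDED) {
>             if (asyncResponse.isSuspended()) {
>                 asyncResponse.cancel();
>             }
>         }
>         SUSPENDED.clear();
>     }
> }
>
> Whether cancel() at contextDestroyed time is still early enough for the
> container to actually close the underlying connections is exactly the part
> I'm unsure about.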
>
> I had an optimistic guess that running each scheduled TimeoutHandler right
> before undeployment would solve the problem. But replacing the default
> @BackgroundScheduler provider (DefaultBackgroundSchedulerProvider) with a
> custom implementation, collecting all queued TimeoutHandlers of the
> executor, and eventually invoking AsyncResponse.resume() or
> AsyncResponse.cancel() on them did not help. This stage might be too late
> for the cleanup because the request scope is already shut down.
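>
> In case it helps to see it, the custom provider looked roughly like this,
> reconstructed from memory and simplified (here I just run the queued tasks
> directly instead of extracting the TimeoutHandlers and calling
> resume()/cancel() as described above); the SPI I implemented is
> org.glassfish.jersey.spi.ScheduledExecutorServiceProvider, registered in my
> ResourceConfig, so treat it as a sketch rather than the exact code:
>
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.ScheduledThreadPoolExecutor;
>
> import org.glassfish.jersey.server.BackgroundScheduler;
> import org.glassfish.jersey.spi.ScheduledExecutorServiceProvider;
>
> @BackgroundScheduler
> public class DrainingBackgroundSchedulerProvider
>         implements ScheduledExecutorServiceProvider {
>
>     private final ScheduledThreadPoolExecutor executor =
>             new ScheduledThreadPoolExecutor(1);
>
>     @Override
>     public ScheduledExecutorService getExecutorService() {
>         return executor;
>     }
>
>     @Override
>     public void dispose(ExecutorService executorService) {
>         // My attempt: force the queued timeout tasks (and so the registered
>         // TimeoutHandlers) to run before the executor is discarded. By the
>         // time this runs, the request scope is apparently already gone.
>         for (Runnable queued : executor.getQueue()) {
>             queued.run();
>         }
>         executor.shutdownNow();
>     }
> }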
>
> Any ideas on what my async setup is missing or how *Jersey* can be
> configured to release the *Servlet Container*'s connections that are
> still suspended at redeploy-time?
>
>
>