users@grizzly.java.net

Deadlock when handling onFailedToConnect to connection pool

From: Fabian Gonzalez <fabian.gonzalez_at_mulesoft.com>
Date: Fri, 10 Feb 2017 11:59:30 -0300

Hello,

This is my first mail to the list, so greetings to all.
I'm facing a problem concerning the handling of a connection failure in a
connection pool that results in a deadlock.
The problem arises in the following situation:

1) thread1 and thread2 try to retrieve a connection to endpoint1 and
endpoint2 respectively through SingleEndpointPool.

2) both threads fail to connect so the method onFailedToConnect of
ConnectCompletionHandler is invoked.

3) this generate a synchronized section which is locked based on the
endpoint1 and endpoint2 respectively.

4) Within the synchronized piece of code the handlers of failure to connect
are recursively invoked. In the handling managed by the app which thread1
tries to reconnect to endpoint2 and thread1 to endpoint2.

5) When SingleEndpointPool is taken (method take) a new synchronized area
is generated. But both threads get hang because thread1 is waiting for the
lock of endpoint2 to be freed (taken by thread2 in onFailedToConnect) and
thread2 is waiting for endpoint1 to be freed (taken by thread1 in
onFailedToConnect).

This situation is generated in the scenario where several addresses are
resolved through dns for the domain and they are not returned in the same
order so when the connect failure arises a reconnection is attempted using
the following ip in the ip list resolved by the dns.

I copy a part of the thread dump:

this is the stacktrace wich results in the deadlock:

"[na-stg-mcd].http.requester.GatewayRequestor(4) SelectorRunner":
* at
org.glassfish.grizzly.connectionpool.SingleEndpointPool.take(SingleEndpointPool.java:755)*
* - waiting to lock <0x000000068d7c7348> (a java.lang.Object)*
* at
org.glassfish.grizzly.connectionpool.MultiEndpointPool.take(MultiEndpointPool.java:592)*
at
com.ning.http.client.providers.grizzly.ConnectionManager.openAsync(ConnectionManager.java:143)
at
com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider.execute(GrizzlyAsyncHttpProvider.java:174)
at
com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
at
org.mule.module.http.internal.request.grizzly.GrizzlyHttpClient.send(GrizzlyHttpClient.java:295)
at
com.mulesoft.mule.http.request.grizzly.EEGrizzlyHttpClient.access$1(EEGrizzlyHttpClient.java:1)
at
com.mulesoft.mule.http.request.grizzly.EEGrizzlyHttpClient$1.onFailure(EEGrizzlyHttpClient.java:150)
at
com.mulesoft.mule.http.request.grizzly.EEGrizzlyHttpClient$1.onFailure(EEGrizzlyHttpClient.java:1)
at
org.mule.module.http.internal.request.grizzly.GrizzlyHttpClient$WorkManagerSourceAsyncCompletionHandler.onThrowable(GrizzlyHttpClient.java:327)
at
com.ning.http.client.providers.grizzly.GrizzlyResponseFuture.failed(GrizzlyResponseFuture.java:159)
at
org.glassfish.grizzly.impl.SafeFutureImpl.notifyCompletionHandlers(SafeFutureImpl.java:187)
at org.glassfish.grizzly.impl.SafeFutureImpl.done(SafeFutureImpl.java:277)
at
org.glassfish.grizzly.impl.SafeFutureImpl$Sync.innerSetException(SafeFutureImpl.java:382)
at
org.glassfish.grizzly.impl.SafeFutureImpl.failure(SafeFutureImpl.java:122)
at
com.ning.http.client.providers.grizzly.GrizzlyResponseFuture.abort(GrizzlyResponseFuture.java:72)
at
com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$1.failed(GrizzlyAsyncHttpProvider.java:138)
at org.glassfish.grizzly.utils.Futures.notifyFailure(Futures.java:129)
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool.notifyAsyncPollersOfFailure(SingleEndpointPool.java:1181)
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool.access$1400(SingleEndpointPool.java:101)
at
*org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.onFailedToConnect(SingleEndpointPool.java:1313)*
* - locked <0x000000068d6d6770> (a java.lang.Object)*
* at
org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.failed(SingleEndpointPool.java:1289)*
at
org.glassfish.grizzly.impl.SafeFutureImpl.notifyCompletionHandlers(SafeFutureImpl.java:187)
at org.glassfish.grizzly.impl.SafeFutureImpl.done(SafeFutureImpl.java:277)
at
org.glassfish.grizzly.impl.SafeFutureImpl$Sync.innerSetException(SafeFutureImpl.java:382)
at
org.glassfish.grizzly.impl.SafeFutureImpl.failure(SafeFutureImpl.java:122)
at
org.glassfish.grizzly.utils.CompletionHandlerAdapter.failed(CompletionHandlerAdapter.java:104)
at
org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler.abortConnection(TCPNIOConnectorHandler.java:268)
at
org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler.onConnectedAsync(TCPNIOConnectorHandler.java:221)
at
org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler$1.connected(TCPNIOConnectorHandler.java:158)
at
org.glassfish.grizzly.nio.transport.TCPNIOConnection.onConnect(TCPNIOConnection.java:258)
at
org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:517)
at
org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)
at
org.mule.module.http.internal.request.grizzly.FlowWorkManagerIOStrategy.run0(FlowWorkManagerIOStrategy.java:134)
at
org.mule.module.http.internal.request.grizzly.FlowWorkManagerIOStrategy.executeIoEvent(FlowWorkManagerIOStrategy.java:77)
at
org.glassfish.grizzly.strategies.AbstractIOStrategy.executeIoEvent(AbstractIOStrategy.java:89)
at
org.glassfish.grizzly.nio.SelectorRunner.iterateKeyEvents(SelectorRunner.java:415)
at
org.glassfish.grizzly.nio.SelectorRunner.iterateKeys(SelectorRunner.java:384)
at
org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:348)
at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:279)
at
org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:591)
at
org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:571)
at java.lang.Thread.run(Thread.java:745)
"[na-stg-mcd].http.requester.GatewayRequestor(1) SelectorRunner":
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool.take(SingleEndpointPool.java:755)
- waiting to lock <0x000000068d6d6770> (a java.lang.Object)
at
org.glassfish.grizzly.connectionpool.MultiEndpointPool.take(MultiEndpointPool.java:592)
at
com.ning.http.client.providers.grizzly.ConnectionManager.openAsync(ConnectionManager.java:143)
at
com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider.execute(GrizzlyAsyncHttpProvider.java:174)
at
com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
at
org.mule.module.http.internal.request.grizzly.GrizzlyHttpClient.send(GrizzlyHttpClient.java:295)
at
com.mulesoft.mule.http.request.grizzly.EEGrizzlyHttpClient.access$1(EEGrizzlyHttpClient.java:1)
at
com.mulesoft.mule.http.request.grizzly.EEGrizzlyHttpClient$1.onFailure(EEGrizzlyHttpClient.java:150)
at
com.mulesoft.mule.http.request.grizzly.EEGrizzlyHttpClient$1.onFailure(EEGrizzlyHttpClient.java:1)
at
org.mule.module.http.internal.request.grizzly.GrizzlyHttpClient$WorkManagerSourceAsyncCompletionHandler.onThrowable(GrizzlyHttpClient.java:327)
at
com.ning.http.client.providers.grizzly.GrizzlyResponseFuture.failed(GrizzlyResponseFuture.java:159)
at
org.glassfish.grizzly.impl.SafeFutureImpl.notifyCompletionHandlers(SafeFutureImpl.java:187)
at org.glassfish.grizzly.impl.SafeFutureImpl.done(SafeFutureImpl.java:277)
at
org.glassfish.grizzly.impl.SafeFutureImpl$Sync.innerSetException(SafeFutureImpl.java:382)
at
org.glassfish.grizzly.impl.SafeFutureImpl.failure(SafeFutureImpl.java:122)
at
com.ning.http.client.providers.grizzly.GrizzlyResponseFuture.abort(GrizzlyResponseFuture.java:72)
at
com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$1.failed(GrizzlyAsyncHttpProvider.java:138)
at org.glassfish.grizzly.utils.Futures.notifyFailure(Futures.java:129)
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool.notifyAsyncPollersOfFailure(SingleEndpointPool.java:1181)
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool.access$1400(SingleEndpointPool.java:101)
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.onFailedToConnect(SingleEndpointPool.java:1313)
- locked <0x000000068d7c7348> (a java.lang.Object)
at
org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.failed(SingleEndpointPool.java:1289)
at
org.glassfish.grizzly.impl.SafeFutureImpl.notifyCompletionHandlers(SafeFutureImpl.java:187)

Found one Java-level deadlock:
=============================
"[na-stg-mcd].https-lc-0.0.0.0-8082.worker.9480":
  waiting to lock monitor 0x00007fc330286fe8 (object 0x000000068d6d6770, a
java.lang.Object),
  which is held by "[na-stg-mcd].http.requester.GatewayRequestor(4)
SelectorRunner"
"[na-stg-mcd].http.requester.GatewayRequestor(4) SelectorRunner":
  waiting to lock monitor 0x00007fc3323b0e68 (object 0x000000068d7c7348, a
java.lang.Object),
  which is held by "[na-stg-mcd].http.requester.GatewayRequestor(1)
SelectorRunner"
"[na-stg-mcd].http.requester.GatewayRequestor(1) SelectorRunner":
  waiting to lock monitor 0x00007fc330286fe8 (object 0x000000068d6d6770, a
java.lang.Object),
  which is held by "[na-stg-mcd].http.requester.GatewayRequestor(4)
SelectorRunner"