users@grizzly.java.net

Re: CLOSE_WAIT connections (was: Comet context doesn't expire)

From: Oleksiy Stashok <Oleksiy.Stashok_at_Sun.COM>
Date: Thu, 13 Aug 2009 16:23:47 +0200

Hi,

Is it possible for you to create a simple Comet application, and possibly
a client, that reproduces the issue?

Thanks.

WBR,
Alexey.

On Aug 13, 2009, at 15:24 , Jussi Kuosa wrote:

>
> Hello again,
> we have dug out more information about our problems...
>
>>>> comet selector spin problem...
> ...
>>> OK that one is now fixed with grizzly-1.0.30-SNAPSHOT:
> ...
>> We patched our linux and windows servers with 1.0.30. Now our windows
>> cluster and linux single-node system test environment have started to
>> gather TCP connections in CLOSE_WAIT state that are not cleared even
>> though the client processes have gone away ages ago.
> ...
>> They seem to come in batches and irregularly...
>
> We have found out at least one cause for this behavior. Because we
> misunderstood the expiration interval reset on server-side push (below),
> our client has a timeout that kills the CONNECT HTTP connection during
> the expiration wait period. The client sends a FIN and gets an ACK from
> the server, so the c->s side is closed. For some reason, the server does
> not notice that the client has begun to close the connection and the end
> result is that on the client the connection waits in FIN_WAIT_2 state
> and the server has a CLOSE_WAIT connection.
>
> My understanding is that onTerminate() should be called when the client
> goes away during the sleep period? Am I correct, or does the client have
> to reset the TCP connection with RST?
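
For what it's worth, in plain Java NIO (leaving aside whatever Grizzly
does on top of it) a client's FIN does not need an RST to be noticed: it
shows up on the server as read() returning -1, and the socket then stays
in CLOSE_WAIT until the server closes its own end. The sketch below only
illustrates that general mechanism on a loopback connection. The class and
method names are made up for illustration, the 419-byte payload merely
echoes the Recv-Q figure from the netstat output, and it assumes Java 7 or
later for shutdownOutput() on SocketChannel. It says nothing about when
Grizzly calls onTerminate().

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of Grizzly.
class HalfCloseDemo {

    /**
     * Reads from a blocking channel until the peer's FIN surfaces as
     * end-of-stream (read() == -1), then closes our side so the socket
     * can leave CLOSE_WAIT. Returns the number of bytes drained.
     */
    static int drainAndClose(SocketChannel ch) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        int total = 0;
        try {
            int n;
            while ((n = ch.read(buf)) != -1) {
                total += n;
                buf.clear(); // discard the data, we only count it
            }
        } finally {
            ch.close(); // without this the server side would sit in CLOSE_WAIT
        }
        return total;
    }

    /** Client writes 419 bytes, half-closes, and the server drains them. */
    static int runDemo() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                SocketChannel accepted = server.accept();
                client.write(ByteBuffer.wrap(new byte[419]));
                client.shutdownOutput(); // sends FIN, client -> server direction is closed
                return drainAndClose(accepted);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("drained " + runDemo() + " bytes before EOF");
    }
}
```

Run as-is, this prints "drained 419 bytes before EOF". If the server never
reads the socket again after the FIN arrives, neither the pending bytes
nor the end-of-stream are ever consumed, which would be consistent with a
non-empty Recv-Q on a CLOSE_WAIT connection.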
>
> After this sequence, the server has:
> # netstat -a | grep 3289
> tcp6 419 0 server:8282 client:3289 CLOSE_WAIT
>
> Notice that there is still data left in the Recv-Q (419 bytes) that the
> server process apparently never read.
> The server configuration is:
> * single-node GF 2.1-60e
> * JDK 1.6.0_16 (32-bit)
> * 32-bit debian 4
> * VMware VM with single-core ~2.3GHz Xeon with 2GB memory.
>
> The client side has:
>> netstat -a
> TCP server:8282 client:4005 FIN_WAIT_2
>
> I will gladly provide the network capture privately, if needed.
>
>> We were unaware of (2) and presumed that the client expiration delays
>> would not be extended on every push. In addition we do not send the push
>> data to every connected client within a channel. Therefore we have
>> identified a way to push data to a few active clients that causes them
>> to reconnect and receive additional push data within the expiration
>> delay. This causes other connected clients to constantly have their
>> expiration delays reset and therefore onInterrupt doesn't get called.
>> Eventually these clients will have a client-side timeout. The situation
>> is cleared once the few clients stop receiving push data on every
>> CONNECT.
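
The effect described in the quoted paragraph can be sketched with a toy
model. This is not Grizzly code: the class, its methods, and the 30-second
delay are all invented for illustration, and it assumes (as the paragraph
describes) that a push on a channel resets the expiration timer of every
suspended client on that channel, whether or not that client receives
data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of "expiration delay is reset on every push".
class ExpirationResetModel {
    private final long expirationDelayMs;
    // client id -> time (ms) at which its expiration timer was last reset
    private final Map<String, Long> lastReset = new HashMap<>();

    ExpirationResetModel(long expirationDelayMs) {
        this.expirationDelayMs = expirationDelayMs;
    }

    void connect(String client, long now) {
        lastReset.put(client, now);
    }

    /** Assumption: any push resets the timer of every suspended client. */
    void push(long now) {
        lastReset.replaceAll((client, t) -> now);
    }

    /** Removes and returns the clients whose delay has elapsed at 'now'. */
    List<String> expire(long now) {
        List<String> expired = new ArrayList<>();
        lastReset.entrySet().removeIf(e -> {
            boolean due = now - e.getValue() >= expirationDelayMs;
            if (due) {
                expired.add(e.getKey());
            }
            return due;
        });
        return expired;
    }

    /** One idle client, 60 simulated seconds, optional push every 10 s. */
    static List<String> simulate(boolean withPushes) {
        ExpirationResetModel model = new ExpirationResetModel(30_000);
        model.connect("idle", 0);
        List<String> expired = new ArrayList<>();
        for (long t = 10_000; t <= 60_000; t += 10_000) {
            if (withPushes) {
                model.push(t);
            }
            expired.addAll(model.expire(t));
        }
        return expired;
    }

    public static void main(String[] args) {
        System.out.println("without pushes: " + simulate(false));
        System.out.println("with pushes:    " + simulate(true));
    }
}
```

In this model the idle client expires at the 30-second mark when nothing
is pushed, but never expires while pushes keep arriving more often than
the delay, so only a client-side timeout would end the connection.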
>
> Best regards,
>
> Jussi Kuosa
> --
> View this message in context: http://www.nabble.com/Comet-context-doesn%27t-expire-tp24072882p24954380.html