dev@grizzly.java.net

Re: closed connection under high load

From: Mkrtchyan, Tigran <tigran.mkrtchyan_at_desy.de>
Date: Fri, 7 Mar 2014 16:36:55 +0100 (CET)

Hi Alexey,

I got my test nodes up and ran some tests.
As you suggested, I added a ConnectionProbe.
Additionally, I used SameThreadStrategy and
limited the number of threads to two, which
makes the effect visible without long runs.

It looks like the IO is dominated by a single connection.
I monitor three probes: onWriteEvent, onReadEvent and
onIOEventReadyEvent. While I see ACCEPT from all clients,
READ and WRITE come only from a single client.
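
For reference, the probe I registered looks roughly like this (a sketch from
memory, based on the ConnectionProbe.Adapter callbacks in Grizzly 2.3; the
exact signatures may differ):

    import org.glassfish.grizzly.Buffer;
    import org.glassfish.grizzly.Connection;
    import org.glassfish.grizzly.ConnectionProbe;
    import org.glassfish.grizzly.IOEvent;

    // Registered for each accepted connection: log read/write/IO-ready
    // activity so we can see which clients actually get serviced.
    connection.getMonitoringConfig().addProbes(new ConnectionProbe.Adapter() {
        @Override
        public void onReadEvent(Connection connection, Buffer data, int size) {
            System.out.println(System.currentTimeMillis()
                    + " READ  " + connection.getPeerAddress() + " " + size + " bytes");
        }

        @Override
        public void onWriteEvent(Connection connection, Buffer data, long size) {
            System.out.println(System.currentTimeMillis()
                    + " WRITE " + connection.getPeerAddress() + " " + size + " bytes");
        }

        @Override
        public void onIOEventReadyEvent(Connection connection, IOEvent ioEvent) {
            System.out.println(System.currentTimeMillis()
                    + " READY " + connection.getPeerAddress() + " " + ioEvent);
        }
    });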

Tigran.

----- Original Message -----
> From: "Oleksiy Stashok" <oleksiy.stashok_at_oracle.com>
> To: dev_at_grizzly.java.net
> Sent: Friday, March 7, 2014 2:39:22 AM
> Subject: Re: closed connection under high load
>
>
> On 06.03.14 01:29, Mkrtchyan, Tigran wrote:
> >
> > ----- Original Message -----
> >> From: "Oleksiy Stashok" <oleksiy.stashok_at_oracle.com>
> >> To: dev_at_grizzly.java.net
> >> Sent: Thursday, March 6, 2014 2:35:00 AM
> >> Subject: Re: closed connection under high load
> >>
> >> Hi Tigran,
> >>
> >>>> Hi Tigran,
> >>>>
> >>>> Looks like the connection is getting closed because of an IOException or -1
> >>>> returned during channel.read(), which is expected.
> >>>> So IMO it's nothing decided by Grizzly; it's either the other side or the
> >>>> local OS that closed the connection.
> >>> Yes, indeed, the client closes the connection. But it looks like the
> >>> reason is that the reply from the server is sent 15 sec after the request
> >>> arrived. The filter itself took at most 800 ms. Any ideas how to trace
> >>> where the rest of the time was spent?
> >> You can register a ConnectionProbe for each Connection:
> >>
> >>     connection.getMonitoringConfig().addProbes(new ConnectionProbe() {
> >>         .....
> >>     });
> >>
> >> and monitor the different Connection events and the time when they happened.
> >> It might be interesting to monitor onWriteEvent() and see whether the write
> >> delay occurs at the Grizzly/Java or OS level. For example, Grizzly may report
> >> that socketChannel.write(ByteBuffer) has accepted the ByteBuffer, but the
> >> peer didn't get this data, which means the delay happens in the OS layer.
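> >>
> >> Something along these lines could work (just a sketch; WriteDelayProbe and
> >> replyEnqueued() are made-up names, not Grizzly API):
> >>
> >>     import java.util.concurrent.ConcurrentHashMap;
> >>     import java.util.concurrent.ConcurrentMap;
> >>     import org.glassfish.grizzly.Buffer;
> >>     import org.glassfish.grizzly.Connection;
> >>     import org.glassfish.grizzly.ConnectionProbe;
> >>
> >>     // Records when the application enqueued a reply and logs how long it
> >>     // took until Grizzly reported the corresponding write.
> >>     public final class WriteDelayProbe extends ConnectionProbe.Adapter {
> >>         private final ConcurrentMap<Connection, Long> enqueuedAt =
> >>                 new ConcurrentHashMap<Connection, Long>();
> >>
> >>         // Call from the filter right before writing the reply.
> >>         public void replyEnqueued(Connection connection) {
> >>             enqueuedAt.put(connection, System.nanoTime());
> >>         }
> >>
> >>         @Override
> >>         public void onWriteEvent(Connection connection, Buffer data, long size) {
> >>             final Long t0 = enqueuedAt.remove(connection);
> >>             if (t0 != null) {
> >>                 final long delayMs = (System.nanoTime() - t0) / 1000000L;
> >>                 // A large delay here means the time was spent in Grizzly/Java
> >>                 // before the data reached socketChannel.write(); a small delay
> >>                 // while the client still waits points at OS/network buffers.
> >>                 System.out.println("write delay " + delayMs + " ms for " + connection);
> >>             }
> >>         }
> >>     }
> >>
> >>     // register it: connection.getMonitoringConfig().addProbes(probe);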
> > I was logging in my code where I start to process the request, and this
> > happened with quite a delay. Unfortunately, my workload nodes are turned
> > off till the end of the week as we update the cooling infrastructure in
> > our data center, but I will continue the investigation next week. Is there
> > any best practice for the ratio between the number of selector and worker
> > threads?
> By default we'd suggest selector-threads-count=CPU-count and
> worker-threads-count=CPU-count*2.
> I'd also recommend using SameThreadStrategy (no worker threads) where
> possible. For example, if the app logic doesn't have any blocking or
> waiting kind of code, there is no reason to pay for a thread context
> switch.
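>
> As a rough sketch of such a configuration (builder and method names as I
> remember them from Grizzly 2.3, so please double-check):
>
>     import org.glassfish.grizzly.nio.transport.TCPNIOTransport;
>     import org.glassfish.grizzly.nio.transport.TCPNIOTransportBuilder;
>     import org.glassfish.grizzly.strategies.SameThreadIOStrategy;
>     import org.glassfish.grizzly.threadpool.ThreadPoolConfig;
>
>     final int cpus = Runtime.getRuntime().availableProcessors();
>
>     final TCPNIOTransport transport = TCPNIOTransportBuilder.newInstance()
>             // one selector runner per CPU
>             .setSelectorRunnersCount(cpus)
>             // CPU-count * 2 worker threads (used by worker-thread strategies)
>             .setWorkerThreadPoolConfig(ThreadPoolConfig.defaultConfig()
>                     .setCorePoolSize(cpus * 2)
>                     .setMaxPoolSize(cpus * 2))
>             // or run everything on the selector threads if filters never block
>             .setIOStrategy(SameThreadIOStrategy.getInstance())
>             .build();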
>
> >
> > BTW, what do you think about the proposed change:
> > https://github.com/GrizzlyNIO/grizzly-mirror/pull/1
> Looks good.
> Will apply it.
>
> Thank you!
>
> WBR,
> Alexey.
>
>
> >
> > Tigran.
> >
> >> Thanks.
> >>
> >> WBR,
> >> Alexey.
> >>
> >>> Looking at the request log I can see that request processing is not
> >>> equally distributed. Some connections are processed quite actively, while
> >>> others are not. Is there a way I can see the selector threads' activity?
> >>>
> >>> Tigran.
> >>>
> >>>> Thanks.
> >>>>
> >>>> WBR,
> >>>> Alexey.
> >>>>
> >>>> On 26.02.14 14:06, Mkrtchyan, Tigran wrote:
> >>>>> Hi,
> >>>>>
> >>>>> we observe that Grizzly decides to close connections under high load.
> >>>>> Any idea how to find the reason?
> >>>>>
> >>>>> Thanks,
> >>>>> Tigran.
> >>>>>
> >>>>> java.io.IOException: Connection closed
> >>>>>     at org.glassfish.grizzly.asyncqueue.TaskQueue.onClose(TaskQueue.java:302) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.onClose(AbstractNIOAsyncQueueWriter.java:591) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.transport.TCPNIOTransport.closeConnection(TCPNIOTransport.java:389) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.NIOConnection$2.run(NIOConnection.java:439) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.DefaultSelectorHandler$RunnableTask.run(DefaultSelectorHandler.java:489) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.DefaultSelectorHandler.processPendingTaskQueue(DefaultSelectorHandler.java:295) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.DefaultSelectorHandler.processPendingTasks(DefaultSelectorHandler.java:286) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.DefaultSelectorHandler.preSelect(DefaultSelectorHandler.java:100) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:326) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:271) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> >>>>> Caused by: java.lang.Exception: Close stack trace
> >>>>>     at org.glassfish.grizzly.nio.NIOConnection.close0(NIOConnection.java:426) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.transport.TCPNIOConnection.close0(TCPNIOConnection.java:292) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.transport.TCPNIOTransport.read(TCPNIOTransport.java:648) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.transport.TCPNIOTransportFilter.handleRead(TCPNIOTransportFilter.java:75) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.filterchain.TransportFilter.handleRead(TransportFilter.java:173) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:288) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:206) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:136) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:114) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:546) [grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:113) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:115) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:55) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>>>>     at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:135) ~[grizzly-framework-2.3.5.jar:2.3.5]
> >>
>
>