users@grizzly.java.net

Re: SelectorRunner uses 100% CPU

From: Oleksiy Stashok <oleksiy.stashok_at_oracle.com>
Date: Tue, 22 Sep 2015 17:38:02 -0700

Hi Marc,

I just backported [1] and [2] to 2.2.x branch and published
2.2.23-SNAPSHOT, can you pls try it?
You can enable a system property as mentioned in [2] and see if it helps
to prevent 100% CPU consumption, but it could be also a consequence of OOM.
Once you hit OOM, could you take a memory dump, so we can investigate if
it's a leak, or we just hit machine limit.

Thanks.

WBR,
Alexey.

[1] https://java.net/jira/browse/GRIZZLY-1712
[2] https://java.net/jira/browse/GRIZZLY-1801



On 22.09.15 09:58, Marc Arens wrote:
> Hey Ryan,
>
> thanks for the fast reply. Complete system outage(OOM not able to create new
> threads) happens ~15 minutes after server start in the worst case. But i can't
> even say when exactly the SelectorRunner starts using so much CPU within that
> time frame. I'd be very thankful for backports of those fixes!
>
>> On 22 September 2015 at 18:21 Ryan Lubke <ryan.lubke_at_oracle.com> wrote:
>>
>>
>> It's hard to say if you're hitting the issue or not based on the
>> provided information.
>>
>> We can try back-porting the fix as well as changes to force selector
>> spin detection.
>>
>> How often is this occurring?
>>
>> Thanks,
>> -rl
>>
>>> Marc Arens <mailto:marc.arens_at_open-xchange.com>
>>> September 22, 2015 at 10:51
>>> Moin,
>>>
>>> we are seeing system outages at one deployment where one repeating
>>> point of the
>>> outages is a
>>>
>>> "Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800
>>> nid=0x23e6
>>> runnable [0x00007fec6e3aa000]
>>> java.lang.Thread.State: RUNNABLE
>>> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>>> - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
>>> - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
>>> - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
>>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>>> at
>>> org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
>>> at
>>> org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
>>> at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
>>> at
>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
>>> at
>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> which keeps a CPU busy at 100% for at least 15 seconds and probably
>>> even longer
>>> but those 15s is all our debug data covers at the moment.
>>>
>>> This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime
>>> Environment
>>> (rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build
>>> 24.79-b02,
>>> mixed mode).
>>>
>>> I found a 2.3.x related bug at
>>> https://java.net/jira/browse/GRIZZLY-1712 could
>>> it be this one we are hitting although it's 2.2.x? If yes, would it be
>>> possible
>>> to backport it to 2.2.x? Do you know about any other issues we might
>>> be hitting
>>> here?
>>>
>>> Best regards
>>> Marc
>>>
>>> --
>>> ...it was evening and it was morning and there were already two ways
>>> to store
>>> Unicode...