dev@grizzly.java.net

Re: TCPSelectorHandler$1 leaking

From: Scott Oaks <Scott.Oaks_at_Sun.COM>
Date: Fri, 26 Jun 2009 15:25:09 -0400

On 06/26/09 10:26, Jeanfrancois Arcand wrote:
> Salut,
>
> Oleksiy Stashok wrote:
>> Hi,
>>
>>>>>> Hi Scott,
>>>>>>> Sure thing, but I'm slightly confused by the wording -- do I need
>>>>>>> to get the new source and build, or is the latest snapshot
>>>>>>> download already built from the new source?
>>>>>> To be sure, I'd prefer building from sources, because snapshots
>>>>>> can be produced with delays, and Hudson is not so stable.
>>>>>
>>>>> Alexey, I saw your commit... that will for sure fix the issue, but
>>>>> we also need to find out why we leak so badly when the pending I/O
>>>>> is executed by the thread pool. I suspect this could be related to
>>>>> our thread-count number (and all the issues we are observing right
>>>>> now with Executors ;-))
>>>> I'm not even sure there is a leak, because during a stress test we
>>>> may load the thread pool so hard that it executes pending tasks more
>>>> slowly than we add them, so the number of tasks grows continuously.
>>>
>>> Well, Scott measured 8 million TCPSelectorHandler$1 instances... I
>>> seriously think this is a major leak. It is not normal to see all
>>> those instances IMO.
>>>
>>> We need to rethink the thread pool... we are seeing too many
>>> regressions right now. Will start a new thread.
>> We can use the kernel thread pool to execute pending tasks.
>
> That would make sense, as this uses a CachedThreadPool (still quite
> scary). But this is dangerous IMO. We need to find out why 8 million
> instances of that class were there in the first place. Working with
> Scott....

I took the sources from earlier this morning and built the 1.9.17
SNAPSHOT, and with that, I don't see the problem of 8 million
TCPSelectorHandler$1 instances getting created -- there aren't really
any major GC issues any more. It's still a regression from V2, but
that's another story...

My understanding is that Alexey backed out his proposed fix yesterday,
so presumably something else has fixed this? Or maybe there's something
I've yet to discover.
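
For reference, Alexey's point earlier in the thread (a pool that falls
behind its producers, rather than a true leak) can be sketched roughly as
below. This is a stand-alone illustration and not Grizzly code: the class
name PendingTaskBacklog, the single-thread pool, and the latch that stalls
the worker are all assumptions made for the demo. Each pending task is an
anonymous Runnable, so in a heap histogram the backlog would show up as
many instances of PendingTaskBacklog$1, much like TCPSelectorHandler$1.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; nothing here comes from the Grizzly sources.
public class PendingTaskBacklog {

    // Submits `tasks` runnables to a one-thread pool whose worker is stalled,
    // then reports how many are still waiting in the unbounded queue.
    static int backlogAfter(int tasks) throws InterruptedException {
        final LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
        final ThreadPoolExecutor pool =
                new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, queue);
        // The latch stands in for "the pool is slower than the producer".
        final CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                // Anonymous class: every queued task is one retained instance.
                pool.execute(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
            // The worker holds the first task; the rest sit in the queue.
            return queue.size();
        } finally {
            release.countDown();   // let the worker drain the backlog
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("tasks still queued: " + backlogAfter(100000));
    }
}
```

Nothing is leaked in the strict sense here: once the latch is released the
queue drains and the instances become collectable. If the instances stayed
reachable after the load stopped, that would point to a real leak instead.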

-Scott