Re: TCPSelectorHandler$1 leaking

From: Jeanfrancois Arcand <Jeanfrancois.Arcand_at_Sun.COM>
Date: Fri, 26 Jun 2009 17:33:15 -0400

Scott Oaks wrote:
> On 06/26/09 15:29, Jeanfrancois Arcand wrote:
>> Salut,
>>
>> Scott Oaks wrote:
>>> On 06/26/09 10:26, Jeanfrancois Arcand wrote:
>>>> Salut,
>>>>
>>>> Oleksiy Stashok wrote:
>>>>> Hi,
>>>>>
>>>>>>>>> Hi Scott,
>>>>>>>>>> Sure thing, but I'm slightly confused by the wording -- do I
>>>>>>>>>> need to get the new source and build, or is the latest
>>>>>>>>>> snapshot download already built from the new source?
>>>>>>>>> To be sure, I'd prefer to try from sources, because snapshots
>>>>>>>>> could be produced with delays, and Hudson is not so stable.
>>>>>>>>
>>>>>>>> Alexey, I saw your commit...that will for sure fix the issue,
>>>>>>>> but we also need to find why we leak so badly when the pending I/O
>>>>>>>> is executed by the thread pool. I suspect this could be related
>>>>>>>> to our thread-count number (and all the issues we are observing
>>>>>>>> right now with Executors ;-))
>>>>>>> I'm not even sure there is a leak, because during a stress test
>>>>>>> we may load the thread pool so hard that it executes pending tasks
>>>>>>> more slowly than we add them. So the number of tasks grows continuously.
>>>>>>
>>>>>> Well, Scott measured 8 million TCPSelectorHandler$1
>>>>>> instances... I seriously think this is a major leak. It is not
>>>>>> normal to see all those instances, IMO.
>>>>>>
>>>>>> We need to rethink the thread pool... we see too many
>>>>>> regressions right now. Will start a new thread.
>>>>> We can use the kernel thread pool to execute pending tasks.
>>>>
>>>> That would make sense, as this uses a CachedThreadPool (quite scary
>>>> still). But this is dangerous IMO. We need to find out why 8 million
>>>> instances of that class were there in the first place. Working with Scott....
>>>
>>> I took the sources from earlier this morning and built the 1.9.17
>>> SNAPSHOT, and with that, I don't see the problem with 8 million of
>>> the TCPSelectorHandler$1 getting created -- there aren't really major
>>> GC issues at all any more. Still a regression from V2, but that's
>>> another story...
>>>
>>> My understanding is that Alexey backed out his proposed fix
>>> yesterday, so presumably something else has fixed this? Or maybe
>>> there's something I've yet to discover.
>>
>> What Alexey did was turn off the use of a dedicated thread to close I/O
>> operations... so we were back to what we always used in v2/v3. I'm about to
>> commit a fix that will allow configuring the behavior using a System
>> property. I will leave the current mechanism turned off, but it would be
>> nice (once you have a chance) to test using:
>>
>>
>> -Dcom.sun.grizzly.finishIOUsingCurrentThread=false
>>
>> which will turn on the mechanism that created 8 million Runnables.
>
> I have tested with that flag set to true and false, and I still am
> unable to reproduce the error. I can only conclude at this point that
> some other change between 1.9.15a and now has fixed the issue and that
> the pending tasks aren't being blocked anymore so they don't build up.
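
For anyone following along, here is roughly how a boolean switch like the one quoted above can be read from a System property. This is only a sketch and not the code I committed: the class and method names are hypothetical, and it assumes that an unset property means "true" (finish the I/O on the current thread), so the dedicated-thread mechanism only turns on with an explicit =false.

```java
// Hypothetical helper for illustration; not taken from the Grizzly sources.
final class FinishIoFlag {
    // Property name as quoted in this thread.
    static final String PROPERTY = "com.sun.grizzly.finishIOUsingCurrentThread";

    // Assumption: a missing property means "true". Boolean.parseBoolean
    // returns true only for "true" (case-insensitive), so any other
    // non-null value is treated as false.
    static boolean parse(String raw) {
        return raw == null || Boolean.parseBoolean(raw.trim());
    }

    // Reads the flag from the JVM's system properties.
    static boolean finishIoUsingCurrentThread() {
        return parse(System.getProperty(PROPERTY));
    }

    public static void main(String[] args) {
        System.out.println(PROPERTY + " = " + finishIoUsingCurrentThread());
    }
}
```

Under that assumption, starting the JVM with -Dcom.sun.grizzly.finishIOUsingCurrentThread=false makes finishIoUsingCurrentThread() return false.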

I think we have fixed the issue by using the Controller's internal thread
pool (called the kernel thread pool). At least that is what I committed
earlier today :-)
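
To illustrate the point Alexey raised earlier in the thread (a pool that executes pending tasks more slowly than they are added will pile up Runnable instances, which looks like a leak in a heap histogram), here is a minimal stand-alone sketch using plain java.util.concurrent. It is not Grizzly code; the class name and the single blocked worker are assumptions made only to keep the effect deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Grizzly.
final class PendingTaskBacklog {

    // Blocks the pool's only worker, submits `tasks` no-op Runnables, and
    // returns how many of them are still waiting in the unbounded queue.
    static int backlogAfterSubmitting(int tasks) {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        try {
            // Occupy the single worker so nothing else can run.
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();
            // Every task submitted now is only queued, never executed.
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> { });
            }
            return pool.getQueue().size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Unblock the worker and let the pool drain and stop.
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("queued tasks: " + backlogAfterSubmitting(100000));
    }
}
```

With an unbounded queue nothing pushes back on the submitter, so the backlog is limited only by the heap. Whether that is what actually happened in 1.9.15a is not established in this thread.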

>
> I could spend some time seeing just what...but what are the plans to
> integrate something new into glassfish? If glassfish will move to 1.9.17
> or later soon, maybe it's not worth it.

Yes, we have:

* http://is.gd/1ayBU

There is one MQ-related issue that we want to clear before cutting 1.9.17. So
I think next week we should have 1.9.17 integrated in v3.

-- Jeanfrancois

>
> -Scott