Re: Server stops responding due to Glassfish

From: Jeanfrancois Arcand <Jeanfrancois.Arcand_at_Sun.COM>
Date: Tue, 29 Apr 2008 21:00:01 -0400

Hi Ryan,

Thanks for the info. So far, the exceptions are expected. They just
mean the client closed the connection before the server had a chance to
write a response:

> Caused by: ClientAbortException: java.nio.channels.ClosedChannelException
> at org.apache.coyote.tomcat5.OutputBuffer.realWriteBytes(OutputBuffer.java:409)
> at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:417)
> at org.apache.coyote.tomcat5.OutputBuffer.doFlush(OutputBuffer.java:357)
> at org.apache.coyote.tomcat5.OutputBuffer.flush(OutputBuffer.java:335)
> at org.apache.coyote.tomcat5.CoyoteResponse.flushBuffer(CoyoteResponse.java:638)
> at org.apache.coyote.tomcat5.CoyoteResponseFacade.flushBuffer(CoyoteResponseFacade.java:291)
> at com.sun.faces.application.ViewHandlerImpl.renderView(ViewHandlerImpl.java:203)
> at com.sun.rave.web.ui.appbase.faces.ViewHandlerImpl.renderView(ViewHandlerImpl.java:320)
> at com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:106)
> ... 34 more
> Caused by: java.nio.channels.ClosedChannelException
> at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
> at com.sun.enterprise.web.connector.grizzly.OutputWriter.flushChannel(OutputWriter.java:94)
> at com.sun.enterprise.web.connector.grizzly.OutputWriter.flushChannel(OutputWriter.java:67)
> at com.sun.enterprise.web.connector.grizzly.SocketChannelOutputBuffer.flushChannel(SocketChannelOutputBuffer.java:167)
> at com.sun.enterprise.web.connector.grizzly.SocketChannelOutputBuffer.flushBuffer(SocketChannelOutputBuffer.java:202)
> at com.sun.enterprise.web.connector.grizzly.SocketChannelOutputBuffer.flush(SocketChannelOutputBuffer.java:178)
> at com.sun.enterprise.web.connector.grizzly.SocketChannelOutputBuffer.realWriteBytes(SocketChannelOutputBuffer.java:145)
> at org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:851)
> at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:149)
> at org.apache.coyote.http11.InternalOutputBuffer.doWrite(InternalOutputBuffer.java:626)
> at org.apache.coyote.Response.doWrite(Response.java:599)
> at org.apache.coyote.tomcat5.OutputBuffer.realWriteBytes(OutputBuffer.java:404)
> ... 42 more
> |#]
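
To illustrate the innermost "Caused by" above, here is a minimal sketch
(not GlassFish or Grizzly code) of how java.nio raises
ClosedChannelException when a write is attempted on a channel that has
already been closed. The class name ClosedChannelDemo and the loopback
setup are made up for the example, and it assumes the server side closed
its channel once the client went away, before the late flush happened.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of GlassFish or Grizzly.
public class ClosedChannelDemo {

    // Returns true if writing to an already-closed channel fails with
    // ClosedChannelException, as in the quoted stack trace.
    public static boolean writeAfterClose() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {

                client.close();    // the "browser" gives up and disconnects
                accepted.close();  // assumption: the server side closes its channel too

                try {
                    // A late flush of the response now hits a closed channel.
                    accepted.write(ByteBuffer.wrap(
                            "late response".getBytes(StandardCharsets.UTF_8)));
                    return false;
                } catch (ClosedChannelException expected) {
                    return true;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("ClosedChannelException raised: " + writeAfterClose());
    }
}
```

The exception only reports that the connection was gone by the time the
response was flushed, which is why it shows up in server.log without
being an error in the application itself.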

This exception isn't the cause of the hangs. Can you try something? Can
you take a jstack dump every 2 hours to see how it goes? Also, the NP Pool
problem will happen faster if more HTTP requests are made. I'm able to
reproduce it quite fast with a load of 300 users doing requests every
second; it takes less than 2 hours to break win32 :-)
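
If it helps, a jstack dump every 2 hours could be automated with
something like the sketch below. It is only an illustration: the class
name PeriodicJstack, the output file naming and the command-line
arguments are made up, and it assumes the JDK's jstack is on the PATH
and that the process runs as the same user and in the same session as
the server.

```java
import java.io.File;
import java.io.IOException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of GlassFish or the JDK.
public class PeriodicJstack {

    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    // Command line to run: jstack <pid> (assumes jstack is on the PATH).
    public static List<String> command(long pid) {
        return List.of("jstack", Long.toString(pid));
    }

    // One file per dump, so successive dumps can be compared later.
    public static String fileName(long pid, LocalDateTime when) {
        return "jstack-" + pid + "-" + STAMP.format(when) + ".txt";
    }

    // Runs jstack once; stdout and stderr both go to the dump file, so a
    // failure message from jstack is kept as well.
    public static void dumpOnce(long pid, File dir)
            throws IOException, InterruptedException {
        File out = new File(dir, fileName(pid, LocalDateTime.now()));
        Process p = new ProcessBuilder(command(pid))
                .redirectErrorStream(true)
                .redirectOutput(out)
                .start();
        p.waitFor();
    }

    // Usage: java PeriodicJstack <pid> [outputDir]
    public static void main(String[] args) {
        long pid = Long.parseLong(args[0]);
        File dir = new File(args.length > 1 ? args[1] : ".");
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        // First dump immediately, then one every 2 hours.
        timer.scheduleAtFixedRate(() -> {
            try {
                dumpOnce(pid, dir);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, 2, TimeUnit.HOURS);
    }
}
```

For example, "java PeriodicJstack 5180 dumps" would write a new file
into an existing dumps directory every 2 hours until the program is
stopped. Comparing the last few dumps before a hang should show how the
thread states change over time.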

Thanks

-- Jeanfrancois

Ryan de Laplante wrote:
> It went down again today! 5.5 hours since it went down last. This is a
> new record. It also has a comparatively low NP Pool count of 383K (I've
> seen it up to 2200K before), and is using only 304,504K memory. I
> forgot to try tweaking a setting in HTTP listener to see if it comes
> back to life or not. I did try to do a stack dump:
>
> > jstack 5180
> 5180: Not enough storage is available to process this command
>
> Then I tried using this tool to get a stack dump:
>
> http://www.adaptj.com/main/download
>
> 5180 java.exe session:0 threads:131 parent:5744
> The current version does not support processes running in a different
> session.
> Try any of the following options:
> 1) Run the StackTrace service in the same session with the target process.
> 2) Start the terminal client with "mstsc.exe /console"
> 3) Use VNC from http://www.realvnc.com/ as a remote client.
>
> Attached are some Grizzly and NIO channels related exceptions from
> server.log
>
> We've had to write a program that checks the server every 10 minutes and
> emails us when it goes down. We're also now going to restart GlassFish
> three times a week. Based on the discussions on this mailing list today
> about Linux users having these same problems, we are no longer convinced
> that it can be blamed on the Windows 2003 NP Pool leak. Yes there is a
> leak, but I think GlassFish has a serious problem too. We did not have
> this problem with JBoss on the same server and OS a year ago.
>
> Hopefully Sun will put more resources into this issue immediately. It
> is the only issue we've had to use our support contract for, and we seem
> to be getting nowhere with it after 6 months. My employer is not
> satisfied and I'm wondering if he will renew the contract, or switch app
> server vendors. This is a production server and it goes down all the time.
>
>
> Ryan
>
>
> Ryan de Laplante wrote:
>> glassfish_at_javadesktop.org wrote:
>>>> HTTP requests consistently stop reaching the web application
>>>>
>>>
>>> I see the same on my server (Linux), but not consistently and very rarely.
>>> In those cases none of my web applications are reachable, nor is the
>>> admin GUI. Nothing shows up in the log files.
>>>
>>> I think this must be an abnormal issue.
>>> I'm not familiar with this stuff, just guessing: could it be a problem
>>> with broken connections, I mean if the client/user aborts
>>> [Message sent by forum member 'hammoud' (hammoud)]
>>>
>>> http://forums.java.net/jive/thread.jspa?messageID=272085
>>>
>> This is concerning. Up until now I thought this problem was specific
>> to the Windows 2003 NP Pool leak. That might explain why I experience two
>> similar but different issues:
>>
>> 1) Every week or two the web container would stop serving requests.
>> Sometimes it would say "Maximum connections reached: 4096" even when
>> there were only a couple of hundred transactions a day. Other times
>> it would show nothing in the browser or not respond at all. My other
>> http listener for web services also stops working. Usually the web
>> admin console is the only http listener that is working.
>> Restarting the SJSAS 9.1 Windows service solves the problem.
>>
>> 2) Every few months I find that restarting SJSAS 9.1 Windows service
>> makes no difference. PostgreSQL also dies and you can't connect to
>> it anymore. The only solution is to reboot Windows.
>>
>> I think issue #2 is related to the Windows 2003 Server NP Pool leak
>> which may have been fixed now with Microsoft patches, but I doubt it
>> since we have to restart SJSAS 9.1 more often since installing the
>> patches. I think issue #1 is a GlassFish problem, since you experience
>> it on Linux and so does another poster in this forum.
>>
>>
>> Ryan