users@glassfish.java.net

10-Minute Instance Startups after adding -DJMXCONNECTOR_TIMEOUT_MILLISEC

From: <glassfish_at_javadesktop.org>
Date: Tue, 23 Mar 2010 06:43:44 PDT

System Configuration: Sun Microsystems sun4v Sun Blade T6320 Server Module
Memory size: 2048 Megabytes
Sun Java System Application Server Enterprise Edition 8.2 (build b42-p09)

After we patched to b42-p09, synchronization between node and domain began to fail. Sun recommended that we add <property name="INSTANCE-SYNC-JVM-OPTIONS" value="-DJMXCONNECTOR_TIMEOUT_MILLISEC=30000"/> to our nodeagent configuration. This corrected the synchronization problem, but instance startups now take 10+ minutes, and multiple errors appear in the server.log for both the domain and the instance:

Domain server.log:
[#|2010-03-19T16:08:26.111-0400|WARNING|sun-appserver-ee8.2|javax.management.remote.misc|_ThreadID=15;|Failed to restart: java.io.IOException: Failed to get a RMI stub: javax.naming.NameNotFoundException: management/rmi-jmx-connector|#]

[#|2010-03-19T16:08:33.305-0400|WARNING|sun-appserver-ee8.2|javax.enterprise.system.stream.err|_ThreadID=16;|java.rmi.ConnectException: Connection refused to host: localhost; nested exception is:
        java.net.ConnectException: Connection refused
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:574)
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:185)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:94)
        at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl_Stub.getDefaultDomain(Unknown Source)
        at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getDefaultDomain(RMIConnector.java:997)
        at com.sun.enterprise.ee.admin.clientreg.JMXConnectorRegistry$CVTask.run(JMXConnectorRegistry.java:479)
Caused by: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:520)
        at java.net.Socket.connect(Socket.java:470)
        at java.net.Socket.<init>(Socket.java:367)
        at java.net.Socket.<init>(Socket.java:180)
        at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
        at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:569)
        ... 7 more
|#]

The instance does continue to start up after this point, but it takes 10+ minutes to get here.

If I run netstat during synchronization, I see a large number of TIME_WAIT entries with both source and destination of localhost, so I suspect the node is trying to connect to the RMI/JMX connector on the domain, which hasn't started properly.
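
For anyone who wants to check the same suspicion without reading netstat output, here is a minimal sketch of a probe that tests whether anything is accepting TCP connections on the connector port. The class name JmxPortProbe is made up for illustration, and the default port 8686 is an assumption; substitute whatever port your jmx-connector is configured to use. It only confirms that the port is open, not that the JMX connector behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the application server.
class JmxPortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // "Connection refused" and connect timeouts both end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // 8686 is an assumed port; pass the real connector port as the second argument.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8686;
        boolean open = isListening(host, port, 2000);
        System.out.println(host + ":" + port + (open ? " is accepting connections" : " is NOT accepting connections"));
    }
}
```

Running it a few times while the instance is starting (for example, java JmxPortProbe localhost 8686) should show whether the port stays closed for the whole 10 minutes or only opens late.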

I have tried enabling and disabling SSL between node and domain, and I have tried localhost, 127.0.0.1, and the server's IP address as the domain host in my nodeagent configuration. All had the same outcome.

Any ideas?
[Message sent by forum member 'jvermast']

http://forums.java.net/jive/thread.jspa?messageID=393311