users@glassfish.java.net

Problem with Java Web Start client when one node in cluster is down

From: <glassfish_at_javadesktop.org>
Date: Thu, 29 Mar 2007 15:07:25 PST

I have a cluster with 2 nodes. The DAS runs on the same machine as node 1; node 2 is on a different machine. I deployed an application to the cluster. I can launch the client with the Java Web Start link from either node in the cluster, and from the javaws cache viewer, and it all works fine.

When I take down node 2, the client will no longer start. It prints an exception to the console (which becomes unresponsive), spins for about a minute eating up all of the CPU on the client machine, and eventually pops up an error dialog. If I take down node 1 instead of node 2, everything works: I can launch the client that originally came from either node, and both will find node 2.

In the JNLP file, the property "com.sun.aas.jws.iiop.defaultHost" points to the node from which I originally launched the client, but this value doesn't seem to matter. The property "com.sun.aas.jws.iiop.failover.endpoints" lists node 2 first, followed by node 1, regardless of which server I launched from. It seems that when the first node in that list is down, a Web Start client cannot access the cluster.
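For reference, the two properties show up in the generated JNLP roughly like this (the host names and port numbers here are placeholders, not the actual values from my file):

```xml
<!-- Fragment of the generated JNLP resources section (hosts/ports are
     placeholders; 3700 is GlassFish's default IIOP listener port) -->
<resources>
  <property name="com.sun.aas.jws.iiop.defaultHost"
            value="node1.example.com"/>
  <property name="com.sun.aas.jws.iiop.failover.endpoints"
            value="node2.example.com:3700,node1.example.com:3700"/>
</resources>
```

Whichever node I launch from, node 2 always appears first in the failover.endpoints list.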

Further testing shows that the client always connects to node 2, even when both nodes are running. I am closing and re-launching the client each time, but it never changes nodes, regardless of the defaultHost property. I gave each node a weight of 100 when I set up the cluster.

Has anyone else tried something like this? Do I have something configured wrong? I am using GlassFish v2 build 40 on 32-bit Linux with Java 1.5.0_10. My client is running on a Mac using Apple's Java port, version 1.5.0_07.
[Message sent by forum member 'sarnoth' (sarnoth)]

http://forums.java.net/jive/thread.jspa?messageID=210529