users@glassfish.java.net

Failover doesn't work while a cluster instance is in stopping mode

From: <glassfish_at_javadesktop.org>
Date: Wed, 19 Dec 2007 10:04:25 PST

Hi,

We tested cluster failover and found a problem that occurs while a server instance shutdown is in progress. We have a cluster with two instances on two different machines. A simple client calls an EJB method every second in a loop. We print debug output on both the server and the client. After starting the client, we look for the server instance that is actually receiving the calls. We then shut down this server instance with the GlassFish admin web console. In the client log we get the following exception:
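To make the test setup concrete, here is a minimal sketch of such a client loop. It is not our actual client: the PingService interface, the FailoverProbe class and the simulated service in main are hypothetical stand-ins, and in the real test the service reference would be the EJB stub obtained through a JNDI lookup.

```java
import java.rmi.RemoteException;
import java.rmi.ServerException;

public class FailoverProbe {

    /** Hypothetical stand-in for the remote business interface of the EJB. */
    public interface PingService {
        String ping() throws RemoteException;
    }

    /**
     * Calls the service the given number of times, pausing between calls,
     * logs every result and returns the number of failed calls.
     */
    public static int run(PingService service, int calls, long pauseMillis)
            throws InterruptedException {
        int failed = 0;
        for (int i = 1; i <= calls; i++) {
            try {
                System.out.println("call " + i + " answered by " + service.ping());
            } catch (RemoteException e) {
                failed++;
                System.out.println("call " + i + " FAILED: " + e);
            }
            Thread.sleep(pauseMillis);
        }
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated service (assumption for illustration only): calls 3 and 4
        // hit an instance that is shutting down, later calls reach the other one.
        int[] counter = {0};
        PingService simulated = () -> {
            int n = ++counter[0];
            if (n == 3 || n == 4) {
                throw new ServerException(
                        "RemoteException occurred in server thread",
                        new RemoteException("Attempt to invoke when container is in STOPPED"));
            }
            return n < 3 ? "instance1" : "instance2";
        };
        System.out.println("failed calls: " + run(simulated, 6, 10));
    }
}
```

In the real test the pause would be 1000 ms, and the name returned by ping() would tell us which instance served the call.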

java.rmi.ServerException: RemoteException occurred in server thread; nested exception is:
        java.rmi.RemoteException: Attempt to invoke when container is in STOPPED

Once the server is completely down, the client reconnects to the other server instance of the cluster, and from then on the calls succeed. But we lose some calls: in fact, all calls that are made while the server shutdown is in progress.
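One possible client-side workaround, shown only as a sketch, is to retry a call when the failure carries the "container is in STOPPED" message. The StoppedContainerRetry class and its method names are hypothetical, and matching on the message text is an assumption based on the exception shown above. It also assumes the rejected call was never executed on the server, which the message suggests but which we have not verified.

```java
import java.rmi.RemoteException;
import java.rmi.ServerException;
import java.util.concurrent.Callable;

public class StoppedContainerRetry {

    /** True if any exception in the cause chain mentions the STOPPED container. */
    public static boolean isContainerStopped(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains("container is in STOPPED")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the call and retries it after a pause while it fails with the
     * STOPPED message, up to maxAttempts attempts in total. Any other
     * failure, or the failure of the last attempt, is rethrown unchanged.
     */
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long pauseMillis)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (RemoteException e) {
                if (!isContainerStopped(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(pauseMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated remote call (illustration only): the first two attempts are
        // rejected by the stopping instance, the third one succeeds.
        int[] attempts = {0};
        String result = callWithRetry(() -> {
            if (++attempts[0] < 3) {
                throw new ServerException(
                        "RemoteException occurred in server thread",
                        new RemoteException("Attempt to invoke when container is in STOPPED"));
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

This only hides the symptom on the client. The retries keep failing until the stopping instance is completely down, so the pause and the attempt limit have to cover the shutdown time.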

In the document "ORBD Architecture for S1AS8/EE"
https://glassfish-corba.dev.java.net/design/orbdArchitecture.html
in section 5.3 (Impact of Orderly Shutdown) we found a description of a different behavior.

This document states:
When a ServerInstance is going down, it also shuts down the ORB with
ORB.shutdown(true). This should block new incoming calls
(which then receive a COMPLETED_NO status so that a correct failover
can take place), and all ongoing calls should complete without any
exception. After the last ongoing call ends, the ORB really shuts
down.
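The behavior we expected from that description can be modeled in plain Java as follows. This is only a toy model of the idea, not GlassFish or ORB code: DrainingGate, NotStartedException, beginShutdown and awaitDrained are hypothetical names, and NotStartedException merely plays the role of the "not completed" status that tells a client it is safe to fail over.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;

public class DrainingGate {

    /** Hypothetical marker: the request was rejected before any work was done. */
    public static class NotStartedException extends Exception {
        public NotStartedException(String message) {
            super(message);
        }
    }

    private int inFlight = 0;
    private boolean shuttingDown = false;

    /** Runs the work unless shutdown has begun; tracks calls that are in progress. */
    public <T> T invoke(Callable<T> work) throws Exception {
        synchronized (this) {
            if (shuttingDown) {
                throw new NotStartedException("shutting down, call was not started");
            }
            inFlight++;
        }
        try {
            return work.call();
        } finally {
            synchronized (this) {
                inFlight--;
                notifyAll();
            }
        }
    }

    /** From now on, new calls are rejected; ongoing calls are left alone. */
    public synchronized void beginShutdown() {
        shuttingDown = true;
    }

    /** Blocks until every call that was already in progress has finished. */
    public synchronized void awaitDrained() throws InterruptedException {
        while (inFlight > 0) {
            wait();
        }
    }

    public static void main(String[] args) throws Exception {
        DrainingGate gate = new DrainingGate();
        CountDownLatch started = new CountDownLatch(1);
        Thread ongoing = new Thread(() -> {
            try {
                gate.invoke(() -> {
                    started.countDown();
                    Thread.sleep(200); // simulated long-running call
                    System.out.println("ongoing call finished normally");
                    return null;
                });
            } catch (Exception e) {
                System.out.println("unexpected failure: " + e);
            }
        });
        ongoing.start();
        started.await();

        gate.beginShutdown();
        try {
            gate.invoke(() -> "new call");
        } catch (NotStartedException e) {
            System.out.println("new call rejected: " + e.getMessage());
        }
        gate.awaitDrained();
        ongoing.join();
        System.out.println("shutdown complete");
    }
}
```

In this model a call that arrives during shutdown is rejected before it starts, so a client could safely send it to the other instance, while the call that was already running finishes without an exception.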

When we hard-kill the GlassFish process from the operating system (while the client is waiting), the failover works correctly without any failed call.

Is it possible to influence this behavior, or is this perhaps a bug in GlassFish's CORBA implementation? It seems that GlassFish does not convert the java.rmi.ServerException raised for the reason "server is in STOPPED" into a CORBA exception with the COMPLETED_NO flag, so the client's ORB has no chance to initiate a failover.

Thanks,
 Frank
[Message sent by forum member 'fmeili' (fmeili)]

http://forums.java.net/jive/thread.jspa?messageID=250808