Automatic Delegated Recovery Tests for XA TX with JDBC and JMS

Updated on Nov. 16, 2010 by Sherry.Hill@oracle.com

I. Test Case
II. Test Steps
III. Bug
IV.  Related server.log

I. Test Case
appserver-sqe/pe/transaction/recovery/cliapp3

II. Test Steps
1) set up sqe-cluster of 3 instances on 1 machine with delegated-recovery
2) use conventional broker clusters (service HA) with LOCAL mode.
3) setup jdbc and jms resource and deploy app (servlet, session bean and mdb).
4) app does xa transaction from a session bean with CMT
i.e. 2 PC to 2 databases and 1 message queue from one glassfish instance, e.g. instance2
--set FailureInducer.setWaitPoint(PREPARED)
--insert 3 rows to DB_A in table student
--insert 3 rows to DB_B in table student
--send 3 messages to Queue1,

4a) kill glassfish instance and check recovery
--During WaitPoint, kill instance2,
   (Checked that the broker 2 has 3 PREPARED message before instance2 is killed.
   and that broker 2 is still alive after instance 2 is killed.)
--Wait 120 seconds for recovery.
--After recovery, mdb receives messages from the Queue1 and inserts it to DB_A.
--Verify the results from database,
    verifyxa, DB_A: 3;  DB_B: 3;  MQ: 3|#]
--retsart the instance2
   where the broker 2 has 0 PREPARED message after instance2 is started.
http://sqe-hudson.us.oracle.com:8080/hudson/view/Sherry_v31/job/sherry-cliapp3-t1/

4b) kill broker and check recovery
--During WaitPoint, kill broker2.
   (Checked that the broker 2 has 3 PREPARED message before broker 2 is killed.
   and that instance 2 is still alive after the broker 2 is killed.)
--Wait for 120 seconds for recovery.
--After recovery, mdb receives messages from the Queue1 and inserts it to DB_A.
--Verify the results,
   verifyxa, DB_A: 3;  DB_B: 3;  MQ: 0|#]
--kill instance2
--Restart instance 2, which restarts the broker 2
   where the broker 2 has 3 PREPARED messages after instance2 is started.
http://sqe-hudson.us.oracle.com:8080/hudson/view/Sherry_v31/job/sherry-cliapp3-t2/

III. Bug
glassfish-3.1-b29.zip
No recovery for PREPARED data in queue with a broker failure
https://glassfish.dev.java.net/issues/show_bug.cgi?id=14742

IV.  Related server.log
Test 1: Recovery with a glassfish instance failure.
As a compersion, the recovery works fine with a glassfish instance failure.
1.1 The glassfish instance 2 is killed after its server.log shows FailPoint : [2].
[#|2010-11-16T18:22:37.988-0800|WARNING|glassfish3.1|
javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|
_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [2]|#]

1.2 Then, glassfish instance 1 log shows GMS messages about instance 2 failure
and other app output, e.g. mdb receives data from the queue,
[#|2010-11-16T18:23:41.268-0800|INFO|glassfish3.1|ShoalLogger|
_ThreadID=15;_ThreadName=Thread-1;|
GMS1019: The following member has failed: clustered_instance_2 of Group: sqe-cluster|#]
.......
[#|2010-11-16T18:25:10.969-0800|INFO|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|
mdb: txMsg=ABC48080MQ1|#]

1.3 Then, glassfish instance 3 log shows GMS messages about instance 2 failure similarly,
but also usage of instance 2 tx log and  recovery message, e.g.
[#|2010-11-16T18:23:41.154-0800|INFO|glassfish3.1|ShoalLogger|_ThreadID=15;_ThreadName=Thread-1;|
GMS1019: The following member has failed: clustered_instance_2 of Group: sqe-cluster|#]
......

[#|2010-11-16T18:24:41.790-0800|INFO|glassfish3.1|
javax.enterprise.system.core.transaction.com.sun.enterprise.transaction.jts.recovery|
_ThreadID=15;_ThreadName=Thread-1;|Checking Lock File /export/hudson/workspace/sherry-cliapp3-t1/glassfish3/glassfish/nodes/localhost/clustered_instance_2/
logs/clustered_instance_2/tx/recoverylockfile|#]
......
[#|2010-11-16T18:24:45.761-0800|INFO|glassfish3.1|
javax.enterprise.resource.resourceadapter.com.sun.enterprise.resource.recovery|
_ThreadID=15;_ThreadName=Thread-1;|
JMS resource recovery has created CFs = 3|#]

[#|2010-11-16T18:24:59.679-0800|INFO|
glassfish3.1|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.inbound|
_ThreadID=15;_ThreadName=Thread-1;|Recovery of Inbound Transactions started.|#]

Test 2  Recovery with a glassfish broker failure,
2.1 The broker 2 is killed after the glassfish instance 2 server.log shows FailPoint : [2].
[#|2010-11-16T18:20:45.161-0800|WARNING|glassfish3.1|
javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|
_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [2]|#]
.....
[#|2010-11-16T18:21:05.551-0800|INFO|glassfish3.1|javax.resourceadapter.mqjmsra.outbound.connection|
_ThreadID=15;_ThreadName=Thread-1;|MQJMSRA_CL1101: onEvent:Connection Event for mc=1 :xacId=:event:E206:[E206]: Connection closed.  The connection is closed due to a network problem, broker crashed, or internal error: localhost:48686(38773), com.sun.messaging.jms.notification.ConnectionClosedEvent[source=BrokerAddress=localhost:48686(38773), ConnectionID=2109703705338095360, ReconnectEnabled: true, IsConnectedToHABroker: false]|#]


2.2  Glassfish instance 1 log doesn't the usage of instance 2 tx log, but seems not recover.

2.3  Glassfish instance 3 log shows usage of instance 2 tx log.
[#|2010-11-16T18:25:55.184-0800|INFO|glassfish3.1|
javax.enterprise.system.core.transaction.com.sun.enterprise.transaction.jts.recovery|
_ThreadID=15;_ThreadName=Thread-1;|
Checking Lock File /export/hudson/workspace/sherry-cliapp3-t2/glassfish3/glassfish/nodes/localhost/clustered_instance_2/logs/clustered_instance_2/tx/recoverylockfile|#]

[#|2010-11-16T18:25:55.197-0800|INFO|glassfish3.1|
javax.enterprise.system.core.transaction.com.sun.enterprise.transaction.jts.recovery|
_ThreadID=15;_ThreadName=Thread-1;|Recovering? false|#]