users@glassfish.java.net

Re: Cluster instances stopping unexpectedly

From: VTR Ravi Kumar <vtrravikumar_at_gmail.com>
Date: Mon, 28 Dec 2009 07:53:23 -0800 (PST)

Hi all,

I am facing a similar issue in a similar clustered setup. I would be glad to
hear more on this.

Thanks

V.T.R. Ravi Kumar



narayana rallabandi wrote:
>
> Hi,
>
> We have the following cluster setup:
>
> 2 T2000 servers, each with 16 GB RAM, with a DAS and 2 NodeAgents, each
> NodeAgent running 4 instances.
>
> The jvm log is attached for reference:
>
> We are observing that one of the instances (1-8) fails at random.
> The following is the server.log from the time the instance was stopping:
>
> [#|2009-12-12T09:33:22.196+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;|GMS
> View Change Received for group tnhsCluster : Members in view for
> IN_DOUBT_EVENT(before change analysis) are :
> 1: MemberId: tnhsapp1Instance2, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A7874615032503301C1D714797C4ED7AF9FC8C8E378A3DB03
> 2: MemberId: tnhsapp1Instance4, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033079C026A9FB24D1B985A9ED0B16A60C303
> 3: MemberId: tnhsapp2Instance2, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033289E7EAD376A446D90E7B5E42DE644A703
> 4: MemberId: server, MemberType: SPECTATOR, Address:
> urn:jxta:uuid-59616261646162614A787461503250332DE658F932AB436995B78E0CB3E080DA03
> 5: MemberId: tnhsapp2Instance3, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A787461503250335F28AD852F3B4909ACECA1BB9A0A15FB03
> 6: MemberId: tnhsapp1Instance3, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A7874615032503373B4952071204FB4A46BDBC0BC6BE01003
> 7: MemberId: tnhsapp1Instance1, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033A06C37606F1A43A78527C6D55E118B7503
> 8: MemberId: tnhsapp2Instance4, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033CDDCEC6F754A4D529D77C9209E35893003
> 9: MemberId: tnhsapp2Instance1, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033EB9B700966564E8DB18C898BCDB34F7F03
> |#]
>
> [#|2009-12-12T09:33:44.765+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;IN_DOUBT_EVENT;server;tnhsCluster;|Analyzing
> new membership snapshot received as part of event : IN_DOUBT_EVENT for
> Member: server of Group: tnhsCluster|#]
>
> [#|2009-12-12T09:33:50.608+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;server;tnhsCluster;|Received
> FailureSuspectedEvent for Member: server of Group: tnhsCluster|#]
>
> [#|2009-12-12T09:34:01.622+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=48;_ThreadName=com.sun.enterprise.ee.cms.impl.common.Router
> Thread;server;|Sending FailureSuspectedSignals to registered Actions.
> Member:server...|#]
>
> [#|2009-12-12T09:36:12.106+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;|GMS
> View Change Received for group tnhsCluster : Members in view for
> IN_DOUBT_EVENT(before change analysis) are :
> 1: MemberId: tnhsapp1Instance2, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A7874615032503301C1D714797C4ED7AF9FC8C8E378A3DB03
> 2: MemberId: tnhsapp1Instance4, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033079C026A9FB24D1B985A9ED0B16A60C303
> 3: MemberId: tnhsapp2Instance2, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033289E7EAD376A446D90E7B5E42DE644A703
> 4: MemberId: server, MemberType: SPECTATOR, Address:
> urn:jxta:uuid-59616261646162614A787461503250332DE658F932AB436995B78E0CB3E080DA03
> 5: MemberId: tnhsapp2Instance3, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A787461503250335F28AD852F3B4909ACECA1BB9A0A15FB03
> 6: MemberId: tnhsapp1Instance3, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A7874615032503373B4952071204FB4A46BDBC0BC6BE01003
> 7: MemberId: tnhsapp1Instance1, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033A06C37606F1A43A78527C6D55E118B7503
> 8: MemberId: tnhsapp2Instance4, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033CDDCEC6F754A4D529D77C9209E35893003
> 9: MemberId: tnhsapp2Instance1, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033EB9B700966564E8DB18C898BCDB34F7F03
> |#]
>
> [#|2009-12-12T09:36:23.985+0530|WARNING|sun-appserver2.1|javax.management.remote.misc|_ThreadID=49;_ThreadName=Thread-34;_RequestID=622ff36a-9f66-4a07-85de-05df81331482;|Failed
> to restart: java.rmi.NoSuchObjectException: no such object in table|#]
>
> [#|2009-12-12T09:36:35.779+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;IN_DOUBT_EVENT;server;tnhsCluster;|Analyzing
> new membership snapshot received as part of event : IN_DOUBT_EVENT for
> Member: server of Group: tnhsCluster|#]
>
> [#|2009-12-12T09:37:05.099+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;server;tnhsCluster;|Received
> FailureSuspectedEvent for Member: server of Group: tnhsCluster|#]
>
> [#|2009-12-12T09:37:16.344+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;|viewQueue
> size before take 1 for group: tnhsCluster|#]
>
> [#|2009-12-12T09:37:16.345+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=48;_ThreadName=com.sun.enterprise.ee.cms.impl.common.Router
> Thread;server;|Sending FailureSuspectedSignals to registered Actions.
> Member:server...|#]
>
> [#|2009-12-12T09:37:47.573+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;|GMS
> View Change Received for group tnhsCluster : Members in view for
> IN_DOUBT_EVENT(before change analysis) are :
> 1: MemberId: tnhsapp1Instance2, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A7874615032503301C1D714797C4ED7AF9FC8C8E378A3DB03
> 2: MemberId: tnhsapp1Instance4, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033079C026A9FB24D1B985A9ED0B16A60C303
> 3: MemberId: tnhsapp2Instance2, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033289E7EAD376A446D90E7B5E42DE644A703
> 4: MemberId: tnhsapp2Instance3, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A787461503250335F28AD852F3B4909ACECA1BB9A0A15FB03
> 5: MemberId: tnhsapp1Instance3, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A7874615032503373B4952071204FB4A46BDBC0BC6BE01003
> 6: MemberId: tnhsapp1Instance1, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033A06C37606F1A43A78527C6D55E118B7503
> 7: MemberId: tnhsapp2Instance4, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033CDDCEC6F754A4D529D77C9209E35893003
> 8: MemberId: tnhsapp2Instance1, MemberType: CORE, Address:
> urn:jxta:uuid-59616261646162614A78746150325033EB9B700966564E8DB18C898BCDB34F7F03
> |#]
>
> [#|2009-12-12T09:38:17.545+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;IN_DOUBT_EVENT;tnhsapp1Instance2;tnhsCluster;|Analyzing
> new membership snapshot received as part of event : IN_DOUBT_EVENT for
> Member: tnhsapp1Instance2 of Group: tnhsCluster|#]
>
> [#|2009-12-12T09:38:26.060+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;tnhsapp1Instance2;tnhsCluster;|Received
> FailureSuspectedEvent for Member: tnhsapp1Instance2 of Group:
> tnhsCluster|#]
>
> [#|2009-12-12T09:38:29.250+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=48;_ThreadName=com.sun.enterprise.ee.cms.impl.common.Router
> Thread;tnhsapp1Instance2;|Sending FailureSuspectedSignals to registered
> Actions. Member:tnhsapp1Instance2...|#]
>
> [#|2009-12-12T09:40:47.304+0530|WARNING|sun-appserver2.1|javax.management.remote.misc|_ThreadID=49;_ThreadName=Thread-34;_RequestID=622ff36a-9f66-4a07-85de-05df81331482;|Failed
> to check connection: java.rmi.NoSuchObjectException: no such object in
> table|#]
>
> [#|2009-12-12T09:40:47.308+0530|WARNING|sun-appserver2.1|javax.management.remote.misc|_ThreadID=49;_ThreadName=Thread-34;_RequestID=622ff36a-9f66-4a07-85de-05df81331482;|stopping|#]
>
> [#|2009-12-12T09:58:00.261+0530|WARNING|sun-appserver2.1|javax.management.remote.misc|_ThreadID=50;_ThreadName=Thread-29;_RequestID=dc5a8704-7448-43e0-b5fc-d0879b30bd7a;|Failed
> to restart: java.rmi.NoSuchObjectException: no such object in table|#]
>
> [#|2009-12-12T10:01:16.085+0530|WARNING|sun-appserver2.1|javax.management.remote.misc|_ThreadID=50;_ThreadName=Thread-29;_RequestID=dc5a8704-7448-43e0-b5fc-d0879b30bd7a;|Failed
> to check connection: java.rmi.NoSuchObjectException: no such object in
> table|#]
>
> [#|2009-12-12T10:01:27.472+0530|WARNING|sun-appserver2.1|javax.management.remote.misc|_ThreadID=50;_ThreadName=Thread-29;_RequestID=dc5a8704-7448-43e0-b5fc-d0879b30bd7a;|stopping|#]
>
> Any help is greatly appreciated.
> Thanks,
> Narayanaa
>
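
To compare notes across setups, it may help to pull the suspected members and
their timestamps out of server.log mechanically. The following is only a rough
sketch: the record layout ([#|timestamp|level|product|logger|key=value;...|message|#])
is assumed from the excerpt quoted above, and the names parse_records and
suspected_members are made up for illustration, not part of GlassFish or Shoal.

```python
import re
from datetime import datetime

# Record layout assumed from the quoted excerpt:
# [#|timestamp|level|product|logger|key=value;...|message|#]
RECORD = re.compile(r"\[#\|(.*?)\|(.*?)\|(.*?)\|(.*?)\|(.*?)\|(.*?)\|#\]", re.DOTALL)
SUSPECT = re.compile(r"Received FailureSuspectedEvent for Member: (\S+) of Group: (\S+)")


def parse_records(text):
    """Return one dict per [#|...|#] record found in text."""
    # Drop mail quote markers; only needed for logs pasted from an email.
    text = re.sub(r"^> ?", "", text, flags=re.MULTILINE)
    records = []
    for ts, level, _product, logger, _pairs, message in RECORD.findall(text):
        records.append({
            "time": datetime.strptime(ts.strip(), "%Y-%m-%dT%H:%M:%S.%f%z"),
            "level": level,
            "logger": logger,
            # Collapse the line wrapping introduced by the mail client.
            "message": " ".join(message.split()),
        })
    return records


def suspected_members(records):
    """List (time, member, group) for every FailureSuspectedEvent record."""
    hits = []
    for rec in records:
        match = SUSPECT.search(rec["message"])
        if rec["logger"] == "ShoalLogger" and match:
            hits.append((rec["time"], match.group(1), match.group(2)))
    return hits


# Two records copied from the quoted log, with the quote markers removed.
SAMPLE = """
[#|2009-12-12T09:37:05.099+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;server;tnhsCluster;|Received
FailureSuspectedEvent for Member: server of Group: tnhsCluster|#]
[#|2009-12-12T09:38:26.060+0530|INFO|sun-appserver2.1|ShoalLogger|_ThreadID=12;_ThreadName=ViewWindowThread:tnhsCluster;tnhsapp1Instance2;tnhsCluster;|Received
FailureSuspectedEvent for Member: tnhsapp1Instance2 of Group:
tnhsCluster|#]
"""

for when, member, group in suspected_members(parse_records(SAMPLE)):
    print(when.isoformat(), member, group)
```

Run against the full server.log of each instance, this would show which member
was suspected first (here the DAS, "server", before tnhsapp1Instance2) and how
far apart the events are, which might make it easier to line them up with the
JVM log.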
