users@glassfish.java.net

Re: Need help for glassfish 3.1 clustering

From: Bobby Bissett <bobby.bissett_at_oracle.com>
Date: Wed, 22 Jun 2011 15:54:41 -0400

On 6/22/11 9:50 AM, forums_at_java.net wrote:
> Hi,
>
> I have a similar problem with GlassFish clustering.
>
> Sessions from my app are not replicated to another node, but in all
> other aspects the cluster works correctly.

Can you start a new thread with a subject that describes the issue?
Including the "asadmin get-health" output is very helpful, as it
shows that GMS is working and the problem is somewhere higher up in the
stack. Before you do, though, and my apologies if this is obvious -- did
you deploy your app with the "availabilityenabled" flag set to true? You
can also verify that HA is on for your app in the admin console.
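
For reference, a deployment with availability turned on might look like the sketch below. The cluster name is taken from the quoted output; the archive name myapp.war is only a placeholder, so substitute your own. The script prints the command and runs it only when asadmin is on the PATH.

```shell
# Placeholder archive name (an assumption); replace with your own application.
APP_WAR=myapp.war
# Cluster name as shown in the quoted list-instances output.
CLUSTER=portal-cluster

# Deploy to the cluster with high availability enabled for the application.
CMD="asadmin deploy --availabilityenabled=true --target $CLUSTER $APP_WAR"

# Show the command; run it only if asadmin is actually installed here.
echo "$CMD"
if command -v asadmin >/dev/null 2>&1; then
    # Left unquoted on purpose so the shell splits it into arguments
    # (fine here because none of the values contain spaces).
    $CMD
fi
```

For a web application, session replication also expects the app to be marked distributable in its web.xml, so that is worth checking too.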

One more comment below:
>
> *asadmin list-instances -l*
> NAME              HOST         PORT   PID   CLUSTER         STATE
> portal-instance1  localhost    24848  6343  portal-cluster  running
> portal-instance3  fss-portal3  24848  4628  portal-cluster  running
> Command list-instances executed successfully.
>
> *asadmin get-health portal-cluster*
> portal-instance1 started since Wed Jun 22 17:17:20 MSD 2011
> portal-instance3 started since Wed Jun 22 17:17:21 MSD 2011
> Command get-health executed successfully.
>
>
> When I run "asadmin validate-multicast --multicastaddress 224.0.0.251
> --multicastport 24567 --bindaddress 172.17.12.172 --timeout 45" on the
> second node, this error appears in the logs:
>
> [#|2011-06-22T17:35:16.680+0400|WARNING|glassfish3.1|ShoalLogger|_ThreadID=29;_ThreadName=Thread-1;|GMS1071: damaged multicast packet discarded
> java.lang.IllegalArgumentException: magic number is not valid
>     at com.sun.enterprise.mgmt.transport.MessageImpl.parseHeader(MessageImpl.java:172)
>     at com.sun.enterprise.mgmt.transport.BlockingIOMulticastSender$MessageProcessTask.run(BlockingIOMulticastSender.java:349)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)

That means you're receiving messages from some other running server
while the validate-multicast check is going on. You can see what those
messages are by rerunning validate-multicast with the --verbose flag.
You might want to check your server.log as well to see if there are
similar errors. If you have other processes using the same multicast
address and port, there's a chance they could be interfering with each
other, though that's not supposed to happen in GF 3.1.
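
As a sketch, the verbose run is just your earlier command with --verbose added. The address, port, bind address and timeout below are copied from your message; nothing else is assumed. The script prints the command and runs it only when asadmin is on the PATH.

```shell
# Values copied from the quoted validate-multicast command.
MCAST_ADDR=224.0.0.251
MCAST_PORT=24567
BIND_ADDR=172.17.12.172
TIMEOUT=45

# Same check as before, with --verbose so the received messages are shown.
CMD="asadmin validate-multicast --multicastaddress $MCAST_ADDR --multicastport $MCAST_PORT --bindaddress $BIND_ADDR --timeout $TIMEOUT --verbose"

# Show the command; run it only if asadmin is actually installed here.
echo "$CMD"
if command -v asadmin >/dev/null 2>&1; then
    # Left unquoted on purpose so the shell splits it into arguments.
    $CMD
fi
```

Run it on each node at roughly the same time, as with the non-verbose check, and compare what each side reports receiving.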

Cheers,
Bobby