users@glassfish.java.net

Re: Excessive logging from ShoalLogger up to 1GB per day

From: Joseph Fialli <joe.fialli_at_oracle.com>
Date: Wed, 18 May 2011 09:47:27 -0400

  This reply consists of multiple parts.

First a Workaround for the log files growing too big. Then a request for
you to submit a glassfish jira issue with
some of the log files attached to allow us to investigate this issue you
are observing. Lastly, some
system tuning of UDP buffer size may be necessary depending on your OS
and how heavy a load your system
is placing on the machines running the application servers.


**************

1. Workaround.

First of all, if you would like to disable these messages being
recorded, you can run the following commands.

// for each cluster, run the following command.
% asadmin set-log-levels --target <clustername> ShoalLogger.mcast=WARNING

// for DAS, run the following command.
% asadmin set-log-levels ShoalLogger.mcast=WARNING

This is merely a workaround to address the issue that you reported that
the logs are getting too large too quickly. There is still an
underlying issue
requiring investigation.


*************

2. Submitting a GlassFish jira issue and attaching log files.

However, there is a problem that I would like to investigate in your
server logs.
It would be helpful if you could file a bug at
http://java.net/jira/secure/CreateIssue.jspa?pid=10231&issuetype=1
with Component type of group-management-service. The subject line should
read "Too many failures to rebroadcast GMS broadcast notifications".
If you could attach all the server logs from
the DAS and one of the clusters (with the logs all corresponding to same
time period.) it would assist
us in diagnosing what is going wrong. There is an asadmin command to
collect log files:

// for DAS
% asasadmin collect-log-files

// for a cluster
% asadmin collect-log-files --target <clustername>

Attach these zip files to the newly created issue.
(if the files are too big, you can selectively choose files from DAS and
clustered instances logs.
Just please be sure to pick time periods across DAS and clustered
instances. Best to select from
start of DAS and cluster to when the issues start occuring.)

Each server log entry corresponds to an instance in the cluster
detecting that it has missed a GMS notification
and requesting it be rebroadcast. However, when the request for
rebroadcast arrives, the GMS notification
had already expired in the GMS master. There is an issue that requires
investigation that this is occuring so much.

************

3. Tuning UDP buffer sizing:

There does exist a chance that this is occurring due to insufficient UDP
buffersize configured.
Even if that is the case, it would not be expected to see so many
errors. The rebroadcast fail only
after 20 seconds after the initial broadcast. The DAS would need to be
running on a highly overloaded
machine for processing to fall that far behind. When a system falls
behind processing UDP
messages, they are just dropped. GlassFish GMS code is written to
compensate for the drops and typically
the rebroadcast will minimally. There are way too many in your system
based on what you are reporting.

It is OS specific to increase the UDP buffersize. We have found systems
such as Linux are configured with
too small a UDP buffersize by default (131071) for server like
processing and we have had to increase from the default to 500K
for our testing configurations to see no UDP drops. As with tuning UDP
buffersize in the OS, it is OS specific to
verify if there have been UDP message drops or not. You may want to
check your system for UDP drops to see if they
are excessive or not.

Additionally, running multiple instances on a single machine can result
in UDP drops if the machines are
too underpowered to process the UDP broadcast messages in a timely fashion.
The dropped UDP messages in GMS result in dropped notifications such as
the JOINING of a GlassFish instance to the glassfish cluster.
However, the very next message notification has the correct cluster view
with it and makes it so things will work sufficiently
despite the drop. However, code relying on the GMS notification may not
work correctly if it misses the notification of
an clustered instance joining or leaving. So it would still be helpful
to get the logs and figure out what the issue is
that is causing so many log messages.



Thanks for using Glassfish and thanks in advance for any assistance you
can provide us to investigate this issue.

-Joe Fialli




On 5/18/11 6:17 AM, forums_at_java.net wrote:
> We've recently installed 2 GlassFish 3.1 clusters. We have 2 physical
> machines, one running the DAS and 2 nodes, one for each cluster and
> the other
> machine also running 2 nodes, one for each cluster. At a random time
> after
> starting the cluster we notice the server log filling up quite rapidly at
> about 1GB of logs per day, allways containing the same message: unable to
> find message to resend broadcast event with masterViewId The same
> messages
> are repeated over and over again, we've restarted the clusters, after
> that
> these messages stopped but they reappeared some time after. Does
> anyone have
> a clue what's going on, the cluster seems to work just fine and none
> of the
> nodes failed to my knowledge.
>
>
>
> Thanks!
>
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 48 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=62;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 60 to
> member: g1 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 49 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 50 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=62;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 61 to
> member: g1 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 51 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 52 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=62;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 62 to
> member: g1 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 53 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 54 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=62;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 63 to
> member: g1 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 55 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 56 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=62;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 64 to
> member: g1 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 57 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 58 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 59 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 60 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 61 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 62 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.562+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 63 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:45.563+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=64;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 64 to
> member: g3 of group: parleys1|#]
> [#|2011-05-17T12:56:46.324+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=38;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 37 to
> member: devoxx1 of group: devoxx|#]
> [#|2011-05-17T12:56:46.324+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=38;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 38 to
> member: devoxx1 of group: devoxx|#]
> [#|2011-05-17T12:56:46.325+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=38;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 39 to
> member: devoxx1 of group: devoxx|#]
> [#|2011-05-17T12:56:46.325+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=38;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 40 to
> member: devoxx1 of group: devoxx|#]
> [#|2011-05-17T12:56:46.325+0100|INFO|glassfish3.1|ShoalLogger.mcast|_ThreadID=38;_ThreadName=Thread-1;|GMS1112:
>
> unable to find message to resend broadcast event with masterViewId: 41 to
> member: devoxx1 of group: devoxx|#]
>
> --
>
> [Message sent by forum member 'jayv']
>
> View Post: http://forums.java.net/node/803337
>
>