users@glassfish.java.net

Cluster SynchronizationException

From: <glassfish_at_javadesktop.org>
Date: Wed, 16 Jul 2008 18:14:23 PDT

Hello,

We have been evaluating GlassFish for one of our current projects and have been kicking
the tires quite extensively. We have worked through some minor annoyances, but in
general, so far so good.

The most appealing features of GlassFish for our current needs are centralized cluster
management and the automated configuration of the load balancer.

Unfortunately, we are experiencing problems when starting an instance in a cluster.

Our current test server configuration is:
CentOS 5.1 running in VMware Server (SELinux and IPv6 are off).
JDK 1.6u6
GlassFish v2ur2-b04-linux running as root
Network identity by DHCP.
System clocks are synchronized with the host using vmware-tools.

The DAS and Node Agents are running on separate servers on the same subnet.
There is nothing in between to interfere with traffic.

The problem:

When cold starting a Node Agent which is configured to automatically start all
instances, synchronization succeeds, albeit extremely slowly – up to 5 minutes.

When starting an instance manually, through the console or by command line,
synchronization blocks indefinitely.

On the Node Agent side we see the following in the instance log files after a period of time:

[#|2008-07-17T09:21:46.419+1000|INFO|sun-appserver9.1|javax.ee.enterprise.system.tools.synchronization|_ThreadID=11;
  _ThreadName=sync-1;|SYNC014: Unable to update synchronization timestamp.
com.sun.enterprise.ee.synchronization.SynchronizationException: Error while updating timestamp for synch request: ${com.sun.aas.instanceRoot}/applications/
        at com.sun.enterprise.ee.synchronization.TimestampCommand.execute(TimestampCommand.java:153)
        at com.sun.enterprise.ee.synchronization.BaseRequestMediator.commit(BaseRequestMediator.java:151)
        at com.sun.enterprise.ee.synchronization.BaseRequestMediator.run(BaseRequestMediator.java:126)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
        at com.sun.enterprise.ee.synchronization.TimestampCommand.execute(TimestampCommand.java:85)
        ... 3 more
|#]

There are no exceptions or warnings in the DAS logs.

The problem was also reported in this thread, with a similar configuration:
 http://forums.java.net/jive/thread.jspa?threadID=36920&tstart=15

In an attempt to troubleshoot this we have tried the following:
 1. Using GlassFish version v2.1-b24d-linux – same issue
 2. Monitoring traffic between servers – nothing revealing
 3. Following the tuning guidelines in this document:
       http://docs.sun.com/app/docs/doc/819-3681/abeir?a=view
 4. Following the numerous oblique references to loopback address issues.
 5. Lots of Googling and forum trawling.
 6. Help!
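
One way to test the loopback theory from point 4 is to ask the JVM what each
machine's own hostname resolves to. The sketch below is only an illustration:
the class name LocalHostCheck and its helper method are hypothetical and not
part of GlassFish, and it assumes the suspected cause is the hostname mapping
to a 127.x.x.x address (for example through an /etc/hosts entry on a
DHCP-configured machine).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of GlassFish: reports what the JVM thinks
// this machine's hostname resolves to.
class LocalHostCheck {

    // True if the given host name or IP literal resolves to a loopback
    // address; false if it resolves elsewhere or does not resolve at all.
    static boolean resolvesToLoopback(String hostOrIp) {
        try {
            return InetAddress.getByName(hostOrIp).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            InetAddress local = InetAddress.getLocalHost();
            System.out.println("hostname    : " + local.getHostName());
            System.out.println("resolves to : " + local.getHostAddress());
            if (local.isLoopbackAddress()) {
                System.out.println("WARNING: hostname maps to a loopback address; "
                        + "remote hosts cannot reach this machine by that address.");
            }
        } catch (UnknownHostException e) {
            System.out.println("hostname does not resolve: " + e.getMessage());
        }
    }
}
```

Compiling and running this on the DAS and on each Node Agent machine
(javac LocalHostCheck.java, then java LocalHostCheck) shows whether any of
them reports a loopback address for its own hostname. A warning on any
machine would support the loopback theory; no warning would not rule out
other causes.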

Could it be an issue with CentOS / VMware, the JDK, or some OS / GlassFish configuration switch?

Any ideas?

Thanks.
[Message sent by forum member 'gussie' (gussie)]

http://forums.java.net/jive/thread.jspa?messageID=287180