I'm having a related issue, but not exactly the same.
I have a test cluster that is configured for in-memory replication. I did not see this problem when I used file-based persistence, although file-based persistence on an NFS-mounted directory gave us different issues, which I may post in another thread.
domain: domain1
das: test
instance1: prweb2-test
instance2: prweb3-test
General Environment Info:
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_15-b04)
Java HotSpot(TM) Server VM (build 1.5.0_15-b04, mixed mode)
RedHat AS: Linux test.domain.com 2.6.9-55.0.2.ELsmp #1 SMP Tue Jun 12 17:58:20 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
asadmin version yields
Version = Sun Java System Application Server 9.1_02
Command version executed successfully.
I am running the open source version without HADB.
All 3 of the physical servers are on the same subnet.
In this scenario the simple clusterjsp application works correctly behind a simple round-robin load balancer without sticky sessions.
When I try to replicate this configuration to another set of servers/nodeagents the replication is not working even in the clusterjsp application.
This is what I see in the logs when I turn logging up to FINE:
[#|2008-06-06T03:25:21.869-0400|FINE|sun-appserver9.1|org.apache.jasper.servlet.JspServlet|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=org.apache.jasper.servlet.JspServlet;MethodName=service;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|JspEngine --> [/HaJsp.jsp] ServletPath: [/HaJsp.jsp] PathInfo: [null] RealPath: [/usr/local/glassfish/nodeagents/prweb2/www-hostb2/applications/j2ee-apps/clusterjsp/clusterjsp_war/HaJsp.jsp] RequestURI: [/clusterjsp/HaJsp.jsp] QueryString: [null]|#]
[#|2008-06-06T03:25:21.870-0400|FINE|sun-appserver9.1|org.apache.coyote.tomcat5.InputBuffer|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=org.apache.coyote.tomcat5.InputBuffer;MethodName=realReadBytes;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|realRead() R( /clusterjsp/HaJsp.jsp)|#]
[#|2008-06-06T03:25:21.871-0400|INFO|sun-appserver9.1|javax.enterprise.system.stream.out|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;|
Add to session: test = test|#]
[#|2008-06-06T03:25:21.872-0400|FINE|sun-appserver9.1|org.apache.coyote.tomcat5.OutputBuffer|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=org.apache.coyote.tomcat5.OutputBuffer;MethodName=setConverter;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|Got encoding: ISO-8859-1|#]
[#|2008-06-06T03:25:21.872-0400|FINE|sun-appserver9.1|javax.enterprise.system.container.web|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=com.sun.enterprise.ee.web.sessmgmt.JxtaBackingStoreImpl;MethodName=saveSimple;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|JxtaBackingStore>>saveSimple():id = cbe68c81c3f45d9ecab547876ef7unable to proceed due to health check|#]
[#|2008-06-06T03:25:21.873-0400|FINE|sun-appserver9.1|org.apache.coyote.tomcat5.OutputBuffer|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=org.apache.coyote.tomcat5.OutputBuffer;MethodName=realWriteBytes;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|realWrite(b, 0, 1633) org.apache.coyote.Response_at_1aa8d4|#]
[#|2008-06-06T03:25:21.873-0400|FINE|sun-appserver9.1|org.apache.coyote.tomcat5.OutputBuffer|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=org.apache.coyote.tomcat5.OutputBuffer;MethodName=recycle;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|recycle()|#]
The key problem on this 2nd domain seems to be this line:
[#|2008-06-06T03:25:21.872-0400|FINE|sun-appserver9.1|javax.enterprise.system.container.web|_ThreadID=34;_ThreadName=httpSSLWorkerThread-20080-0;ClassName=com.sun.enterprise.ee.web.sessmgmt.JxtaBackingStoreImpl;MethodName=saveSimple;_RequestID=4889d2d5-a4dd-458f-ac67-bb8f4afb3331;|JxtaBackingStore>>saveSimple():id = cbe68c81c3f45d9ecab547876ef7unable to proceed due to health check|#]
I don't have health check turned on, so I don't know why it should fail that check.
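For anyone who wants to pull only these records out of a FINE-level log, here is a minimal sketch. It assumes each record sits on one line in the `[#|timestamp|level|product|logger|name=value pairs|message|#]` layout shown above; records that span several lines would need to be joined first. The class name HealthCheckLogFilter and the summarize() helper are made up for illustration and are not part of GlassFish.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper, not part of GlassFish: filters log records that
// contain the "unable to proceed due to health check" message.
class HealthCheckLogFilter {

    static final String MARKER = "unable to proceed due to health check";

    /**
     * Returns "timestamp level message" for a single-line record that
     * contains MARKER, or null if the record does not match.
     */
    static String summarize(String record) {
        if (record == null || !record.contains(MARKER)) {
            return null;
        }
        // Fields: "[#", timestamp, level, product, logger, name=value pairs, message + "|#]"
        String[] fields = record.split("\\|", 7);
        if (fields.length < 7 || !"[#".equals(fields[0])) {
            return null;
        }
        String message = fields[6];
        if (message.endsWith("|#]")) {
            message = message.substring(0, message.length() - 3);
        }
        return fields[1] + " " + fields[2] + " " + message.trim();
    }

    // Reads log lines from standard input and prints a summary of each match.
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            String summary = summarize(line);
            if (summary != null) {
                System.out.println(summary);
            }
        }
    }
}
```

Feeding the instance log to this on standard input should show how often, and from what time onward, the save is being skipped.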
I looked at some source from diffs that I found through Google and found this interesting bit of code at:
http://72.14.205.104/search?q=cache:asgc77-i9jQJ:fisheye5.cenqua.com/browse/glassfish/appserv-core-ee/http-session-persistence/src/java/com/sun/enterprise/ee/web/sessmgmt/JxtaBackingStoreImpl.java%3Fr%3D1.19+unable+to+proceed+due+to+health+check+glassfish&hl=en&ct=clnk&cd=1&gl=us&client=firefox-a
which led me to the class and method:
ReplicationHealthChecker.isOkToProceed()
That in turn leads to this method, along with the interesting comments around the HealthCheckingEnabled flag check:
    /**
     * return boolean reflecting whether it is ok to proceed
     * with replication processing
     */
    public static boolean isOkToProceed() {
        /* FIXME we can put this back later
        if( !isHealthCheckingEnabled() ) {
            return true;
        }
        */
        //flushing time is treated specially
        if(isFlushing()) {
            return true;
        }
        //cluster stopping time is treated specially
        if(isClusterStopping()) {
            return false;
        }
        //in the midst of attempting connection
        if(isAttemptingConnection()) {
            return false;
        }
        boolean condition = isReplicationPartnerOperational()
            && isReplicationCommunicationOperational();
        if(condition) {
            return true;
        }
        synchronized(_monitor) {
            if(!condition) {
                reportError("ReplicationHealthChecker:health failure " +
                    " isReplicationPartnerOperational()=" + isReplicationPartnerOperational() +
                    " isReplicationCommunicationOperational()=" + isReplicationCommunicationOperational());
            }
        }
        return condition;
    }
This again came from the Google cache at:
http://72.14.205.104/search?q=cache:uzWGXB7H3K0J:fisheye5.cenqua.com/browse/~raw,r%3D1.9.2.13/glassfish/appserv-core-ee/http-session-persistence/src/java/com/sun/enterprise/ee/web/sessmgmt/ReplicationHealthChecker.java+ReplicationHealthChecker&hl=en&ct=clnk&cd=1&gl=us&client=firefox-a
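To make the order of the checks in the quoted method easier to reason about, here is a small standalone model of it. This is only a sketch: HealthCheckTrace, Outcome, and explain() are hypothetical names, and the boolean parameters stand in for the isFlushing(), isClusterStopping(), isAttemptingConnection(), isReplicationPartnerOperational(), and isReplicationCommunicationOperational() calls. There is deliberately no "health checking enabled" input, because the quoted revision has that early return commented out, so in that revision the flag would not bypass the remaining checks.

```java
// Illustrative model only: these names are hypothetical and not part of
// GlassFish. It mirrors the branch order of the quoted isOkToProceed().
class HealthCheckTrace {

    /** The branch that decided the result, and whether replication may proceed. */
    enum Outcome {
        OK_FLUSHING(true),
        BLOCKED_CLUSTER_STOPPING(false),
        BLOCKED_ATTEMPTING_CONNECTION(false),
        OK_PARTNER_AND_COMMUNICATION_UP(true),
        BLOCKED_PARTNER_NOT_OPERATIONAL(false),
        BLOCKED_COMMUNICATION_NOT_OPERATIONAL(false);

        final boolean okToProceed;

        Outcome(boolean okToProceed) {
            this.okToProceed = okToProceed;
        }
    }

    static Outcome explain(boolean flushing,
                           boolean clusterStopping,
                           boolean attemptingConnection,
                           boolean partnerOperational,
                           boolean communicationOperational) {
        // Flushing is checked first and always allows the save.
        if (flushing) {
            return Outcome.OK_FLUSHING;
        }
        // Cluster shutdown blocks replication.
        if (clusterStopping) {
            return Outcome.BLOCKED_CLUSTER_STOPPING;
        }
        // A connection attempt in progress also blocks replication.
        if (attemptingConnection) {
            return Outcome.BLOCKED_ATTEMPTING_CONNECTION;
        }
        // Otherwise both the partner and the communication channel must be up.
        if (partnerOperational && communicationOperational) {
            return Outcome.OK_PARTNER_AND_COMMUNICATION_UP;
        }
        // The partner flag is evaluated first, as in the original && expression.
        return partnerOperational
                ? Outcome.BLOCKED_COMMUNICATION_NOT_OPERATIONAL
                : Outcome.BLOCKED_PARTNER_NOT_OPERATIONAL;
    }

    public static void main(String[] args) {
        // Example: partner and channel look fine, but a connection attempt is in progress.
        Outcome o = explain(false, false, true, true, true);
        System.out.println(o + " okToProceed=" + o.okToProceed);
    }
}
```

Under this reading, three different states can each produce a false result: the cluster stopping, a connection attempt still in progress, or the replication partner or channel not being operational. Only the last of these goes through reportError() in the quoted code.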
Any help with what could cause the health check to fail in a memory-based replication scenario would be greatly appreciated.
I am continuing to rebuild clusters, trying to make them behave like my single working cluster.
Thanks
[Message sent by forum member 'awizardly' (awizardly)]
http://forums.java.net/jive/thread.jspa?messageID=278790