users@glassfish.java.net

How does SynchronizationServlet work?

From: <glassfish_at_javadesktop.org>
Date: Wed, 06 May 2009 10:38:01 PDT

Here's what I noticed after a period of time using my Glassfish app cluster:

- I have 2 servers, appserv01 and appserv02.
- DAS running on appserv01 together with a node agent app01-agent
- Another node agent app02-agent running on appserv02
- I have a cluster my-app-cluster running with app01-instance on appserv01 and app02-instance on appserv02

Typically, when I start up my cluster, the SynchronizationServlet will kick in and be done in about 10 seconds. Quick enough.

Let's say I deploy an EAR file, about 20 MB in size, while the cluster is active. The deployment is successful and I can use the application immediately (as expected). However, if I stop and restart the cluster at this point (after deploying the EAR file), SynchronizationServlet takes about 120 seconds to synchronize. Not ideal but still acceptable.

If I deploy 5 EAR files, each about 20 MB in size, the story takes a really nasty turn. If I thought that the sync would take 120 s * 5 = 600 s (10 minutes), I'd be sadly mistaken. The time taken to synchronize seems to increase exponentially. After deploying 5 EAR files, I wait for half an hour, then get a marshalling error signalling that the call has basically timed out, and my startup fails.
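To put numbers on the gap, here is a small sketch of the arithmetic. It uses only the figures quoted above (about 120 seconds for one EAR, 5 EARs, half an hour before the timeout). The function and variable names are made up for illustration, and the linear model is just the naive expectation, not a description of how SynchronizationServlet actually behaves.

```python
# Figures taken from the post: ~120 s to sync after one 20 MB EAR, 5 EARs deployed.
SYNC_SECONDS_PER_EAR = 120
EAR_COUNT = 5
OBSERVED_MINUTES_BEFORE_TIMEOUT = 30  # "half an hour", after which the call times out


def linear_estimate_seconds(ear_count, seconds_per_ear=SYNC_SECONDS_PER_EAR):
    """Naive expectation: sync time grows in proportion to the number of EARs."""
    return ear_count * seconds_per_ear


expected_minutes = linear_estimate_seconds(EAR_COUNT) / 60
slowdown = OBSERVED_MINUTES_BEFORE_TIMEOUT / expected_minutes

print(f"linear estimate: {expected_minutes:.0f} min")
print(f"observed: at least {OBSERVED_MINUTES_BEFORE_TIMEOUT} min "
      f"({slowdown:.0f}x the estimate, and the sync still had not finished)")
```

Since the sync never completes, the factor printed here is only a lower bound on the real slowdown.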

If I grab the process of the Sync servlet and kill it, GlassFish proceeds to start up, but whatever was being sync'd at the time is broken and I have to redeploy it.

So now I'm stuck in a situation where I have to deploy an EAR file, restart the entire cluster, deploy another EAR file, restart, ad infinitum, until I get everything deployed. Needless to say, it takes a heckuva long time to do a large deployment.
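For reference, the deploy-then-restart cycle described above could be scripted roughly like this. It is only a sketch: it builds and prints the asadmin command lines rather than running them, the EAR file names are hypothetical, and connection options (admin user, password file, host, port) are left out and would need to be added for a real setup.

```python
CLUSTER = "my-app-cluster"  # cluster name from the post


def one_at_a_time_plan(ear_files, cluster=CLUSTER):
    """Return the asadmin commands for: deploy one EAR, restart the cluster, repeat."""
    plan = []
    for ear in ear_files:
        plan.append(["asadmin", "deploy", "--target", cluster, ear])
        plan.append(["asadmin", "stop-cluster", cluster])
        plan.append(["asadmin", "start-cluster", cluster])
    return plan


# Hypothetical EAR names, for illustration only.
ears = ["app1.ear", "app2.ear"]
for command in one_at_a_time_plan(ears):
    print(" ".join(command))  # print only; nothing is executed here
```

Each EAR costs one full cluster restart in this scheme, which is why a large deployment takes so long.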

Would appreciate it if anyone could tell me if this is something that other people experience as well and if this is a bug of some kind. I'm pretty sure that this is not ideal if it's prevalent amongst most cluster users.

This happens in a cluster configuration for both V2UR2 and V2.1. Does not seem to happen in non-cluster configurations.

Thanks,
Wong
[Message sent by forum member 'lilwong' (lilwong)]

http://forums.java.net/jive/thread.jspa?messageID=345256