users@glassfish.java.net

Re: problem with running ejb client

From: <glassfish_at_javadesktop.org>
Date: Sat, 17 Jan 2009 08:37:31 PST

Let me clarify.

There are two parts to load-balancing with multiple endpoints: I'll call them "bootstrapping" and "normal." (I'm sure the ORB folks have a better name than "normal" but you'll see what I mean in a moment.) The endpoints system property defines the list of endpoints a client uses for bootstrapping. Separately, each server-side ORB maintains a list of all the currently-active server-side ORBs in the cluster. As instances start and stop, all the ORBs keep track of all the other active ORBs.

Bootstrapping: When an execution of a client first needs to connect to a server (to locate a remote EJB for example) it "bootstraps" into the set of all currently-active server ORBs. The client-side ORB does this by contacting one of the server-side ORBs listed in the endpoints system property. It tries to connect to the first one in the property value and, if that fails, it tries the next endpoint in the list, continuing until it either exhausts the list or manages to contact an ORB on a server. This ends the bootstrapping phase. From now on, for the life of that execution of the client, the client does not use the endpoints system property.
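
As a rough illustration of that try-each-endpoint-in-order loop, here is a minimal sketch in Java. It is not the ORB's actual code: the class name BootstrapSketch, the bootstrap method, and the canConnect predicate are all made up for this example, and a real client would attempt a network connection where the predicate is called.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Illustrative sketch only; this is not the GlassFish ORB's implementation.
final class BootstrapSketch {

    // Returns the first endpoint, in list order, that accepts a connection.
    // An empty result means the whole list was exhausted without success.
    // canConnect is a hypothetical stand-in for a real connection attempt.
    static Optional<String> bootstrap(List<String> endpoints,
                                      Predicate<String> canConnect) {
        for (String endpoint : endpoints) {
            if (canConnect.test(endpoint)) {
                return Optional.of(endpoint);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The property lists host1 and host3; host1 happens to be down.
        List<String> configured = List.of("host1", "host3");
        Predicate<String> isUp = host -> !host.equals("host1");
        System.out.println(bootstrap(configured, isUp)); // prints Optional[host3]
    }
}
```

The only point of the sketch is the ordering: endpoints are tried strictly in the order they appear, and the loop stops at the first success.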

Normal: Once the client ORB has contacted one of the server ORBs listed in the system property, that server ORB sends to the client a list of all the active server ORBs in the cluster at that moment. Note that the contents of this currently-active-ORBs list can be different from the list you defined in the system property.

In fact, every time a server ORB sends a message to a client ORB it tells the client ORB if any server ORBs have entered or left the cluster (due to starting or stopping for example). Essentially, at all times once the client bootstrapping has finished, the client ORB has a list of all currently-active server-side ORBs in the cluster. If the client has been communicating with one of those ORBs but can no longer do so, it will fail over to one of the other active ORBs it was told about.
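
To make the client-side bookkeeping concrete, here is a small Java sketch of a client that keeps a list of active ORBs, updates it from each server reply, and fails over when its current ORB becomes unreachable. Everything in it is hypothetical (the class ClientMembershipView and its methods do not exist in GlassFish), and picking the first remaining entry on failover is an assumption of this sketch; the real ORB may choose differently.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the client ORB's view of the cluster.
final class ClientMembershipView {
    private final Set<String> active = new LinkedHashSet<>();
    private String current;

    // bootstrapped: the ORB the client reached during bootstrapping.
    // initialActive: the active-ORB list that ORB sent back.
    ClientMembershipView(String bootstrapped, Collection<String> initialActive) {
        active.addAll(initialActive);
        active.add(bootstrapped);
        current = bootstrapped;
    }

    // Apply the membership changes carried on a server reply.
    void onReply(Collection<String> joined, Collection<String> left) {
        active.addAll(joined);
        active.removeAll(left);
    }

    // Called when a request to the current ORB fails: drop it and pick
    // another known-active ORB (the first remaining one, by assumption).
    Optional<String> failOver() {
        active.remove(current);
        current = active.isEmpty() ? null : active.iterator().next();
        return Optional.ofNullable(current);
    }

    String current() { return current; }

    Set<String> active() { return new LinkedHashSet<>(active); }

    public static void main(String[] args) {
        ClientMembershipView view =
            new ClientMembershipView("host3", java.util.List.of("host2", "host3"));
        System.out.println(view.failOver()); // prints Optional[host2]
    }
}
```

Note that the endpoints property plays no part here: after bootstrapping, the client works only from the list the servers have given it.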

Remember that once the client has bootstrapped, it will use the list of all active ORBs in the cluster, which might be different from the endpoint property value it used for bootstrapping.

Here's a concrete example. Suppose you have a cluster containing instances host1, host2, and host3. You have defined the endpoints property for a client to refer to hosts 1 and 3. But at the moment host 1 is down; 2 and 3 are up.

When the client first needs to contact a server ORB it tries host1 (because that is the first one in its version of the endpoints system property). It cannot contact host1 because host1 is down, so it tries to bootstrap using host3 because that is the next host listed in the system property. This attempt succeeds. As part of the conversation between the client ORB and the ORB on host3 the client ORB learns that host2's ORB is also active. So the client now knows that the cluster contains active nodes host2 and host3.

So at first the client sends messages to host3 (since that was the one it connected to for bootstrapping). Suppose host3 now fails. The next time the client tries to send a message to host3 that attempt fails. The client-side ORB knows from the earlier conversation with host3's ORB that host2 is active. So the client-side ORB will start sending its traffic to host2.

Now suppose host1 comes on-line. As part of the next conversation between the client and host2 (to which it is now sending requests) the client will learn that host1 is now available as well as host2. If later on during the client's execution host2 fails, the client can fail over to host1.
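
The whole host1/host2/host3 walk-through can be replayed as a toy simulation. The Java sketch below is a simplification under stated assumptions: the class ClusterScenarioSketch is invented for this example, the "up" set stands in for the real cluster, and a successful request simply copies the server's current membership into the client's "known" list rather than exchanging entered/left notices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy replay of the host1/host2/host3 example; not real ORB behavior.
final class ClusterScenarioSketch {
    final Set<String> up = new TreeSet<>();     // server side: instances that are up
    final Set<String> known = new TreeSet<>();  // client side: last list received
    String target;                              // instance the client is using

    // Bootstrapping: walk the configured endpoints in order.
    void bootstrap(List<String> endpoints) {
        for (String endpoint : endpoints) {
            if (up.contains(endpoint)) {
                target = endpoint;
                refreshMembership();
                return;
            }
        }
        throw new IllegalStateException("no bootstrap endpoint reachable");
    }

    // Every successful exchange refreshes the client's view of the cluster.
    private void refreshMembership() {
        known.clear();
        known.addAll(up);
    }

    // Send one request; fail over within the known list if the target is down.
    String send() {
        if (!up.contains(target)) {
            known.remove(target);
            target = null;
            for (String candidate : new ArrayList<>(known)) {
                if (up.contains(candidate)) { target = candidate; break; }
                known.remove(candidate);
            }
            if (target == null) throw new IllegalStateException("no active ORB known");
        }
        refreshMembership();
        return target;
    }

    // Replays the scenario and records which host served each request.
    static List<String> replay() {
        ClusterScenarioSketch s = new ClusterScenarioSketch();
        List<String> served = new ArrayList<>();
        s.up.addAll(List.of("host2", "host3"));   // host1 is down at first
        s.bootstrap(List.of("host1", "host3"));   // property lists hosts 1 and 3
        served.add(s.send());                     // served by host3
        s.up.remove("host3");                     // host3 fails
        served.add(s.send());                     // fails over to host2
        s.up.add("host1");                        // host1 comes on-line
        served.add(s.send());                     // host2 again; client learns of host1
        s.up.remove("host2");                     // host2 fails
        served.add(s.send());                     // fails over to host1
        return served;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [host3, host2, host2, host1]
    }
}
```

The client ends up on host1 only because the third request told it host1 had joined; host1 being listed in the endpoints property had nothing to do with it.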

So, as an administrator, you would want to make sure that the endpoints property definition used by a client for bootstrapping refers to instances that are likely to be up most of the time. Or at least you want it to be very unlikely that ALL of those instances would be down at the same time. If other instances are added to or removed from the cluster, or started or stopped, you do not need to update the property definition. As long as the client can bootstrap into one of the instances listed in the endpoints property, it will have access to all of the active instances whether they are listed in the property or not.

Does this help?

- Tim
[Message sent by forum member 'tjquinn' (tjquinn)]

http://forums.java.net/jive/thread.jspa?messageID=326646