Re: IIOP load balancing change in r1.15.2.1

From: Mark Williams <Mark.Williams_at_Sun.COM>
Date: Thu, 27 Aug 2009 09:44:53 -0700

> Hi Mark,

> Thank you for your detailed reply.

>> Not true. There was never any supported feature to randomize or load-balance

>> IIOP targets in GFv2. The feature is FOLB (Fail Over Load Balancing), meaning

>> you always connect to the first target passed in. Later, if that fails, try the

>> next in the list, and so on until you get a new connection or fail.

> I am not sure what the function is named, but I was referring to the

> following function:

> Sun Java System Application Server 9.1 High Availability Administration

> Guide

> http://dlc.sun.com/pdf/819-3679/819-3679.pdf

> Chapter 11: RMI-IIOP Load Balancing and Failover

> When a client performs a JNDI lookup for an object, the Naming Service

> creates a InitialContext (IC) object associated with a particular server

> instance.

> [...]

> For that InitialContext object, the load balancer directs lookup

> requests and other InitialContext operations to the first endpoint on

> the randomized list.

> [...]

> Each time the client subsequently creates a new InitialContext object,

> the endpoint list is rotated so that a different IIOP endpoint is used

> for InitialContext operations.

> So if you pass a list of endpoints, for each bean (i.e. separate JNDI

> look-ups, not per SLSB request as with the new feature Ken referred to)

> you should get round-robin load balancing.

> But this was not officially supported?

No. If you pass in a list of endpoints, the server will always try to connect to the

first server on the list. The "randomization" only happens if that

first endpoint fails after it has been in use. So there is no load balancing

happening, ever, *except* when there is a failure (FOLB, which means

Fail Over Load Balancing).

So if you pass in 3 endpoints "A", "B" and "C" in your application to make the

initial connection, "A" is tried first. If that is obtained (99.9% of the time...)

it is used from that point on for all operations until we are through with it

or until it fails. If we pass in "A", "B" and "C" for that initial connection

and "A" fails to connect, we then try "B". If that connects, we use it instead.

If "B" fails to connect, we try "C", and so on, all in FIFO (first in, first out) order for that

initial connection where we pass in the list of endpoints to use.
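
To make the initial-connection behavior concrete, here is a minimal sketch in Java. It is not GlassFish code: the class name InitialConnectSketch, the method connectInOrder and the canConnect predicate are all hypothetical names made up for illustration, and the predicate merely stands in for "this endpoint accepted a connection".

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in GlassFish.
final class InitialConnectSketch {

    /**
     * Tries the endpoints strictly in the order the application supplied
     * them and returns the first one that accepts a connection.
     * Returns an empty Optional if every endpoint refuses.
     */
    static Optional<String> connectInOrder(List<String> endpoints,
                                           Predicate<String> canConnect) {
        for (String endpoint : endpoints) {
            if (canConnect.test(endpoint)) {
                return Optional.of(endpoint); // no shuffling, no rotation
            }
        }
        return Optional.empty();
    }
}
```

With endpoints "A", "B", "C" and all three reachable, this always picks "A". "B" is chosen only when "A" refuses the connection, and "C" only when both "A" and "B" do.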

Now say we have a connection "A" and it's working.

Let's also say we are doing lots of stuff on it, and it takes 10 minutes of

constant use reading and writing across the connection. During that time

"B" goes down, another server is started with a new endpoint, "D", and "B"

comes back up or whatever. Then "A" goes down, and we are still trying

to use it. Now we are in a FOLB situation. (We have a good connection

"A" that we have been using, but it is now bad). So *only now* does the

system try to get a list of the possible failover endpoints and it creates

a "randomized list" (actually pseudo random but that's another issue)

where it combines the list of endpoints the application initially

passed in ("A", "B" and "C", remember?) with a list of all available

endpoints that are currently online and able to handle the requests

("B", "C" and, let's say, "D" from the new server that joined). The system

will try one of these and try to establish a new connection. If it works,

it will return and keep using that new connection. (Probably "C" in this case).

If "C" doesn't connect, it will go to the next endpoint in the "randomized list"

of FOLB targets and keep trying them one by one until it returns a successful

connection, or they all fail and an exception will be thrown.

The full text of the Chapter 11 passage, quoted below, is misleading.

There is no load balancing performed, at all, until there is a failover

situation. (emphasis added)

When a client performs a JNDI lookup for an object, the Naming Service creates a

InitialContext (IC) object associated with a particular server instance. From then on, all

lookup requests made using that IC object are sent to the same server instance. All EJBHome

objects looked up with that InitialContext are hosted on the same target server. Any bean

references obtained henceforth are also created on the same target host. This effectively

provides load balancing, since all clients randomize the list of live target servers when creating

InitialContext objects. If the target server instance goes down, the lookup or EJB method

invocation will failover to another server instance.

This should be more clear in stating that a randomized list of live target

servers is only generated in a failover situation when the Initial Context

object's target fails and FOLB is enacted to find another target candidate.

The Initial Context will *always* try to connect to the first endpoint your

application passes into it. If that (initial) connection fails, it tries the second

endpoint you passed in, then the third, and so on, before it fails. No

"randomization" happens unless there is a FOLB situation.

(And even then it's not truly "random", but a combination of the original

list of endpoints provided and a list of all the active endpoints in the

cluster.)

>> Yes there is.

>> The functionality you thought was IIOP load balancing did not work as you

>> thought, and was in fact FOLB (Fail Over Load Balancing) in which the server

>> should use FIFO logic to do initial connection and then (only in a fail over

>> situation) try the other connections passed in (IN FIFO order).

> From the details you included I understand the problem when using LB

> when a server fails: As each InitialContext creation gets a full list of

> (original) targets passed, it could try to access a server again that

> has failed. FOLB relies on the pattern where the endpoints are passed

> through only once at the beginning, after which GF maintains a list of

> working targets (expanding as servers are added, decreasing as servers

> are removed or have failed). Is that summary correct?

> Thanks!

> Dies

Pretty close. Actually, the pseudo-random list that gets generated

is just a combination of the list of endpoints the application passed in

(in reverse order) and the list of active endpoints in the cluster

at that moment in time. The list is "stupid" in that it doesn't remove

any of the original endpoints; it just relies on the assumption that sorting them in

reverse order and combining them with all the live endpoints will

yield a good connection in at most two tries. "Failure" happens after

all of the endpoints in the list have been tried and have failed.

So if B,C & D are all active, and we passed in "A,B,C", the list

might look something like "C,B,A,B,C,D" or something (I forget

exactly). So C would be tried first, and if it connects, we are done.

If C fails, B is next, then A (which we know is bad but try anyway,

I said it was a stupid list...) and then we try B and C again and finally D.

If all of those fail, we wail and throw an exception.
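
As a rough sketch of the failover list described above (again in Java, and again not GlassFish code): the class FailoverListSketch and both method names are hypothetical. The exact ordering is an assumption taken from the "C,B,A,B,C,D" example, which is given from memory, so the real list may be assembled differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in GlassFish.
final class FailoverListSketch {

    /**
     * Assumed ordering: the application's original endpoints in reverse,
     * followed by the endpoints currently live in the cluster.
     * Duplicates and known-bad endpoints are deliberately NOT removed.
     */
    static List<String> buildFailoverList(List<String> original, List<String> live) {
        List<String> candidates = new ArrayList<>(original);
        Collections.reverse(candidates);
        candidates.addAll(live);
        return candidates;
    }

    /**
     * Walks the candidate list in order and returns the first endpoint that
     * connects; throws once every candidate has been tried and has failed.
     */
    static String failOver(List<String> candidates, Predicate<String> canConnect) {
        for (String endpoint : candidates) {
            if (canConnect.test(endpoint)) {
                return endpoint;
            }
        }
        throw new IllegalStateException("All failover endpoints failed: " + candidates);
    }
}
```

Passing the original list "A,B,C" and the live list "B,C,D" to buildFailoverList gives "C,B,A,B,C,D". If "A" is the only endpoint that is down, failOver returns "C" on the first try.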