users@glassfish.java.net

Re: Load balancer

From: Pankaj Jairath <Pankaj.Jairath_at_Sun.COM>
Date: Thu, 08 Jan 2009 14:05:46 +0530

This is quite an extensive topic; I would address it as follows -

1. Load balancing is the mechanism for achieving availability of
deployed services. There are various failure scenarios that can result
in a node/instance no longer being available to service requests. The
Load Balancer detects this and fails the load over to another healthy
instance/node in the cluster deployment.

2. It also achieves the goal of scalability of deployments. As
enterprises grow they face increased load, and handling that load
requires adding nodes to the cluster deployment. The Load Balancer
helps here as well, in real time, through dynamic reconfiguration
support. Usually, before going live, the cluster deployment needs to be
tuned (number of nodes to configure, heap sizes, resources, OS
parameters at the network level, etc.) so that it can cater to the
expected load - for example, the expected peak traffic. This is an
extensive exercise whose aim is a live deployment capable of handling
the load. With this kind of modeling undertaken before going live,
conventional load balancing policies like round-robin serve the
purpose.
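
For illustration, scaling out at run time might look roughly like the
sketch below using the asadmin CLI. The node agent "na1", cluster
"cluster1", instance name and the export step are placeholders from my
memory of the v2 tooling - please verify the exact commands and options
against your release's documentation:

   # create and start an additional instance in the cluster
   asadmin create-instance --user admin --nodeagent na1 --cluster cluster1 instance3
   asadmin start-instance --user admin instance3

   # regenerate loadbalancer.xml so the new instance participates in
   # active load balancing (command name/options may vary by release)
   asadmin export-http-lb-config --user admin --config cluster1-http-lb-config loadbalancer.xml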

3. The Load Balancer has been designed to be a lightweight component -
the performance cost of placing it in the request processing path ought
to be minimal. It monitors the health of a node/instance through the
lens of "response time": there is a window that is considered healthy
for processing requests. Multiple factors can cause this window to
expand at run time - a node/instance running low on memory, contention
for system resources, and so on. The Load Balancer detects this change
in the window and marks such an instance as unhealthy, thereby taking
it out of active load balancing. It periodically monitors the set of
unhealthy instances and brings them back into active load balancing
once they respond within the healthy window again. The point here is
that even though an instance may be up and able to service client
requests, it is the time it takes to process a request that is
critical - one cannot have clients waiting beyond a reasonable period,
which effectively increases the load further as some clients retry
their requests, and those retries simply queue up until rejection
starts at the system level. Thus the Load Balancer focuses on the net
outcome when an instance becomes unhealthy.
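
Conceptually - and this is only an illustration of the idea above, not
the actual plugin implementation - the bookkeeping looks roughly like
this in Java:

   // Illustrative sketch only; not the Load Balancer plugin code.
   import java.util.Set;
   import java.util.concurrent.CopyOnWriteArraySet;
   import java.util.function.ToLongFunction;

   class HealthTracker {
       private final long healthyWindowMillis;
       private final Set<String> unhealthy = new CopyOnWriteArraySet<String>();

       HealthTracker(long healthyWindowMillis) {
           this.healthyWindowMillis = healthyWindowMillis;
       }

       // called for every proxied request with its observed response time
       void recordResponse(String instance, long responseTimeMillis) {
           if (responseTimeMillis > healthyWindowMillis) {
               unhealthy.add(instance);        // drop out of active load balancing
           }
       }

       // called periodically; "ping" is a hypothetical health check call
       void sweep(ToLongFunction<String> ping) {
           for (String instance : unhealthy) {
               if (ping.applyAsLong(instance) <= healthyWindowMillis) {
                   unhealthy.remove(instance); // rejoin active load balancing
               }
           }
       }

       boolean isHealthy(String instance) {
           return !unhealthy.contains(instance);
       }
   }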

Towards this, the Load Balancer provides two health check mechanisms -
a. Passive health check
In this mode the Load Balancer piggybacks the health check semantics on
the regular request, using the request-timeout configured at the
web-module level. On failure it returns an error page, which can be a
customized error URL - for example, an error message telling the user
to retry after a certain amount of time.

b. Active health check
In this mode the Load Balancer can be configured to send a simple HTTP
request to an application deployed on the cluster. This application is
known as the health check application, and the user can write it in a
manner that suffices to notify the Load Balancer whether the instance
is within the healthy window. By default this URL is "/" for an
out-of-the-box Load Balancer deployment. The Load Balancer associates
three parameters with such a custom application to identify a
node/instance as unhealthy (see the configuration sketch after this
list):
   - request-timeout
     The window within which a response should be received.
   - response status code
     A status of 500 or above raises an error condition and signals the
Load Balancer to remove the instance from active load balancing.
   - retry
     A mechanism to retry the health check (ping) request before
confirming the failure/error scenario.
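
For reference, these knobs typically show up in the LB plugin's
loadbalancer.xml along the lines of the sketch below. The element and
attribute names are as I recall them from the sun-loadbalancer DTD, and
the host/port/context-root values are placeholders - please check
against the DTD shipped with your plugin version:

   <cluster name="cluster1">
     <instance name="instance1" enabled="true"
               disable-timeout-in-minutes="60"
               listeners="http://host1:38080"/>
     <web-module context-root="/myapp" enabled="true"
                 disable-timeout-in-minutes="30"
                 error-url="/retry.html"/>
     <!-- active health check: ping URL, how often, how long to wait -->
     <health-checker url="/hc" interval-in-seconds="30"
                     timeout-in-seconds="10"/>
   </cluster>

The retry count is, as far as I recall, exposed as a property at the
loadbalancer level (e.g. number-healthcheck-retries) in more recent
plugin versions.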

4. One can use the self-management feature of GlassFish while writing
such a health check application.
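
As an illustration, a minimal health check application could be a
servlet like the sketch below. The free-heap threshold is just one
example of a criterion such an application might use (it could equally
consult self-management/monitoring data), and the class is hypothetical
rather than anything shipped with GlassFish:

   // Hypothetical minimal health check servlet (Servlet 2.x API).
   import java.io.IOException;
   import javax.servlet.ServletException;
   import javax.servlet.http.HttpServlet;
   import javax.servlet.http.HttpServletRequest;
   import javax.servlet.http.HttpServletResponse;

   public class HealthCheckServlet extends HttpServlet {
       protected void doGet(HttpServletRequest req, HttpServletResponse resp)
               throws ServletException, IOException {
           Runtime rt = Runtime.getRuntime();
           long freeBytes = rt.freeMemory() + (rt.maxMemory() - rt.totalMemory());
           // Example criterion: report unhealthy when free heap is below ~10 MB.
           if (freeBytes < 10L * 1024 * 1024) {
               // A status of 500 or above tells the Load Balancer to take
               // this instance out of active load balancing.
               resp.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
           } else {
               resp.setStatus(HttpServletResponse.SC_OK);
           }
       }
   }

Mapped at, say, /hc in web.xml, this would be the URL configured on the
health-checker element shown earlier.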

5. In conjunction with point 3, and so that GlassFish can announce that
it can no longer accept new load, one should configure the HTTP service
connector parameters while tuning the cluster deployment - the number
of pipelined requests, the number of queued connections, etc. Please
have a look at the "request-processing" and "keep-alive" elements of
http-service. As part of the modeling exercise this keeps the system
from being overloaded; these settings can also be adapted dynamically.
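
For example, these settings live under the http-service element in
domain.xml and can be changed at run time with "asadmin set". The
dotted names and values below are a sketch from memory (the config name
"cluster1-config" is a placeholder), so please verify them against your
installation:

   # request-processing: worker threads and per-request timeout
   asadmin set cluster1-config.http-service.request-processing.thread-count=128
   asadmin set cluster1-config.http-service.request-processing.request-timeout-in-seconds=30

   # keep-alive: how many persistent connections to hold, and for how long
   asadmin set cluster1-config.http-service.keep-alive.max-connections=256
   asadmin set cluster1-config.http-service.keep-alive.timeout-in-seconds=30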

regards
Pankaj




glassfish_at_javadesktop.org wrote:
> So what is the purpose of a load balancer if it cannot detect the load?
>
> Am I misunderstanding the reason the LB exists? Is it only for application server availability recovery?
>
> I've seen cases in BEA where one server in the cluster became slower because of high processing volume, yet the round-robin based LB would keep delivering new requests to that very server, making recovery slower and sometimes bringing the server down.
>
> The weighted description makes sense in case one server is better, hardware-wise, than the other. But that is not typical for big business.
> [Message sent by forum member 'theqmaster' (theqmaster)]
>
> http://forums.java.net/jive/thread.jspa?messageID=324618
>