
Sun ONE Application Server 7, Enterprise Edition System Deployment Guide

Chapter 2
Planning your Environment

Planning your environment is one of the first phases of deployment. In this phase, you first decide your performance and availability goals. You then make decisions about the hardware, network, and storage requirements accordingly.

The main objective of this phase is to determine the environment that best meets your business requirements.

This chapter contains the following sections:

  • Introducing HADB
  • Establishing Performance Goals
  • Planning Network Configuration to Meet Your Performance Goals
  • Planning Availability
  • Planning Failover Capacity


Introducing HADB

Sun™ Open Net Environment (Sun ONE) Application Server 7, Enterprise Edition supports persistence of HTTP sessions. The high-availability database (HADB) bundled with the Sun ONE Application Server works as the persistence store to provide high availability for applications.

An HADB node consists of a set of processes, a dedicated area of shared memory, and one or more secondary storage devices. It is used for storing and updating session data. There are two types of nodes:

  • Active nodes, which store the session data and service requests.
  • Spare nodes, which initially contain no data but take over for an active node if it fails.

Each active node must have a mirror node; therefore, active nodes occur in pairs. In addition, to maximize HADB availability, you should include two spare nodes for each pair so that if an active node fails, a spare node can take over while the failed node is repaired.

Data Redundancy Units

HADB nodes are organized into two Data Redundancy Units (DRUs) that mirror each other. Each DRU consists of half of the active nodes and spare nodes, and contains one complete copy of the data. To ensure fault tolerance, the computers that support one DRU must be completely self-supported with respect to power (use of uninterruptible power supplies is needed), processing units, and storage. If a power failure occurs in one DRU, the nodes in the other DRU can continue servicing requests until the power returns.

Machines that host HADB nodes must be added in pairs, with one machine in each DRU.

Spare Nodes

A spare node is an additional HADB node connected to a DRU. A spare node initially does not contain data, but constantly monitors for failure of active nodes in the DRU. If an active node fails, the spare node takes over the functions of the failed node while the failed node is being repaired.

If you do not use spare nodes and a machine goes down, the mirror node on the other machine takes over and continues service. However, capacity is reduced because that machine is no longer available to service requests. Depending on the impact of losing one machine, the system may become effectively unavailable because the remaining machines are overloaded. Also, until you repair the machine, the system runs without fault tolerance because there is no mirror node to replicate the data. For high availability, minimize the time during which the system functions with only a single node in a mirror pair.

Spare nodes allow a single machine to go down while maintaining the overall level of service. Spare nodes are not mandatory, but you should use them if you require high availability. Allocate one spare machine for each DRU, so that if one of your machines goes down, your system can continue without becoming overloaded. A spare node also makes it easier to perform planned maintenance on the machines that host the active nodes.

As a general rule, you should have a spare machine with enough application server instances and HADB nodes to replace any machine that becomes unavailable.

For example, if you have a co-located deployment with four Sun Fire™ V480 servers, where each server has one application server instance and two HADB data nodes, you should allocate two more servers as spare machines (one machine per DRU). Each spare machine should run one application server instance and two spare HADB nodes.

As another example, suppose you have a separate tier deployment where the HADB tier has two Sun Fire™ 280R servers, each running two HADB data nodes. If you want to maintain full capacity for this system even if a machine becomes unavailable, you should have one spare machine for the instances tier and one spare machine for the HADB tier. The spare machine for the instances tier should have as many instances as the other machines in the instances tier, and the spare machine for the HADB tier should have as many HADB nodes as the other machines in the HADB tier.

More information about the co-located and the separate tier deployment topologies is provided in Chapter 3, "Selecting a Topology".

Sample HADB Architecture

The following figure shows the architecture of a database with four active nodes and two spare nodes. Nodes 0 and 1 are a mirror node pair, as are nodes 2 and 3.

Figure 2-1  HADB Architecture with Four Active Nodes and Two Spare Nodes



Establishing Performance Goals

As explained in Chapter 1, "Overview of Deployment," one of your main goals in deployment is to maximize performance. This principally translates into maximizing throughput and reducing response time.

Beyond these basic goals, you should establish specific goals by determining the following information:

  • Maximum number of concurrent users
  • Average think time
  • Average response time
  • Requests per minute

These factors are interrelated. If you know any three of these four pieces of information, you can always calculate the fourth.

Some of the metrics described in this chapter can be calculated using a remote browser emulator (RBE) tool, or web site performance and benchmarking software, that simulates your enterprise's web application activity. Typically, RBE and benchmarking products generate concurrent HTTP requests and then report back the response time and the number of requests per minute. You can then use these figures to calculate server activity.

The results of the calculations described in the following sections are not absolute. Treat them as reference points to work against as you fine-tune the performance of the Sun ONE Application Server.

Estimating Throughput

Throughput has different implications for application server instances and for the HADB.

A good measure of the throughput for application server instances is the requests per minute processed.

Similarly, a good measure of the throughput for the HADB is the requests per minute processed by the HADB and the session size per request. The session size per request is important because the amount of session data stored varies from request to request.

Estimating Load on Application Server Instances

To estimate the load on application server instances, consider the following factors:

  • Maximum number of concurrent users
  • Think time
  • Average response time
  • Requests per minute

Maximum Number of Concurrent Users

A user runs a process (for example through a web-browser) that periodically sends requests from a client machine to the Sun ONE Application Server 7, Enterprise Edition. When estimating the number of concurrent users, include all users currently active. A user is considered active as long as the session that user is running is active (for example, the session has not expired or been terminated).

A user is concurrent for as long as the user is on the system as a running process submitting requests, receiving results of requests from the server, and viewing the results of the requests.

Eventually, as the number of concurrent users submitting requests increases, requests processed per minute begin to decline (and the response time begins to increase). The following diagram illustrates this situation.

Figure 2-2  Performance Pattern with Increasing Number of Users


You want to identify the point at which adding more concurrent users reduces the number of requests that can be processed per minute, as this indicates when performance starts to degrade.

Think Time

A user does not submit requests continuously. A user submits a request, the server receives the request, processes it and then returns a result, at which point the user spends some time analyzing the result before submitting a new request. This time spent reviewing the result of a request is called think time.

Determining typical think time length is important because you can use it to calculate more accurately the number of requests per minute and the number of concurrent users your system can support. Essentially, when a user is on the system but not submitting a request, a gap opens for another user to submit a request without altering system load. This also means that you can support more concurrent users.

Average Response Time

Response time refers to the amount of time it takes for request results to be returned to the user. The response time is affected by a number of factors including network bandwidth, number of users, number and type of requests submitted, and average think time. In this section, response time refers to mean, or average, response time. Each type of request has its own minimal response time, but when evaluating system performance, analyze based on the average response time of all requests.

The faster the response time, the more requests per minute are being processed. However, as the number of users on your system increases, response time starts to increase as well, even though the number of requests per minute declines, as the following diagram illustrates:

Figure 2-3  Response Time with Increasing Number of Users


A system performance graph like this indicates that after a certain point (point A in this diagram), requests per minute are inversely proportional to response time: the sharper the decline in requests per minute, the steeper the increase in response time (represented by the dotted line arrow).

In the diagram, point A represents peak load, the point at which requests per minute start to decline. Prior to this, response time calculations are not necessarily accurate because they are not using peak numbers in the formula. After this point, because of the inversely proportional relationship between requests per minute and response time, you can more accurately calculate response time using the two criteria: maximum number of users and requests per minute.

To determine response time at peak load, use the following formula:

Response time = (concurrent users / requests per second) - think time in seconds

To obtain an accurate response time result, you must always include think time in the equation.

For example, if the following conditions exist:

The response time is, therefore, 2 seconds.
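The formula can be sketched directly as code. The input values below (5,000 concurrent users, 1,000 requests per second, 3 seconds of average think time) are assumptions chosen for illustration; they are one combination consistent with a 2-second response time:

```python
# Response time at peak load:
#   response time = (concurrent users / requests per second) - think time
def response_time(concurrent_users, requests_per_second, think_time_s):
    """Average response time in seconds at peak load."""
    return concurrent_users / requests_per_second - think_time_s

# Assumed inputs: 5,000 concurrent users, 1,000 requests per second,
# and an average think time of 3 seconds.
print(response_time(5000, 1000, 3))  # 2.0 seconds
```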

After you have calculated your system’s response time, particularly at peak load, decide what is an acceptable response time for your enterprise. Response time, along with throughput, is one of the factors critical to Sun ONE Application Server performance and improving it should be one of your goals. If there is a response time beyond which you do not want to wait, and performance is such that you get response times over that level, then work towards improving your response time or redefine your response time threshold.

Requests Per Minute

If you know the number of concurrent users at any given time, the response time of their requests and the average user think time at that time, you can determine requests per minute. Typically, you start by knowing how many concurrent users are on your system.

For example, after running some web site performance software, suppose you have calculated that the average number of concurrent users submitting requests on your online banking web site is 3,000. This is dependent on the number of users who have signed up to be members of your online bank, their banking transaction behavior, the times of the day or week they choose to submit requests, and so on. Therefore, knowing this information means you can use the requests per minute formula described in this section to calculate how many requests per minute your system can handle for this user base. Then, because requests per minute and response time become inversely proportional at peak load, decide if fewer requests per minute are acceptable as a trade-off for better response time, or alternately, if a slower response time is acceptable as a trade-off for more requests per minute.

Essentially, you start experimenting with the requests per minute and response time thresholds that you will accept as a starting point for fine-tuning system performance. Then you decide which areas of your system you want to adjust.

The requests per second formula is as follows:

requests per second = concurrent users / (response time in seconds + think time in seconds)

For example, if the following conditions exist:

Therefore, the number of requests per second is 700 and the number of requests per minute is 42000.
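As a sketch of this calculation, the input values below (2,800 concurrent users, a 1-second average response time, 3 seconds of average think time) are assumptions chosen to be consistent with the stated result of 700 requests per second:

```python
# requests per second = concurrent users / (response time + think time)
def requests_per_second(concurrent_users, response_time_s, think_time_s):
    """Requests per second the system handles at a given load."""
    return concurrent_users / (response_time_s + think_time_s)

# Assumed inputs: 2,800 concurrent users, 1-second response time,
# and 3 seconds of average think time.
rps = requests_per_second(2800, 1, 3)
print(rps)       # 700.0 requests per second
print(rps * 60)  # 42000.0 requests per minute
```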

Estimating Load on HADB

To calculate the load on HADB, consider the following factors:

  • The number of requests per minute received by the HADB
  • The session size per request

The session persistence settings that you specify affect the load to the HADB. For more information on configuring session persistence, see the Sun ONE Application Server Administrator’s Guide.

Understanding Session Persistence

As an application session proceeds, there is often data that is part of the session that is not stored in a traditional database. An example of such data is the content of your shopping cart. Sun ONE Application Server provides the capability to save, or persist, this session state in a repository, so that if an application server instance experiences a failure, the session state can be recovered and the session can continue without information loss.

Apart from the number of requests being served by the Sun ONE Application Server, the session persistence configuration settings that you specify affect the number of requests received per minute by the HADB and the session information in each request.

The persistence settings can be defined for each application server instance. However, all application server instances in a particular cluster must have the same persistence configuration. If you are using more than one cluster, it is not necessary for all clusters to have the same persistence configuration settings.


Note

You should use the cladmin command to ensure that the session persistence settings are homogeneous for all instances in the cluster. For more information on using the cladmin command, see the Sun ONE Application Server Administrator’s Guide.


Number of Requests per Minute Received by the HADB

The number of requests per minute received by the HADB depends on the persistence frequency, which is how often session information is stored in the HADB. This is defined through the persistence frequency settings. The persistence frequency options are:

  • web-method: Session information is stored in the HADB at the end of each web request.
  • time-based: Session information is stored in the HADB at a specified time interval.

The web-method persistence frequency provides the highest guarantee that the persisted session information is up to date, but it increases the traffic to the HADB because session information is stored in the HADB at the end of every web request. The time-based frequency reduces the traffic to the HADB, but it provides a weaker guarantee than the web-method frequency that the session information will be up to date.

Comparison of Persistence Frequency Options

The following table summarizes the advantages and disadvantages of the persistence frequency options.

Table 2-1  Comparison of Persistence Frequency Options

web-method
  Advantage: Guarantees that the most up-to-date session information is available.
  Disadvantage: Potentially increased response time and reduced throughput.

time-based
  Advantage: Better response time and potentially better throughput.
  Disadvantage: Less guarantee than the web-method persistence frequency that the most updated session information is available after the failure of an application server instance.

Session Size Per Request

The session size per request depends on how much session information is stored in the session.


Note

To improve overall performance, reduce the amount of information in the session as much as possible.


You can further fine-tune the session size per request through the persistence scope settings. You can choose from the following options for persistence scope:

  • session: The entire session state is stored with every request.
  • modified-session: The entire session state is stored if it has been modified during the request.
  • modified-attribute: Only the modified session attributes are stored.

Comparison of Persistence Scope Options

The following table summarizes the advantages and disadvantages of the persistence scope options.

Table 2-2  Comparison of Persistence Scope Options

modified-session
  Advantage: Provides improved response time for requests that do not modify session state.
  Disadvantage: Your application must call the setAttribute method (if the attribute was changed) or the removeAttribute method (if the attribute was removed) on the session during the execution of a web method (typically doGet or doPost).

session
  Advantage: No constraint on applications.
  Disadvantage: Potentially poorer throughput and response time as compared to the modified-session and the modified-attribute options.

modified-attribute
  Advantage: Better throughput and response time for requests in which the percentage of session state modified is low.
  Disadvantages:
  1. As the percentage of session state that is modified for a given request approaches 60%, throughput and response time degrade. In such cases, performance is worse than with the session or modified-session persistence scope because of the overhead of splitting the attributes into separate records.
  2. Your application must be written to meet the following constraints:
     • Call setAttribute or removeAttribute every time the session state is modified.
     • Make sure there are no cross-references between attributes.
     • Distribute the session state across multiple attributes, or at least between a read-only attribute and a modifiable attribute.

Designing for Peak Load or Steady State Load

In a typical deployment, there is a difference between steady state and peak workloads.

If you design for peak load, you must deploy a system that can sustain the expected maximum load of users and requests without a degradation in response time. This means that your system can handle the extreme cases of expected system load. If the difference between peak load and steady state load is substantial, designing for peak loads may mean that you are spending on resources that will be idle for a significant amount of time.

If you design for steady state load, you do not have to deploy a system with all the resources required to handle the expected peak load. However, a system designed to support only the steady state load has slower response times when peak load occurs.

Frequency and Duration of Peak Load

A factor that may affect whether you design for peak load or for steady state is how often your system is expected to handle the peak load. If peak load occurs several times a day or even per week, you may decide that this is frequent enough to warrant expanding capacity to handle it. If the system operates at steady state 90 percent of the time, and at peak only 10 percent of the time, you may prefer a system designed around steady state load, accepting that 10 percent of the time your system's response time will be slower. Decide whether the frequency or duration of peak operation justifies adding resources to the system to handle peak load.

Design Decisions to Make

Depending on the load on the application server instances, the load on the HADB, and the failover requirements, you should make the following design decisions at this stage:

  • Number of application server instances needed
  • Number of HADB nodes required
  • HADB storage capacity required
  • Whether you want to design for peak load or for steady state load

Number of Application Server Instances Needed

To determine the number of application server instances needed, evaluate your environment on the basis of the factors explained in "Estimating Load on Application Server Instances". Each application server instance can use more than one CPU and should have at least one CPU allocated to it.

Number of HADB Nodes Required

As a general guideline, you should plan to have one HADB node for each Central Processing Unit (CPU) in your system. For example, use two HADB nodes for a machine that has two CPUs.


Note

If you have more than one HADB node per machine (for example if you are using bigger machines), then you must ensure that there is enough redundancy and scalability on the machines—for example multiple uninterruptible power supplies, independent disk controllers, and so on.


HADB Storage Capacity Required

The HADB provides near-linear scaling by adding more nodes until network capacity is exceeded. Each node must be configured with storage devices on a dedicated disk or disks. All nodes must have equal space allocated on the storage devices. Make sure that the storage devices are allocated on local disks.

Suppose the expected session data is X MB. HADB replicates the data on mirror nodes, and, therefore, 2X MB of storage is needed.

Further, HADB uses indexes to enable fast access to data. An additional 2X MB (for both nodes together) is required for indexes, assuming a filling rate of less than 100%. This means that a total storage capacity of 4X MB is required.

Therefore, the expected storage capacity needed by the HADB is four times the expected data volume.

If the system has to be designed for future expansion (by adding bigger disks to nodes or adding new nodes to the system) without loss of data from HADB, the expected storage capacity is eight times the expected data volume. This is because for online upgrades, you might want to refragment the data after adding new nodes. In that case, you will need a similar amount (4X) of additional space on the data devices, thus increasing the total storage capacity to 8X.

Additionally, HADB uses disk space for internal purposes as follows:

  • Temporary storage of the log buffer during high load: four times the log buffer size (4 * logBufferSize).
  • Internal housekeeping: approximately 1% of the device size.

For more information, see the Sun ONE Application Server Administrator’s Guide and the Sun ONE Application Server Performance Tuning Guide.

The following table summarizes the HADB storage space required for session data of X MB.

Table 2-3  HADB Storage Space Requirement for Session Data of X MB

Addition or removal of HADB nodes while online is not required:
  (4X MB) + (4 * logBufferSize) + (1% of device size)

Addition or removal of HADB nodes while online is required:
  (8X MB) + (4 * logBufferSize) + (1% of device size)
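The Table 2-3 formulas can be sketched as a calculation. The function name and the sample inputs below (1,000 MB of session data, a 48 MB log buffer, a 10,000 MB device) are illustrative assumptions, not values from this guide:

```python
# HADB device space per Table 2-3. All sizes in MB. The sample inputs
# are hypothetical; substitute your own measured session data volume.
def hadb_storage_mb(session_data_mb, log_buffer_mb, device_size_mb,
                    online_resize=False):
    """Space needed on the HADB data devices.

    online_resize -- True if nodes will be added or removed while
    online, which doubles the data portion from 4X to 8X.
    """
    data = (8 if online_resize else 4) * session_data_mb
    return data + 4 * log_buffer_mb + 0.01 * device_size_mb

print(hadb_storage_mb(1000, 48, 10000))                      # 4292.0 MB
print(hadb_storage_mb(1000, 48, 10000, online_resize=True))  # 8292.0 MB
```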

If the HADB runs out of device space, error codes 4593 or 4592 are returned and error messages are written to the history file(s). For more information on these messages, see the Sun ONE Application Server Error Message Reference.

If the HADB runs out of device space, any client requests to insert or update data are not accepted. Delete operations are, however, accepted.

Setting Data Device Size

To set the size of the data device(s) of HADB, use the following command:

hadbm set TotalDatadeviceSizePerNode

The hadbm command restarts all the nodes, one by one, for the change to take effect. For more information, see the Sun ONE Application Server Administrator’s Guide.


Note

The current version of the hadbm command does not add data devices to a running HADB database.


Whether You Want to Design for Peak Load or for Steady State Load

While making a decision about whether to design for peak or steady state load, refer to the information provided in "Designing for Peak Load or Steady State Load".


Planning Network Configuration to Meet Your Performance Goals

When planning how to integrate Sun ONE Application Server into your network for optimal performance, estimate the bandwidth requirements and plan your network so that it can meet your performance goals.

Estimating Bandwidth Requirements

As you decide on the desired size and bandwidth of your network, first determine your network traffic and identify its peak. See whether there is a particular hour, day of the week, or day of the month in which overall volume peaks, and then determine the duration of that peak.

At all times consult network experts at your site about the size and type of all network components you are considering adding.

Peak Load Times

During peak load times, the number of packets that are being sent is at its highest level. In general, if you design for peak load, scale your system with the goal of handling 100 percent of peak volume. Bear in mind, however, that any network behaves unpredictably and that despite your scaling efforts, 100 percent of peak volume might not always be handled.

For example, assume that at peak load, five percent of your users occasionally do not have immediate Internet access when accessing applications deployed on Sun ONE Application Server 7, Enterprise Edition. Of that five percent, determine how many users retry access after the first attempt. Again, not all of those users may get through, and of that unsuccessful portion, another percentage will retry. As a result, the peak appears longer because peak use is spread out over time as users continue to attempt access.

To ensure optimal access during the peak, start by verifying that your Internet service provider (ISP) has a backbone network connection that can reach an Internet hub without degradation.

Calculating Bandwidth Required

Using the calculations you made in "Establishing Performance Goals", determine the additional bandwidth required for the Sun ONE Application Server deployment at your site.

Depending on your method of access (T-1 lines, ISDN, and so on), you can calculate the amount of increased bandwidth required to handle your estimated load. For example, suppose your site uses T-1 or higher-speed T-3 links for Internet access. Given their bandwidth, you can estimate how many lines you need on your network based on the average number of requests generated per second at your site and the maximum peak load. You can calculate these figures using web site analysis and monitoring tools.

A single T-1 line can handle 1.544 Mbps, so a network of four T-1 lines can handle approximately 6 Mbps (6,176,000 bits per second). Assuming that the average HTML page sent back to a client is 30 kilobytes (KB), this network of four T-1 lines can handle the following traffic per second:

6,176,000 bits / 8 bits = 772,000 bytes per second

772,000 bytes per second / 30 KB = approximately 25 concurrent client requests for pages per second.

At traffic of 25 pages per second, this system can handle 90,000 pages per hour (25 x 60 seconds x 60 minutes), and therefore 2,160,000 pages per day maximum, assuming an even load throughout the day. If the maximum peak load is greater than this, you will want to increase the bandwidth accordingly.
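The arithmetic above can be sketched as follows (the constants simply restate the example's assumptions: four T-1 lines at 1.544 Mbps each and 30 KB pages):

```python
T1_BPS = 1_544_000        # capacity of one T-1 line, bits per second
PAGE_BYTES = 30 * 1024    # average HTML page: 30 KB

bytes_per_sec = 4 * T1_BPS / 8               # 772,000 bytes per second
pages_per_sec = bytes_per_sec / PAGE_BYTES
print(round(pages_per_sec))                  # ~25 pages per second
print(round(pages_per_sec) * 60 * 60)        # 90,000 pages per hour
print(round(pages_per_sec) * 60 * 60 * 24)   # 2,160,000 pages per day
```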

Peak Load

Having an even load throughout the day is probably not realistic. You need to determine when peak load occurs, how long it lasts, and what percentage of the total load it is. For example, in the scenario outlined here, if peak load lasts for two hours and takes up 30 percent of the total load of 2,160,000 pages, this means that 648,000 pages must be carried over the T-1 lines during two hours of the day.

Therefore, to accommodate peak load during those two hours, you should increase the number of T-1 lines according to the following calculations:

648,000 pages / 120 minutes = 5,400 pages per minute

5,400 pages per minute / 60 seconds = 90 pages per second

If four lines can handle 25 pages per second, then handling approximately four times that many pages requires four times that many lines, in this case 16 lines. These 16 lines handle the realistic maximum of a 30 percent peak load; the remaining 70 percent of the load can easily be carried by the same lines throughout the rest of the day.
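The peak-load scaling can be sketched as follows, mirroring the example's own rounding (treating each block of four T-1 lines as handling 25 pages per second):

```python
import math

T1_BPS = 1_544_000        # capacity of one T-1 line, bits per second
PAGE_BYTES = 30 * 1024    # average HTML page: 30 KB

# Baseline from the example: four T-1 lines handle ~25 pages per second
baseline_pps = round(4 * T1_BPS / 8 / PAGE_BYTES)   # 25

# Peak load: 648,000 pages over a two-hour window
peak_pages_per_sec = 648_000 / (120 * 60)           # 90 pages per second

# Scale the four-line baseline up to the peak rate
scale = math.ceil(peak_pages_per_sec / baseline_pps)
print(scale * 4)   # 16 T-1 lines
```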

Subnets

If you use the separate tier topology, in which the application server instances and HADB nodes are on separate tiers, you can achieve a performance improvement by keeping HADB nodes on a separate subnet. This is because HADB uses the User Datagram Protocol (UDP), and using a separate subnet reduces the UDP traffic on the machines outside of the subnet.

Network Cards

For greater bandwidth and optimal network performance, use at least 100 Mbps Ethernet cards or, preferably, 1 Gbps Ethernet cards between servers hosting the Sun ONE Application Server and the HADB nodes and also among any other resources such as HADB databases that are hosted on other machines.

Network Settings for HADB

Here are requirements and suggestions for HADB to work optimally in the network:


Planning Availability

Availability must be planned according to the application and customer requirements.

There are two ways to achieve high availability:

Adding Redundancy to the System

One way to achieve high availability is to add redundancy to the system—redundancy of hardware and software. When one unit fails, the redundant unit takes over. This is also referred to as fault tolerance.

In general, to achieve high availability, you should determine and remove every possible point of failure in the system.

Failure Classes

The level of redundancy is determined by the failure classes (types of failure) that the system must tolerate. Some examples of failure classes are system process failures, machine failures, power supply failures, disk failures, network failures, building fires, and natural catastrophes.

Duplicated system processes tolerate single system process failures. Duplicated machines tolerate single machine failures. Attaching the mirrored (paired) machines to different power supplies tolerates single power failures. Keeping the mirrored machines in separate buildings tolerates a single building fire, and keeping them in separate geographical locations tolerates natural catastrophes such as an earthquake in one location.

When planning availability, you should determine the failure classes covered by the system.

Using Redundancy Units to Improve Availability

To improve availability, HADB nodes are always used in Data Redundancy Units (DRUs) as explained in "Data Redundancy Units".

Using Spare Nodes to Improve Fault Tolerance

The use of spare nodes as explained in "Spare Nodes" improves fault tolerance. Although spare nodes are not mandatory, their use is recommended for maximum availability.

Planning Failover Capacity

Failover capacity planning means deciding how many additional servers and processes to add to your Sun ONE Application Server installation so that in the event of a server or process failure, the system can seamlessly recover data and continue processing. If your system gets overloaded, a process or server failure might result, causing response time degradation or even total loss of service. Preparing for such an occurrence is critical to a successful deployment.

To maintain capacity, especially at peak loads, add spare machines running application server instances to your Sun ONE Application Server installation. For example, assume you have a system with two machines running one application server instance each. Together, these machines can handle a peak load of 300 requests per second. If one machine becomes unavailable, the system can handle only 150 requests per second, assuming an even load distribution between the machines. This means that half the requests during peak load are not served.

Using Multiple Clusters to Improve Availability

To improve availability, instead of using a single cluster, you should group the application server instances into multiple clusters. This way, you can perform online upgrades for clusters (one by one) without loss of service.

For more information on setting up multiple clusters and using multiple clusters to perform online upgrades without loss of service, see the Sun ONE Application Server Administrator’s Guide.





Copyright 2003 Sun Microsystems, Inc. All rights reserved.