Sun ONE Application Server 7, Enterprise Edition System Deployment Guide
Chapter 2
Planning your Environment
Planning your environment is one of the first phases of deployment. In this phase, you first decide on your performance and availability goals, and then make decisions about the hardware, network, and storage requirements accordingly.
The main objective of this phase is to determine the environment that best meets your business requirements.
This chapter contains the following sections:
Introducing HADB
Sun™ Open Net Environment (Sun ONE) Application Server 7, Enterprise Edition supports persistence of HTTP sessions. The high-availability database (HADB) bundled with the Sun ONE Application Server works as the persistence store to provide high availability for applications.
An HADB node consists of a set of processes, a dedicated area of shared memory, and one or more secondary storage devices. It is used for storing and updating session data. There are two types of nodes:
- Active nodes, which store the session data and service requests
- Spare nodes, which take over for an active node if it fails
Each active node must have a mirror node; therefore, active nodes occur in pairs. In addition, to maximize HADB availability, you should include two spare nodes for each pair so that if an active node fails, a spare node can take over while the failed node is repaired.
Data Redundancy Units
HADB nodes are organized into two Data Redundancy Units (DRUs) that mirror each other. Each DRU consists of half of the active nodes and spare nodes, and contains one complete copy of the data. To ensure fault tolerance, the computers that support one DRU must be completely self-supported with respect to power (use of uninterruptible power supplies is needed), processing units, and storage. If a power failure occurs in one DRU, the nodes in the other DRU can continue servicing requests until the power returns.
Machines that host HADB Nodes must be added in pairs, with one machine in each DRU.
Spare Nodes
A spare node is an additional HADB node connected to a DRU. A spare node initially does not contain data, but constantly monitors for failure of active nodes in the DRU. If an active node fails, the spare node takes over the functions of the failed node while the failed node is being repaired.
If you do not use spare nodes and a machine goes down, the mirror node on the other machine takes over and continues service. However, capacity is reduced because the failed machine is no longer available to service requests. Depending on the impact of losing one machine, this may make your system effectively unavailable because the remaining machines become overloaded. Also, until you repair the failed machine, your system runs without fault tolerance because there is no mirror node to replicate the data. For high availability, you must minimize the time during which the system functions with only a single node.
Spare nodes allow a single machine to go down while the overall level of service is maintained. Spare nodes are not mandatory, but you should use them if you require high availability. Allocate one spare machine for each DRU, so that if one of your machines goes down, your system can continue without becoming overloaded. A spare node also makes it easier to perform planned maintenance on the machines that host the active nodes.
As a general rule, you should have a spare machine with enough application server instances and HADB nodes to replace any machine that becomes unavailable.
For example, if you have a co-located deployment with four Sun Fire™ V480 servers, where each server has one application server instance and two HADB data nodes, you should allocate two more servers as spare machines (one machine per DRU). Each spare machine should run one application server instance and two spare HADB nodes.
As another example, suppose you have a separate tier deployment where the HADB tier has two Sun Fire™ 280R servers, each running two HADB data nodes. If you want to maintain full capacity for this system even if a machine becomes unavailable, you should have one spare machine for the instances tier and one spare machine for the HADB tier. The spare machine for the instances tier should have as many instances as the other machines in the instances tier, and the spare machine for the HADB tier should have as many HADB nodes as the other machines in the HADB tier.
More information about the co-located and the separate tier deployment topologies is provided in Chapter 3, "Selecting a Topology".
Sample HADB Architecture
The following figure shows the architecture of a database with four active nodes and two spare nodes. Nodes 0 and 1 are a mirror node pair, as are nodes 2 and 3.
Figure 2-1 HADB Architecture with Four Active Nodes and Two Spare Nodes
Establishing Performance Goals
As explained in Chapter 1, "Overview of Deployment," one of your main goals in deployment is to maximize performance. This principally translates into maximizing throughput and reducing response time.
Beyond these basic goals, you should establish specific goals by determining the following information:
- Maximum number of concurrent users
- Think time
- Average response time
- Requests per minute
These factors are interrelated. If you know any three of these four pieces of information, you can always calculate the fourth.
Some of the metrics described in this chapter can be calculated using a remote browser emulator (RBE) tool, or web site performance and benchmarking software that simulates your enterprise's web application activity. Typically, RBE and benchmarking products generate concurrent HTTP requests and then report back the response time and number of requests per minute. You can then use these figures to calculate server activity.
The results of the calculations described in the following sections are not absolute. Treat them as reference points to work against as you fine-tune the performance of the Sun ONE Application Server.
Estimating Throughput
Throughput has different implications for application server instances and for the HADB.
A good measure of the throughput for application server instances is the requests per minute processed.
Similarly, a good measure of the throughput for the HADB is the requests per minute processed by the HADB and the session size per request. The session size per request is important because the amount of session data stored varies from request to request.
Estimating Load on Application Server Instances
To estimate the load on application server instances, consider the following factors:
Maximum Number of Concurrent Users
A user runs a process (for example, through a web browser) that periodically sends requests from a client machine to the Sun ONE Application Server 7, Enterprise Edition. When estimating the number of concurrent users, include all users currently active. A user is considered active as long as that user's session is active (that is, the session has not expired or been terminated).
A user is concurrent for as long as the user is on the system as a running process submitting requests, receiving results of requests from the server, and viewing the results of the requests.
Eventually, as the number of concurrent users submitting requests increases, requests processed per minute begin to decline (and the response time begins to increase). The following diagram illustrates this situation.
Figure 2-2 Performance Pattern with Increasing Number of Users
You want to identify the point at which adding more concurrent users reduces the number of requests that can be processed per minute, as this indicates when performance starts to degrade.
Think Time
A user does not submit requests continuously. A user submits a request, the server receives the request, processes it and then returns a result, at which point the user spends some time analyzing the result before submitting a new request. This time spent reviewing the result of a request is called think time.
Determining typical think time length is important because you can use it to calculate more accurately the number of requests per minute and the number of concurrent users your system can support. Essentially, when a user is on the system but not submitting a request, a gap opens for another user to submit a request without altering system load. This also means that you can support more concurrent users.
Average Response Time
Response time refers to the amount of time it takes for request results to be returned to the user. The response time is affected by a number of factors including network bandwidth, number of users, number and type of requests submitted, and average think time. In this section, response time refers to mean, or average, response time. Each type of request has its own minimal response time, but when evaluating system performance, analyze based on the average response time of all requests.
The faster the response time, the more requests per minute are being processed. However, as the number of users on your system increases, response time starts to increase as well, even though the number of requests per minute declines, as the following diagram illustrates:
Figure 2-3 Response Time with Increasing Number of Users
A system performance graph like this indicates that after a certain point (point A in this diagram), requests per minute are inversely proportional to response time: the sharper the decline in requests per minute, the steeper the increase in response time (represented by the dotted line arrow).
In the diagram, point A represents peak load, the point at which requests per minute start to decline. Prior to this, response time calculations are not necessarily accurate because they are not using peak numbers in the formula. After this point, because of the inversely proportional relationship between requests per minute and response time, you can more accurately calculate response time using the two criteria: maximum number of users and requests per minute.
To determine response time at peak load, use the following formula:
Response time = (concurrent users / requests per second) - think time in seconds
To obtain an accurate response time result, you must always include think time in the equation.
For example, if the following conditions exist:
The response time is, therefore, 2 seconds.
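As a sanity check, the formula can be worked through in a few lines of Python. The input figures below are illustrative assumptions (the example's original condition list is not reproduced in this excerpt), chosen so that the formula yields the 2-second result:

```python
# Illustrative (hypothetical) inputs for the response time formula:
concurrent_users = 5000       # assumed concurrent users at peak load
requests_per_second = 1000    # assumed requests per second at peak load
think_time_seconds = 3        # assumed average think time

# Response time = (concurrent users / requests per second) - think time
response_time = concurrent_users / requests_per_second - think_time_seconds
print(response_time)  # 2.0
```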
After you have calculated your system's response time, particularly at peak load, decide what response time is acceptable for your enterprise. Response time, along with throughput, is one of the factors critical to Sun ONE Application Server performance, and improving it should be one of your goals. If there is a response time threshold beyond which users should not have to wait, and your measured response times exceed that threshold, work towards improving response time or redefine the threshold.
Requests Per Minute
If you know the number of concurrent users at any given time, the response time of their requests and the average user think time at that time, you can determine requests per minute. Typically, you start by knowing how many concurrent users are on your system.
For example, after running some web site performance software, suppose you have calculated that the average number of concurrent users submitting requests on your online banking web site is 3,000. This is dependent on the number of users who have signed up to be members of your online bank, their banking transaction behavior, the times of the day or week they choose to submit requests, and so on. Therefore, knowing this information means you can use the requests per minute formula described in this section to calculate how many requests per minute your system can handle for this user base. Then, because requests per minute and response time become inversely proportional at peak load, decide if fewer requests per minute are acceptable as a trade-off for better response time, or alternately, if a slower response time is acceptable as a trade-off for more requests per minute.
Essentially, you start experimenting with the requests per minute and response time thresholds that you will accept as a starting point for fine-tuning system performance. Then you decide which areas of your system you want to adjust.
The requests per second formula is as follows:
requests per second = concurrent users / (response time in seconds + think time in seconds)
For example, if the following conditions exist:
Therefore, the number of requests per second is 700 and the number of requests per minute is 42000.
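In Python, with illustrative inputs (assumed values, chosen to be consistent with the 700 requests per second stated above):

```python
# Illustrative (hypothetical) inputs for the requests per second formula:
concurrent_users = 2800       # assumed concurrent users
response_time_seconds = 1     # assumed average response time
think_time_seconds = 3        # assumed average think time

# requests per second = concurrent users / (response time + think time)
requests_per_second = concurrent_users / (response_time_seconds + think_time_seconds)
requests_per_minute = requests_per_second * 60
print(requests_per_second)   # 700.0
print(requests_per_minute)   # 42000.0
```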
Estimating Load on HADB
To calculate the load on HADB, consider the following factors:
The session persistence settings that you specify affect the load to the HADB. For more information on configuring session persistence, see the Sun ONE Application Server Administrator’s Guide.
Understanding Session Persistence
As an application session proceeds, there is often data that is part of the session that is not stored in a traditional database. An example of such data is the content of your shopping cart. Sun ONE Application Server provides the capability to save, or persist, this session state in a repository, so that if an application server instance experiences a failure, the session state can be recovered and the session can continue without information loss.
Apart from the number of requests being served by the Sun ONE Application Server, the session persistence configuration settings that you specify affect the number of requests received per minute by the HADB and the session information in each request.
The persistence settings can be defined for each application server instance. However, all application server instances in a particular cluster must have the same persistence configuration. If you are using more than one cluster, it is not necessary for all clusters to have the same persistence configuration settings.
Number of Requests per Minute Received by the HADB
The number of requests per minute received by the HADB depends on the persistence frequency, which is the frequency at which the session information is stored in the HADB. This is defined through the persistence frequency settings. The persistence frequency options are:
- web-method: Session information is stored in the HADB at the end of each web request.
- time-based: Session information is stored in the HADB at a regular time interval.
The web-method persistence frequency provides the highest guarantee that the persisted session information is up to date, but it increases the traffic to the HADB because information is stored at the end of every web request. The time-based frequency reduces the traffic to the HADB, but it provides less of a guarantee than the web-method frequency that the session information is up to date.
Comparison of Persistence Frequency Options
To summarize the trade-off between the persistence frequency options:
- web-method: Provides the best guarantee that the persisted session information is up to date, but increases the traffic to the HADB.
- time-based: Reduces the traffic to the HADB, but provides less of a guarantee that the session information is up to date.
Session Size Per Request
The session size per request depends on how much session information is stored in the session.
Note
To improve overall performance, reduce the amount of information in the session as much as possible.
You can further fine-tune the session size per request through the persistence scope settings. You can choose from the following options for persistence scope:
- session: The entire session is saved every time session information is saved to the HADB database.
- modified-session: The session is saved only if it has been modified.
- modified-attribute: Only those attributes are stored that have been modified (inserted, updated, or deleted) since the last time the session was stored.
Comparison of Persistence Scope Options
To summarize the trade-off between the persistence scope options:
- session: The complete session state is always stored, but this transfers the most data to the HADB.
- modified-session: The entire session is transferred only when it has been modified, reducing traffic for requests that do not change the session.
- modified-attribute: Only the modified attributes are transferred, minimizing the data written to the HADB.
Designing for Peak Load or Steady State Load
In a typical deployment, there is a difference between steady state and peak workloads.
If you design for peak load, you must deploy a system that can sustain the expected maximum load of users and requests without a degradation in response time. This means that your system can handle the extreme cases of expected system load. If the difference between peak load and steady state load is substantial, designing for peak loads may mean that you are spending on resources that will be idle for a significant amount of time.
If you design for steady state load, you don't have to deploy a system with all the resources required to handle the expected peak load. However, a system designed to support only the steady state load will have slower response times when peak load occurs.
Frequency and Duration of Peak Load
A factor that may affect whether you design for peak load or for steady state is how often your system is expected to handle the peak load. If peak load occurs several times a day or even per week, you may decide that this is frequent enough to warrant expanding capacity to handle it. If the system operates at steady state 90 percent of the time, and at peak only 10 percent of the time, you may prefer a system designed around steady state load. This means that 10 percent of the time your system's response time will be slower than the other 90 percent of the time. Decide whether the frequency or duration of peak operation justifies adding resources to your system.
Design Decisions to Make
Depending on the load on the application server instances, the load on the HADB, and the failover requirements, here are the design decisions that you should make at this stage:
Number of Application Server Instances Needed
To determine the number of application server instances needed, evaluate your environment on the basis of the factors explained in "Estimating Load on Application Server Instances". Each application server instance can use more than one CPU and should have at least one CPU allocated to it.
Number of HADB Nodes Required
As a general guideline, you should plan to have one HADB node for each Central Processing Unit (CPU) in your system. For example, use two HADB nodes for a machine that has two CPUs.
HADB Storage Capacity Required
The HADB provides near-linear scaling by adding more nodes until network capacity is exceeded. Each node must be configured with storage devices on a dedicated disk or disks. All nodes must have equal space allocated on the storage devices. Make sure that the storage devices are allocated on local disks.
Suppose the expected session data is X MB. HADB replicates the data on mirror nodes, and, therefore, 2X MB of storage is needed.
Further, HADB uses indexes to enable fast access to data. An additional 2X MB (for both nodes together) is required for indexes, assuming a fill rate of less than 100%. This means that a storage capacity of 4X MB is required.
Therefore, the expected storage capacity needed by the HADB is four times the expected data volume.
If the system has to be designed for future expansion (by adding bigger disks to nodes or adding new nodes to the system) without loss of data from HADB, the expected storage capacity is eight times the expected data volume. This is because for online upgrades, you might want to refragment the data after adding new nodes. In that case, you will need a similar amount (4X) of additional space on the data devices, thus increasing the total storage capacity to 8X.
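The sizing rule can be captured in a small helper. This is a sketch of the 4X/8X guideline above, not an official sizing tool:

```python
def hadb_storage_mb(session_data_mb, online_refragmentation=False):
    """Estimate HADB storage for a given volume of session data (in MB).

    Mirroring doubles the data (2X); indexes and a fill rate below 100%
    roughly double it again (4X total). Designing for online expansion
    (refragmentation after adding nodes) requires a further 4X of free
    space on the data devices, for 8X in total.
    """
    factor = 8 if online_refragmentation else 4
    return factor * session_data_mb

print(hadb_storage_mb(512))        # 2048 MB: no online expansion planned
print(hadb_storage_mb(512, True))  # 4096 MB: online expansion planned
```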
Additionally, HADB uses disk space for internal use as follows:
For more information, see the Sun ONE Application Server Administrator’s Guide and the Sun ONE Application Server Performance Tuning Guide.
The following summarizes the HADB storage space requirements for session data of X MB:
- Addition or removal of HADB nodes while online is not required: 4X MB
- Addition or removal of HADB nodes while online is required: 8X MB
If the HADB runs out of device space, error codes 4593 or 4592 are returned and error messages are written to the history file(s). For more information on these messages, see the Sun ONE Application Server Error Message Reference.
If the HADB runs out of device space, any client requests to insert or update data are not accepted. Delete operations are, however, accepted.
Setting Data Device Size
To set the size of the data device(s) of HADB, use the following command:
hadbm set TotalDatadeviceSizePerNode
The hadbm command restarts all the nodes, one by one, for the change to take effect. For more information, see the Sun ONE Application Server Administrator’s Guide.
Whether You Want to Design for Peak Load or for Steady State Load
While making a decision about whether to design for peak or steady state load, refer to the information provided in "Designing for Peak Load or Steady State Load".
Planning Network Configuration to Meet Your Performance Goals
When planning how to integrate Sun ONE Application Server into your network for optimal performance, estimate the bandwidth requirements and plan your network so that it can meet the performance requirements.
Estimating Bandwidth Requirements
As you decide on the desired size and bandwidth of your network, first determine your network traffic and identify its peak. See whether there is a particular hour, day of the week, or day of the month in which overall volume peaks, and then determine the duration of that peak.
At all times consult network experts at your site about the size and type of all network components you are considering adding.
Peak Load Times
During peak load times, the number of packets that are being sent is at its highest level. In general, if you design for peak load, scale your system with the goal of handling 100 percent of peak volume. Bear in mind, however, that any network behaves unpredictably and that despite your scaling efforts, 100 percent of peak volume might not always be handled.
For example, assume that at peak load, five percent of your users occasionally do not have immediate Internet access when accessing applications deployed on Sun ONE Application Server 7, Enterprise Edition. Of that five percent, determine how many users retry access after the first attempt. Again, not all of those users may get through, and of that unsuccessful portion, another percentage will retry. As a result, the peak appears longer because peak use is spread out over time as users continue to attempt access.
To ensure optimal access during the peak, start by verifying that your Internet service provider (ISP) has a backbone network connection that can reach an Internet hub without degradation.
Calculating Bandwidth Required
Depending on the calculations you made in "Establishing Performance Goals", you should determine the additional bandwidth required for the Sun ONE Application Server deployment on your site.
Depending on your method of access (T-1 lines, ISDN, and so on), you can calculate the amount of increased bandwidth you require to handle your estimated load. For example, suppose your site uses T-1 or the higher-speed T-3 links for Internet access. Given their bandwidth, you can estimate how many lines you need on your network based on the average number of requests generated per second at your site and the maximum peak load. You can calculate these figures using a web site analysis and monitoring tool.
A single T-1 line can handle 1.544 Mbps. So a network of four T-1 lines carrying 1.544 Mbps each can handle approximately 6 Mbps of data. Assuming that the average HTML page sent back to a client is 30 kilobytes (KB), this network of four T-1 lines can handle the following traffic per second:
6,176,000 bits / 8 bits = 772,000 bytes per second
772,000 bytes per second / 30 KB = approximately 25 concurrent client requests for pages per second.
At traffic of 25 pages per second, this system can handle 90,000 pages per hour (25 x 60 seconds x 60 minutes), and therefore 2,160,000 pages per day maximum, assuming an even load throughout the day. If the maximum peak load is greater than this, you will want to increase the bandwidth accordingly.
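The arithmetic above can be reproduced as follows (the 30 KB average page size is the guide's own assumption):

```python
T1_BPS = 1_544_000          # capacity of one T-1 line, in bits per second
num_lines = 4
avg_page_bytes = 30_000     # assumed average HTML page size (~30 KB)

bytes_per_second = num_lines * T1_BPS // 8             # 772,000 bytes/s
pages_per_second = bytes_per_second // avg_page_bytes  # ~25 pages/s
pages_per_hour = pages_per_second * 60 * 60            # 90,000 pages/hour
pages_per_day = pages_per_hour * 24                    # 2,160,000 pages/day
print(pages_per_second, pages_per_hour, pages_per_day)
```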
Peak Load
Having an even load throughout the day is probably not realistic. You need to determine when peak load occurs, how long it lasts, and what percentage of the total load it is. For example, in the scenario outlined here, if peak load lasts for two hours and takes up 30 percent of the total load of 2,160,000 pages, this means that 648,000 pages must be carried over the T-1 lines during two hours of the day.
Therefore, to accommodate peak load during those two hours, you should increase the number of T-1 lines according to the following calculations:
648,000 pages / 120 minutes = 5,400 pages per minute
5,400 pages per minute / 60 seconds = 90 pages per second
If four lines can handle 25 pages per second, then 90 pages per second (approximately four times as many) requires four times as many lines, in this case 16. These 16 lines handle the realistic maximum of a 30 percent peak load; the remaining 70 percent of the load can easily be carried by the same lines throughout the rest of the day.
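The scaling calculation can be sketched as follows, reusing the figure of 25 pages per second per group of four T-1 lines from the earlier estimate:

```python
import math

pages_at_peak = 648_000    # 30% of the daily load, carried in two hours
peak_minutes = 120

pages_per_minute = pages_at_peak / peak_minutes   # 5,400 pages per minute
pages_per_second = pages_per_minute / 60          # 90 pages per second

pages_per_line_group = 25  # pages/s handled by one group of four T-1 lines
groups_needed = math.ceil(pages_per_second / pages_per_line_group)
lines_needed = groups_needed * 4
print(lines_needed)  # 16
```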
Subnets
If you use the separate tier topology, in which the application server instances and HADB nodes are on separate tiers, you can achieve a performance improvement by keeping HADB nodes on a separate subnet. This is because HADB uses the User Datagram Protocol (UDP), and using a separate subnet reduces the UDP traffic on the machines outside of the subnet.
Network Cards
For greater bandwidth and optimal network performance, use at least 100 Mbps Ethernet cards or, preferably, 1 Gbps Ethernet cards between servers hosting the Sun ONE Application Server and the HADB nodes and also among any other resources such as HADB databases that are hosted on other machines.
Network Settings for HADB
Here are requirements and suggestions for HADB to work optimally in the network:
- Use switched routers so that each network interface has a dedicated 100 Mbps or better Ethernet channel.
- If you are running HADB on a multi-CPU machine hosting four or more HADB nodes, use 1 Gbps Ethernet cards.
- If you suspect network bottlenecks within HADB:
- The current release of HADB is not generally capable of running on computers with multiple network interface cards. If you need network bandwidth beyond what can be offered with a single network interface card per computer, consult Sun customer support for alternative solutions.
Planning Availability
Availability must be planned according to the application and customer requirements.
There are two ways to achieve high availability:
Adding Redundancy to the System
One way to achieve high availability is to add redundancy to the system, in both hardware and software. When one unit fails, the redundant unit takes over. This is also referred to as fault tolerance.
In general, to achieve high availability, you should determine and remove every possible point of failure in the system.
Failure Classes
The level of redundancy is determined by the failure classes (types of failure) that the system needs to tolerate. Some examples of failure classes are: system process, machine, power supply, disk, network failures, and building fires and catastrophes.
Duplicated system processes tolerate single system process failures. Duplicated machines tolerate single machine failures. Attaching the duplicated (mirrored) machines to different power supplies tolerates single power failures. Keeping the mirrored machines in separate buildings tolerates a single building fire, and keeping them in separate geographical locations tolerates natural catastrophes such as earthquakes.
When planning availability, you should determine the failure classes covered by the system.
Using Redundancy Units to Improve Availability
To improve availability, HADB nodes are always used in Data Redundancy Units (DRUs) as explained in "Data Redundancy Units".
Using Spare Nodes to Improve Fault Tolerance
The use of spare nodes as explained in "Spare Nodes" improves fault tolerance. Although spare nodes are not mandatory, their use is recommended for maximum availability.
Planning Failover Capacity
Failover capacity planning means deciding how many additional servers and processes to add to your Sun ONE Application Server installation so that in the event of a server or process failure, the system can seamlessly recover data and continue processing. If your system gets overloaded, a process or server failure might result, causing response time degradation or even total loss of service. Preparing for such an occurrence is critical to a successful deployment.
To maintain capacity, especially at peak loads, it is recommended that you add spare machines running application server instances to your Sun ONE Application Server installation. For example, assume you have a system with two machines running one application server instance each. Together, these machines can handle a peak load of 300 requests per second. If one of these machines becomes unavailable, the system can handle only 150 requests per second, assuming an even load distribution between the machines. This means that half the requests during peak load are not served.
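The capacity loss in this example is easy to quantify. This is a sketch of the arithmetic, assuming an even load distribution:

```python
machines = 2
peak_requests_per_second = 300

# With an even distribution, each machine carries half the peak load.
per_machine = peak_requests_per_second / machines   # 150 req/s per machine

# If one machine fails and there is no spare, one machine remains.
remaining_capacity = (machines - 1) * per_machine   # 150 req/s
unserved = (peak_requests_per_second - remaining_capacity) / peak_requests_per_second
print(unserved)  # 0.5, i.e., half the peak requests go unserved
```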
Using Multiple Clusters to Improve Availability
To improve availability, instead of using a single cluster, you should group the application server instances into multiple clusters. This way, you can perform online upgrades for clusters (one by one) without loss of service.
For more information on setting up multiple clusters and using multiple clusters to perform online upgrades without loss of service, see the Sun ONE Application Server Administrator’s Guide.