1 Disaster Recovery Introduction

This chapter provides an introduction to the Oracle Fusion Middleware Disaster Recovery solution.

1.1 Disaster Recovery Overview

This section provides an overview of Oracle Fusion Middleware Disaster Recovery.

1.1.1 Problem Description and Common Solutions

Providing a Maximum Availability Architecture is one of the key requirements for any Oracle Fusion Middleware enterprise deployment. Oracle Fusion Middleware includes an extensive set of high availability features, such as process death detection and restart, server clustering, server migration, clusterware integration, GridLink, load balancing, failover, backup and recovery, rolling upgrades, and rolling configuration changes. These features protect an enterprise deployment from unplanned downtime and minimize planned downtime.

Additionally, enterprise deployments need protection from unforeseen disasters and natural calamities. One protection solution involves setting up a standby site at a geographically different location than the production site. The standby site may have equal or fewer services and resources compared to the production site. Application data, metadata, configuration data, and security data are replicated to the standby site on a periodic basis. The standby site is normally in a passive mode; it is started when the production site is not available. This deployment model is sometimes referred to as an active/passive model. This model is normally adopted when the two sites are connected over a WAN and network latency does not allow clustering across the two sites.

Hot-pluggability is both a core strategy and a key feature of Oracle Fusion Middleware. Built for the heterogeneous enterprise, Oracle Fusion Middleware consists of modular component software that runs on a range of popular platforms and interoperates with middleware technologies and business applications from other software vendors such as IBM, Microsoft, and SAP. For instance, Oracle Fusion Middleware products and technologies such as ADF, Oracle BPEL Process Manager, Oracle Enterprise Service Bus, Oracle Web Services Manager, Adapters, Oracle Access Manager, Oracle Identity Manager, Rules, Oracle TopLink, and Oracle Business Intelligence Publisher can run on non-Oracle containers such as IBM WebSphere and JBoss, in addition to running on the Oracle WebLogic Server container.

The Oracle Fusion Middleware Disaster Recovery solution uses storage replication technology for disaster protection of Oracle Fusion Middleware middle tier components. It supports hot-pluggable deployments, and it is compatible with third party vendor recommended solutions.

Disaster protection for Oracle databases that are included in your Oracle Fusion Middleware topology is provided through Oracle Data Guard.

This document describes how to deploy the Oracle Fusion Middleware Disaster Recovery solution for enterprise deployments on Linux and UNIX operating systems, making use of storage replication technology and Oracle Data Guard technology.

1.1.2 Terminology

This section defines the following Disaster Recovery terminology:

  • asymmetric topology: An Oracle Fusion Middleware Disaster Recovery configuration that is different across tiers on the production site and standby site. For example, an asymmetric topology can include a standby site with fewer hosts and instances than the production site. Section 4.4, "Creating an Asymmetric Standby Site" describes how to create asymmetric topologies.

  • disaster: A sudden, unplanned catastrophic event that causes unacceptable damage or loss. A disaster is an event that compromises an organization's ability to provide critical functions, processes, or services for some unacceptable period of time and causes the organization to invoke its recovery plans.

  • Disaster Recovery: The ability to safeguard against natural or unplanned outages at a production site by having a recovery strategy for applications and data to a geographically separate standby site.

  • alias host name: This guide differentiates between the terms alias host name and physical host name.

    The alias host name is an alternate way to access the system besides its real network name. Typically, it resolves to the same IP address as the network name of the system. This can be defined in the name resolution system such as DNS, or locally in the local hosts file on each system. Multiple alias host names can be defined for a given system.

    See also the physical host name definition later in this section.

  • physical host name: The physical host name is the host name of the system as returned by the gethostname() call or the hostname command. Typically, the physical host name is also the network name used by clients to access the system. In this case, an IP address is associated with this name in DNS (or in whichever name resolution mechanism is in use), and this IP address is enabled on one of the network interfaces of the system.

    A given system typically has one physical host name. It can also have one or more additional network names, corresponding to IP addresses enabled on its network interfaces, that are used by clients to access it over the network. Further, each network name can be aliased with one or more alias host names.

    See also the alias host name definition earlier in this section.

  • production site setup: The process of creating the production site. To create the production site using the procedure described in this manual, you must plan and create physical host names and alias host names, create mount points and symbolic links (if applicable) on the hosts to the Oracle home directories on the shared storage where the Oracle Fusion Middleware instances will be installed, install the binaries and instances, and deploy the applications. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links.

  • site failover: The process of making the current standby site the new production site after the production site becomes unexpectedly unavailable (for example, due to a disaster at the production site). This book also uses the term "failover" to refer to a site failover.

  • site switchback: The process of reverting the current production site and the current standby site to their original roles. Switchbacks are planned operations done after the switchover operation has been completed. A switchback restores the original roles of each site: the current standby site becomes the production site and the current production site becomes the standby site. This book also uses the term "switchback" to refer to a site switchback.

  • site switchover: The process of reversing the roles of the production site and standby site. Switchovers are planned operations done for periodic validation or to perform planned maintenance on the current production site. During a switchover, the current standby site becomes the new production site, and the current production site becomes the new standby site. This book also uses the term "switchover" to refer to a site switchover.

  • site synchronization: The process of applying changes made at the production site to the standby site. For example, when a new application is deployed at the production site, you should perform a synchronization so that the same application is also deployed at the standby site.

  • standby site setup: The process of creating the standby site. To create the standby site using the procedure described in this manual, you must plan and create physical host names and alias host names, and create mount points and symbolic links (if applicable) to the Oracle home directories on the standby shared storage. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links.

  • symmetric topology: An Oracle Fusion Middleware Disaster Recovery configuration that is identical across tiers on the production site and standby site. In a symmetric topology, the production site and standby site have the same number of hosts, load balancers, instances, and applications. The same ports are used for both sites. The systems are configured identically and the applications access the same data. This manual describes how to set up a symmetric Oracle Fusion Middleware Disaster Recovery topology for an enterprise configuration.

  • topology: The production site and standby site hardware and software components that comprise an Oracle Fusion Middleware Disaster Recovery solution.
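
The difference between the physical host name and an alias host name, as defined above, can be inspected on any host. The following Python sketch is illustrative only and is not part of the Oracle procedure. It reads the physical host name with gethostname() and asks the system resolver which canonical name, aliases, and IP addresses a given name maps to. The function name describe_host_names is an assumption made for this example.

```python
import socket


def describe_host_names(name=None):
    """Return the physical host name plus what the resolver reports for `name`.

    `name` defaults to the physical host name. Pass an alias host name to see
    which canonical name and IP addresses it resolves to.
    """
    physical = socket.gethostname()  # same value the `hostname` command prints
    lookup = name or physical
    try:
        canonical, aliases, addresses = socket.gethostbyname_ex(lookup)
    except OSError:
        # The name is not resolvable through DNS or the local hosts file.
        canonical, aliases, addresses = None, [], []
    return {
        "physical": physical,
        "lookup": lookup,
        "canonical": canonical,
        "aliases": aliases,
        "addresses": addresses,
    }


if __name__ == "__main__":
    for key, value in describe_host_names().items():
        print(f"{key}: {value}")
```

Calling describe_host_names with an alias host name that you defined in DNS or in the local hosts file should typically report the same IP address as the call with no argument, because an alias normally resolves to the same address as the network name of the system.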

1.2 Disaster Recovery for Oracle Fusion Middleware Components

This section provides an introduction to setting up Disaster Recovery for a common Oracle Fusion Middleware enterprise deployment.

1.2.1 Oracle Fusion Middleware Disaster Recovery Architecture Overview

This section describes the deployment architecture for Oracle Fusion Middleware components.

The product binaries and configuration for Oracle Fusion Middleware components and applications are deployed in Oracle home directories on the middle tier. Additionally, most of the products also have metadata or run-time data stored in a database repository.

Therefore, the Oracle Fusion Middleware Disaster Recovery solution keeps middle tier file system data and middle tier data stored in databases at the production site synchronized with the standby site.

The Oracle Fusion Middleware Disaster Recovery solution supports these methods of providing data protection for Oracle Fusion Middleware data and database content:

  • Oracle Fusion Middleware product binaries, configuration, and metadata files

    Use storage replication technologies.

  • Database content

    Use Oracle Data Guard for Oracle databases (and vendor-recommended solutions for third party databases).

Figure 1-1 shows an overview of an Oracle Fusion Middleware Disaster Recovery topology:

Figure 1-1 Production and Standby Site for Oracle Fusion Middleware Disaster Recovery Topology

Some of the key aspects of the solution in Figure 1-1 are:

  • The solution has two sites. The current production site is running and active, while the second site is serving as a standby site and is in passive mode.

  • Hosts on each site have mount points defined for accessing the shared storage system for the site.

  • On both sites, the Oracle Fusion Middleware components are deployed on the site's shared storage system. This involves creating all the Oracle home directories, which include product binaries and configuration data for middleware components, in volumes on the production site's shared storage and then installing the components into the Oracle home directories on the shared storage. In Figure 1-1, a separate volume is created in the shared storage for each Oracle Fusion Middleware host cluster (note the Web, Application, and Security volumes created for the Web Cluster, Application Cluster, and Security Cluster in each site's shared storage system).

  • Mount points must be created on the shared storage for the production site. The Oracle Fusion Middleware software for the production site will be installed into Oracle home directories using the mount points on the production site shared storage. Symbolic links may also need to be set up on the production site hosts to the Oracle Fusion Middleware home directories on the shared storage at the production site. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links.

  • Mount points must be created on the shared storage for the standby site. Symbolic links may also need to be set up on the standby site hosts to the Oracle Fusion Middleware home directories on the shared storage at the standby site. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links. The mount points and symbolic links for the standby site hosts must be identical to those set up for the equivalent production site hosts.

  • Storage replication technology is used to copy the middle tier file systems and other data from the production site's shared storage to the standby site's shared storage.

  • After storage replication is enabled, application deployment, configuration, metadata, data, and product binary information is replicated from the production site to the standby site.

  • It is not necessary to perform any Oracle software installations at the standby site hosts. When the production site storage is replicated at the standby site storage, the equivalent Oracle home directories and data are written to the standby site storage.

  • Schedule incremental replications at a specified interval. The recommended interval is once a day for a production deployment in which the middle tier configuration does not change often. Additionally, force a manual synchronization whenever you change the middle tier configuration at the production site (for example, when you deploy a new application at the production site). Some Oracle Fusion Middleware components generate data on the file system, which may require more frequent replication, depending on your recovery point objectives. See Chapter 2, "Recommendations for Fusion Middleware Components" for detailed Disaster Recovery recommendations for Oracle Fusion Middleware components.

  • Before forcing a manual synchronization, take a snapshot of the site to capture its current state. The snapshot is then replicated to the standby site storage and can be used to roll back the standby site to a previous synchronization state, if desired. If a replication fails, you can recover to the point of the last successful replication for which a snapshot was created.

  • Oracle Data Guard is used to replicate all Oracle database repositories, including Oracle Fusion Middleware repositories and custom application databases. For information about using Oracle Data Guard to provide disaster protection for Oracle databases, see Section 3.3, "Database Considerations."

  • If your Oracle Fusion Middleware Disaster Recovery topology includes any third party databases, use the vendor-recommended solution for those databases.

  • User requests are initially routed to the production site.

  • When there is a failure or planned outage of the production site, you perform the following steps to enable the standby site to assume the production role in the topology:

    1. Stop the replication from the production site to the standby site (when a failure occurs, replication may have already been stopped due to the failure).

    2. Perform a failover or switchover of the Oracle databases using Oracle Data Guard.

    3. Start the services and applications on the standby site.

    4. Use a global load balancer to re-route user requests to the standby site. At this point, the standby site has assumed the production role.
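
The four steps above depend on their order: user requests should not be re-routed until replication has stopped, the databases have changed roles, and the standby services are running. The following Python sketch shows only that sequencing and is not an Oracle-supplied tool. Every function in it is a hypothetical placeholder. The actual commands for storage replication, Oracle Data Guard, service startup, and the global load balancer depend on your environment and are not shown.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_role_transition(steps: List[Step]) -> List[str]:
    """Run the steps in order and return the names of the completed steps.

    An exception raised by any step propagates to the caller and stops the
    sequence, so later steps (such as re-routing user requests) never run
    against a standby site that is only partly prepared.
    """
    completed: List[str] = []
    for name, action in steps:
        action()
        completed.append(name)
    return completed


# Hypothetical placeholders: replace each body with your site's own procedure.
def stop_storage_replication() -> None:
    print("1. Stop replication from the production site to the standby site")


def transition_databases() -> None:
    print("2. Fail over or switch over the Oracle databases with Oracle Data Guard")


def start_standby_services() -> None:
    print("3. Start the services and applications on the standby site")


def reroute_user_requests() -> None:
    print("4. Re-route user requests to the standby site at the global load balancer")


STANDBY_ACTIVATION_STEPS: List[Step] = [
    ("stop_replication", stop_storage_replication),
    ("database_role_transition", transition_databases),
    ("start_services", start_standby_services),
    ("reroute_requests", reroute_user_requests),
]


if __name__ == "__main__":
    run_role_transition(STANDBY_ACTIVATION_STEPS)
```

Keeping the steps in a single ordered list makes the sequence easy to review and lets the same runner be reused for a planned switchover, where only the bodies of the individual steps differ.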

1.2.2 Components Described in this Document

The Oracle Fusion Middleware Disaster Recovery solution supports components from various Oracle product suites, including:

  • Oracle WebLogic Server components

    See Section 2.1, "Recommendations for Oracle WebLogic Server" for Disaster Recovery recommendations for Oracle WebLogic Server components.

  • Oracle ADF

    See Section 2.2, "Recommendations for Oracle ADF" for Disaster Recovery recommendations for Oracle Application Development Framework (Oracle ADF).

  • Oracle WebCenter components:

    • Oracle WebCenter Spaces

    • Oracle WebCenter Portlets

    • Oracle WebCenter Discussions Server

    • Oracle WebCenter Wiki and Blog Server

    See Section 2.3, "Recommendations for Oracle WebCenter" for Disaster Recovery recommendations for Oracle WebCenter components.

  • Oracle SOA Suite components:

    • Oracle SOA Service Infrastructure

    • Oracle BPEL Process Manager

    • Oracle Mediator

    • Oracle Human Workflow

    • Oracle B2B

    • Oracle Web Services Manager

    • Oracle User Messaging Service

    • Oracle JCA Adapters

    • Oracle Business Activity Monitoring

    • Oracle Business Process Management

    See Section 2.4, "Recommendations for Oracle SOA Suite" for Disaster Recovery recommendations for Oracle SOA Suite components.

  • Oracle Identity Management components:

    • Oracle Internet Directory

    • Oracle Virtual Directory

    • Oracle Directory Integration Platform

    • Oracle Identity Federation

    • Oracle Directory Services Manager

    • Oracle Access Manager

    • Oracle Adaptive Access Manager

    • Oracle Identity Manager

    • Oracle Authorization Policy Manager

    • Oracle Identity Navigator

    See Section 2.5, "Recommendations for Oracle Identity Management" for Disaster Recovery recommendations for Oracle Identity Management components.

  • Oracle Portal, Forms, Reports, and Business Intelligence Discoverer components:

    • Oracle Portal

    • Oracle Forms

    • Oracle Reports

    • Oracle Business Intelligence Discoverer (Discoverer)

    See Section 2.6, "Recommendations for Oracle Portal, Forms, Reports, and Discoverer" for Disaster Recovery recommendations for these components.

  • Oracle Web Tier components:

    • Oracle HTTP Server

    • Oracle Web Cache

    See Section 2.7, "Recommendations for Oracle Web Tier Components" for Disaster Recovery recommendations for Oracle Web Tier components.

  • Oracle Enterprise Content Management components:

    • Oracle Universal Content Management

    • Oracle Inbound Refinery

    • Oracle Imaging and Process Management

    • Oracle Information Rights Management

    • Oracle Universal Records Management

    See Section 2.8, "Recommendations for Oracle Enterprise Content Management" for Disaster Recovery recommendations for Oracle Enterprise Content Management components.