27 Fast Connection Failover

The Fast Connection Failover mechanism depends on the implicit connection cache feature. As a result, for Fast Connection Failover to be available, implicit connection caching must be enabled.

Introduction

Fast Connection Failover offers a driver-independent way for your Java Database Connectivity (JDBC) application to take advantage of the connection failover facilities offered by Oracle Database 10g. The advantages of Fast Connection Failover include:

  • Driver independence

    Fast Connection Failover supports both the JDBC Thin and JDBC Oracle Call Interface (OCI) drivers.

  • Integration with implicit connection cache

    The two features work together to improve application performance and high availability.

  • Integration with Oracle Real Application Clusters (Oracle RAC)

    This provides superior Real Application Clusters/High Availability event notification mechanisms.

  • Easy integration with application code

    You only need to enable Fast Connection Failover and no further configuration is required.

Fast Connection Failover Features

When enabled, Fast Connection Failover provides:

  • Rapid detection and cleanup of invalid cached connections, that is, DOWN event processing

  • Load balancing of available connections, that is, UP event processing

  • Run-time work request distribution to all active Oracle RAC instances

Using Fast Connection Failover

Applications manage Fast Connection Failover through DataSource instances.

Fast Connection Failover Prerequisites

Fast Connection Failover is available under the following circumstances:

  • The implicit connection cache is enabled

    Fast Connection Failover works in conjunction with the JDBC connection caching mechanism. This helps applications manage connections to ensure high availability.

  • The application uses service names to connect to the database

    The application cannot use service identifiers (SIDs).

  • The underlying database has Oracle Database 10g Real Application Clusters capability

    If failover events are not propagated, then connection failover cannot occur.

  • Oracle Notification Service (ONS) is configured and available on the node where JDBC is running

    JDBC depends on ONS to propagate database events and notify JDBC of them.

  • The Java virtual machine (JVM) in which your JDBC instance is running has the oracle.ons.oraclehome system property set to point to your ORACLE_HOME
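The last prerequisite can be checked from inside the JVM before any connection is requested. The following is a minimal sketch, not part of Oracle JDBC: the class name OnsHomeCheck and its method are hypothetical, and only the property name oracle.ons.oraclehome comes from the text above.

```java
import java.util.Optional;

// Hypothetical helper for illustration; it is not part of Oracle JDBC or ONS.
final class OnsHomeCheck {
    // Property name taken from the prerequisites list.
    static final String ONS_HOME_PROPERTY = "oracle.ons.oraclehome";

    // Returns the configured value, or empty if the property is unset or blank.
    static Optional<String> onsOracleHome() {
        String value = System.getProperty(ONS_HOME_PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        Optional<String> home = onsOracleHome();
        if (home.isPresent()) {
            System.out.println(ONS_HOME_PROPERTY + " = " + home.get());
        } else {
            System.out.println(ONS_HOME_PROPERTY + " is not set; start the JVM with -D"
                    + ONS_HOME_PROPERTY + "=<your ORACLE_HOME>");
        }
    }
}
```

The check only confirms that the property is present. It does not verify that the directory exists or that ONS is running there.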

Configuring ONS For Fast Connection Failover

In order for Fast Connection Failover to work, you must configure ONS correctly. ONS is shipped as part of Oracle Database 10g.

ONS Configuration File

ONS configuration is controlled by the ONS configuration file, ORACLE_HOME/opmn/conf/ons.config. This file tells the ONS daemon details about how it should behave and who it should talk to. Configuration information within ons.config is defined in simple name/value pairs. There are three values that should always be configured within ons.config. The first is localport, the port that ONS binds to on the localhost interface to talk to local clients. An example of the localport configuration is:

localport=4100

The second value is remoteport, the port that ONS binds to on all interfaces for talking to other ONS daemons. An example of the remoteport configuration is:

remoteport=4200

The third value specifies nodes, a list of other ONS daemons to talk to. Node values are given as a comma-delimited list of either hostnames or IP addresses plus ports. Note that the port value that is given is the remote port that each ONS instance is listening on. In order to maintain an identical file on all nodes, the host:port of the current ONS node can also be listed in the nodes list. It will be ignored when reading the list.

The nodes listed in the nodes line correspond to the individual nodes in the Oracle RAC instance. Listing the nodes ensures that the middle-tier node can communicate with the Oracle RAC nodes. At least one mid-tier node and one node in the Oracle RAC instance must be configured to see one another. As long as one node on each side is aware of the other, all nodes are visible. You need not list every cluster and middle-tier node in the ONS configuration file of each Oracle RAC node. In particular, if the ONS configuration file of one cluster node lists the middle tier, then all nodes in the cluster are aware of it.

An example of the nodes configuration is:

nodes=myhost.example.com:4200,123.123.123.123:4200

There are also several optional values that can be provided in ons.config. The first optional value is loglevel. This specifies the level of messages that ONS should log. The value is an integer that ranges from 1, which logs the fewest messages, to 9, which logs the most. The default value is 3. An example is:

loglevel=3

The second optional value is a logfile name. This specifies a log file that ONS should use for logging messages. The default value for logfile is $ORACLE_HOME/opmn/logs/ons.log. An example is:

logfile=/private/oraclehome/opmn/logs/myons.log

The third optional value is a walletfile name. A wallet file is used by the Oracle Secure Sockets Layer (SSL) to store SSL certificates. If a wallet file is specified to ONS, it will use SSL when communicating with other ONS instances and require SSL certificate authentication from all ONS instances that try to connect to it. This means that if you want to turn on SSL for one ONS instance, then you must turn it on for all instances that are connected. This value should point to the directory where your ewallet.p12 file is located. An example is:

walletfile=/private/oraclehome/opmn/conf/ssl.wlt/default

One optional value is reserved for use on the server side. useocr=on is used to tell ONS to store all Oracle RAC nodes and port numbers in Oracle Cluster Registry (OCR) instead of in the ONS configuration file. Do not use this option on the client side.

The ons.config file allows blank lines and comments on lines that begin with #.
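The file format described above (name/value pairs, blank lines, and # comments) is simple enough to validate before starting ONS. The following is a minimal sketch of such a check, assuming the format is exactly as described. The class OnsConfigSketch and its methods are hypothetical and are not part of ONS, and the sketch does not claim to reproduce how the ONS daemon itself reads the file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; it is not part of ONS or Oracle JDBC.
final class OnsConfigSketch {
    // The three values that the text says should always be configured.
    private static final String[] REQUIRED = {"localport", "remoteport", "nodes"};

    // Parses name=value lines. Blank lines and lines whose first
    // non-space character is '#' are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Not a name=value pair: " + line);
            }
            values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return values;
    }

    // Returns the always-configured names that the parsed text lacks.
    static List<String> missingRequired(Map<String, String> values) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!values.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String sample = "# sample\nlocalport=4100\nremoteport=4200\n"
                + "nodes=racnode1.example.com:4200,racnode2.example.com:4200\n";
        Map<String, String> values = parse(sample);
        System.out.println(values);
        System.out.println("missing: " + missingRequired(values));
    }
}
```

A check like this catches a missing localport, remoteport, or nodes entry early, rather than at the first getConnection request.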

Client-Side ONS Configuration

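You can access the client-side ONS through ORACLE_HOME/opmn. On the client side, there are two ways to set up ONS:

  • Run an ONS daemon on the client, configured through ons.config

  • Use remote ONS subscription, which requires no ONS daemon on the client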

Example 27-1 illustrates what a sample configuration file might look like.

Example 27-1 ons.config file

# This is an example ons.config file
#
# The first three values are required
localport=4100
remoteport=4200
nodes=racnode1.example.com:4200,racnode2.example.com:4200

After configuring ONS, you start the ONS daemon with the onsctl command. It is the user's responsibility to make sure that an ONS daemon is running at all times.

Using the onsctl Command

After configuring, use ORACLE_HOME/opmn/bin/onsctl to start, stop, reconfigure, and monitor the ONS daemon. Table 27-1 is a summary of the commands that onsctl supports.

Table 27-1 onsctl commands

Command     Effect                                            Output

start       Starts the ONS daemon                             onsctl: ons started
stop        Stops the ONS daemon                              onsctl: shutting down ons daemon...
ping        Verifies whether the ONS daemon is running        ons is running ...
reconfig    Triggers a reload of the ONS configuration
            without shutting down the ONS daemon
help        Prints a help summary message for onsctl
detailed    Prints a detailed help message for onsctl

Server-Side ONS Configuration Using racgons

You can access the server-side ONS through ORA_CRS_HOME/opmn. You configure the server side by using racgons to add the middle-tier node information to OCR. This command is found in ORA_CRS_HOME/bin/racgons. Before using racgons, you must edit ons.config to set useocr=on.

The middle-tier nodes should be configured in OCR so that all nodes share the configuration and, no matter which Oracle RAC nodes are up, they can communicate with the mid-tier. When running on a cluster, always configure the ONS hosts and ports by using racgons rather than the ONS configuration files. The racgons command stores the ONS hosts and ports in OCR, where every node can see them. That way, you do not need to edit a file on every node to change the configuration; you run a single command on one of the cluster nodes.

The racgons command enables you to specify hosts and ports on one node, then propagate your changes among all nodes in a cluster. The command takes two forms:

racgons add_config hostname:port [hostname:port] [hostname:port] ...
racgons remove_config hostname[:port] [hostname:port] [hostname:port] ...

The add_config version adds the listed hostnames; the remove_config version removes them. Both commands propagate the changes among all instances in a cluster.

If multiple port numbers are configured for a host and you specify hostname:port, only the specified port number is removed for that host. If you specify only hostname, all port numbers for that host are removed.

Other Uses of racgons

You should run racgons whenever you add a new node to the cluster.

Remote ONS Subscription

The advantages of remote ONS subscription are:

  • Support for an All Java mid-tier stack

  • No ONS daemon needed on the client computer and, therefore, no need to manage this process

  • Simple configuration using the DataSource property.

When using remote ONS subscription for Fast Connection Failover, the application invokes the following method on an OracleDataSource instance:

setONSConfiguration(String remoteONSConfig)

The remoteONSConfig parameter is a list of name/value pairs of the form name=value, separated by newline characters (\n). name can be one of nodes, walletfile, or walletpassword. The parameter should specify at least the nodes ONS configuration attribute, which is a comma-separated list of host:port pairs. The hosts and ports denote the remote ONS daemons available on the Oracle RAC nodes.

SSL can be used to communicate with the ONS daemons when the walletfile attribute specifies an Oracle wallet file. In that case, if the walletpassword attribute is not specified, single sign-on (SSO) is assumed.

Following are a few examples, assuming ods is an OracleDataSource instance:

ods.setONSConfiguration("nodes=racnode1.example.com:4200,racnode2.example.com:4200");

ods.setONSConfiguration("nodes=racnode1:4200,racnode2:4200\nwalletfile=/mydir/Wallet\nwalletpassword=mypasswd");

ods.setONSConfiguration("nodes=racnode1:4200,racnode2:4200\nwalletfile=/mydir/conf/Wallet");

Note:

The ons.jar must be in the CLASSPATH on the client. In the case of Oracle Application Server, ONS is embedded in OPMN, as before, and JDBC Fast Connection Failover continues to work as before.
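Because the remoteONSConfig string is easy to get wrong by hand, an application might assemble it from its parts. The following is a minimal sketch under that assumption: the class RemoteOnsConfigBuilder is hypothetical and is not part of Oracle JDBC. It only produces a string in the name=value, newline-separated form described above.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration; it is not part of Oracle JDBC.
final class RemoteOnsConfigBuilder {
    // nodes: host:port entries for the remote ONS daemons (at least one).
    // walletFile and walletPassword are optional and may be null.
    static String build(List<String> nodes, String walletFile, String walletPassword) {
        if (nodes == null || nodes.isEmpty()) {
            throw new IllegalArgumentException("At least one host:port node is required");
        }
        StringJoiner pairs = new StringJoiner("\n");
        pairs.add("nodes=" + String.join(",", nodes));
        if (walletFile != null) {
            pairs.add("walletfile=" + walletFile);
            // The password is only meaningful together with a wallet file.
            if (walletPassword != null) {
                pairs.add("walletpassword=" + walletPassword);
            }
        }
        return pairs.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(
                java.util.Arrays.asList("racnode1:4200", "racnode2:4200"),
                "/mydir/Wallet", null));
    }
}
```

The resulting string would then be passed to ods.setONSConfiguration, as in the examples above.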

Enabling Fast Connection Failover

An application enables Fast Connection Failover by calling setFastConnectionFailoverEnabled(true) on a DataSource instance, before retrieving any connections from that instance.

You cannot enable Fast Connection Failover when reinitializing a connection cache. You must enable it before using the OracleDataSource instance.

Example 27-2 illustrates how to enable Fast Connection Failover.

Note:

After a cache is Fast Connection Failover-enabled, you cannot disable Fast Connection Failover during the lifetime of that cache.

To enable Fast Connection Failover, you must:

  • Configure and start ONS. If ONS is not correctly set up, then implicit connection cache creation fails and an ONSException is thrown at the first getConnection request.

  • Set the FastConnectionFailoverEnabled property before making the first getConnection request to an OracleDataSource. When Fast Connection Failover is enabled, the failover applies to all connections in the connection cache. If your application explicitly creates a connection cache using the Connection Cache Manager, then you must first set FastConnectionFailoverEnabled before retrieving any connections.

  • Use a service name rather than an SID when setting the OracleDataSource url property.

Example 27-2 Enabling Fast Connection Failover

// declare datasource
ods.setUrl(
    "jdbc:oracle:oci:@(DESCRIPTION="
  + "(ADDRESS=(PROTOCOL=TCP)(HOST=cluster_alias)(PORT=1521))"
  + "(CONNECT_DATA=(SERVICE_NAME=service_name)))");  // service name, not an SID
ods.setUser("scott");
ods.setConnectionCachingEnabled(true);       // the implicit connection cache is required
ods.setFastConnectionFailoverEnabled(true);  // set before the first getConnection request
ctx.bind("myDS", ods);
ds = (OracleDataSource) ctx.lookup("myDS");
try {
  ds.getConnection();  // transparently creates and accesses cache
} catch (SQLException se) {
  // handle the exception
}
...

Querying Fast Connection Failover Status

An application determines whether Fast Connection Failover is enabled by calling OracleDataSource.getFastConnectionFailoverEnabled, which returns true if failover is enabled, false otherwise.

Understanding Fast Connection Failover

After Fast Connection Failover is enabled, the mechanism is automatic; no application intervention is needed. This section discusses how a connection failover is presented to an application and what steps the application takes to recover.

What The Application Sees

When an Oracle RAC service failure is propagated to the JDBC application, the database has already rolled back the local transaction. The cache manager then cleans up all invalid connections. When an application holding an invalid connection tries to do work through that connection, it receives a SQLException with the error ORA-17008, Closed Connection.

When an application receives a Closed Connection error message, it should:

  1. Retry the connection request. This is essential, because the old connection is no longer open.

  2. Replay the transaction. All work done before the connection was closed has been lost.

Note:

The application should not try to roll back the transaction. The transaction was already rolled back in the database by the time the application received the exception.
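The recovery steps above can be sketched as a small retry loop. This is an illustrative sketch, not Oracle's recommended implementation: the class FailoverRetrySketch and the Transaction interface are hypothetical, and the sketch assumes that the driver reports the Closed Connection error through SQLException.getErrorCode() as 17008. The unit of work passed in is expected to obtain a fresh connection and run the whole transaction, so that a retry both re-requests the connection and replays the work.

```java
import java.sql.SQLException;

// Hypothetical helper for illustration; it is not part of Oracle JDBC.
final class FailoverRetrySketch {
    // Assumption: the Closed Connection error (ORA-17008) surfaces as this error code.
    static final int CLOSED_CONNECTION = 17008;

    // One unit of work: get a connection, run the whole transaction, commit.
    interface Transaction<T> {
        T run() throws SQLException;
    }

    // Runs the work, repeating it from the start after a Closed Connection error.
    // Other SQL exceptions are rethrown immediately.
    static <T> T runWithRetry(Transaction<T> work, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                if (e.getErrorCode() != CLOSED_CONNECTION) {
                    throw e;
                }
                // No rollback here: the database has already rolled back.
                last = e;
            }
        }
        throw last;
    }
}
```

Bounding the number of attempts keeps the application from looping indefinitely if no instance of the service is available. The application can instead rethrow the exception to its caller.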

How It Works

Under Fast Connection Failover, each connection in the cache maintains a mapping to a service, instance, database, and hostname.

When a database generates an Oracle RAC event, that event is forwarded to the JVM in which JDBC is running. A daemon thread inside the JVM receives the Oracle RAC event and passes it on to the Connection Cache Manager. The Connection Cache Manager then throws SQL exceptions to the applications affected by the Oracle RAC event.

A typical failover scenario may work like this:

  1. A database instance fails, leaving several stale connections in the cache.

  2. The Oracle RAC mechanism in the database generates an Oracle RAC event which is sent to the JVM containing JDBC.

  3. The daemon thread inside the JVM finds all the connections affected by the Oracle RAC event, notifies them of the closed connection through SQL exceptions, and rolls back any open transactions.

  4. Each individual connection receives a SQL exception and must retry.
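The scenario above can be pictured with a toy model of the cache. The following sketch is not Oracle's Connection Cache Manager and does not use any Oracle API. The class DownEventSketch, its methods, and the use of plain strings for connections and instances are all assumptions made for illustration. It shows only the bookkeeping idea: each cached connection is mapped to the instance that serves it, and a DOWN event evicts every connection mapped to the failed instance.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; not Oracle's Connection Cache Manager.
final class DownEventSketch {
    // Maps a cached connection id to the instance that serves it.
    private final Map<String, String> instanceByConnection = new LinkedHashMap<>();

    // Records a cached connection and the instance it is connected to.
    void cache(String connectionId, String instance) {
        instanceByConnection.put(connectionId, instance);
    }

    // Handles a DOWN event: evicts every connection mapped to the failed
    // instance and returns the evicted ids in insertion order.
    List<String> onInstanceDown(String instance) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = instanceByConnection.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (entry.getValue().equals(instance)) {
                evicted.add(entry.getKey());
                it.remove();
            }
        }
        return evicted;
    }

    // Number of connections still in the cache.
    int size() {
        return instanceByConnection.size();
    }
}
```

In the real mechanism, the holders of the evicted connections are the ones that later see the Closed Connection exception. Connections to surviving instances are unaffected.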

Comparison of Fast Connection Failover and TAF

Fast Connection Failover differs from Transparent Application Failover (TAF) in the following ways:

  • Application-level connection retries

    Fast Connection Failover supports application-level connection retries. This gives the application control of responding to connection failovers. The application can choose whether to retry the connection or to rethrow the exception. TAF supports connection retries only at the OCI/Net layer.

  • Integration with the implicit connection cache

    Fast Connection Failover is well-integrated with the implicit connection cache, which allows the connection cache manager to manage the cache for high availability. For example, failed connections are automatically invalidated in the cache. TAF works at the network level on a per-connection basis, which means that the connection cache cannot be notified of failures.

  • Event-based

    Fast Connection Failover is based on the Oracle RAC event mechanism. This means that Fast Connection Failover is efficient and detects failures quickly for both active and inactive connections.

  • Load-balancing support

    Fast Connection Failover supports UP event load balancing of connections and run-time work request distribution across active Oracle RAC instances.

Note:

Oracle recommends that you do not use TAF and Fast Connection Failover in the same application.