Oracle® TimesTen In-Memory Database TimesTen to TimesTen Replication Guide Release 11.2.1 Part Number E13072-09 |
|
|
View PDF |
This chapter describes how to administer an active standby pair that does not replicate cache groups.
For information about administering active standby pairs that replicate cache groups, see Chapter 5, "Administering an Active Standby Pair with Cache Groups".
For information about managing failover and recovery automatically, see Chapter 7, "Using Oracle Clusterware to Manage Active Standby Pairs".
This chapter includes the following topics:
This section summarizes the possible states of a master database. These states are referenced in the tasks described in the rest of the chapter.
The master databases can be in one of the following states:
ACTIVE
- A database in this state is the active database. Applications can update its replicated tables.
STANDBY
- A database in this state is the standby database. Applications can update only nonreplicated tables in the standby database. Nonreplicated tables are tables that have been excluded from the replication scheme by using the EXCLUDE TABLE
or EXCLUDE CACHE GROUP
clauses of the CREATE ACTIVE STANDBY PAIR
statement.
FAILED
- A database in this state is a failed master database. No updates can be replicated to it.
IDLE
- A database in this state has not yet had its role in the active standby pair assigned. It cannot be updated. Every database comes up in the IDLE
state.
RECOVERING
- When a previously failed master database is synchronizing updates with the active database, it is in the RECOVERING
state.
You can use the ttRepStateGet
built-in procedure to discover the state of a master database.
When you set up an replication scheme or administer a recovery, a common task is duplicating a database. You can use the ttRepAdmin
-duplicate
utility or the ttRepDuplicateEx
C function to duplicate a database.
To duplicate a database, these conditions must be fulfilled:
The instance administrator performs the duplicate operation.
The instance administrator user name must be the same on both instances involved in the duplication.
You must provide the user name and password for a user with the ADMIN
privilege on the source database.
The target DSN cannot include client/server attributes.
On the source database, create a user and grant the ADMIN
privilege to the user:
CREATE USER ttuser IDENTIFIED BY ttuser; User created. GRANT admin TO ttuser;
Assume the user name of the instance administrator is timesten
. Logged in as timesten
on the target host, duplicate database dsn1
on host1
to dsn2
:
ttRepAdmin -duplicate -from dsn1 -host host1 dsn2 Enter internal UID at the remote datastore with ADMIN privileges: ttuser Enter password of the internal Uid at the remote datastore:
Enter ttuser
when prompted for the password of the internal user at the remote database.
If you are duplicating an active database that has cache groups, use the -keepCG
option. You must also specify the cache administration user ID and password with the -cacheUid
and -cachePwd
options. If you do not provide the cache administration user password, ttRepAdmin
prompts for a password. If the cache administration user ID is orauser
and the password is orapwd
, duplicate database dsn1
on host1
:
ttRepAdmin -duplicate -from dsn1 -host host1 -keepCG "DSN=dsn2;UID=;PWD=" Enter internal UID at the remote datastore with ADMIN privileges: ttuser Enter password of the internal Uid at the remote datastore:
Enter ttuser
when prompted for the password. ttRepAdmin
then prompts for the cache administration user and password:
Enter cache administrator UID: orauser Enter cache administrator password:
Enter orapwd
when prompted for the cache administration password.
The UID
and PWD
for dsn2
are specified as null values in the connection string so that the connection is made as the current OS user, which is the instance administrator. Only the instance administrator can run ttRepAdmin -duplicate
. If dsn2
is configured with PWDCrypt
instead of PWD
, then the connection string should be "DSN=dsn2;UID=;PWDCrypt="
.
When you duplicate a standby database with cache groups to a read-only subscriber, use the -nokeepCG
option. In this example, dsn2
is the standby database and sub1
is the read-only subscriber:
ttRepAdmin -duplicate -from dsn2 -host host2 -nokeepCG "DSN=sub1;UID=;PWD="
ttRepAdmin
prompts for values for -uid
and -pwd
.
For more information about the ttRepAdmin
utility, see "ttRepAdmin" in Oracle TimesTen In-Memory Database Reference. For more information about the ttRepDuplicateEx
C function, see "ttRepDuplicateEx" in Oracle TimesTen In-Memory Database C Developer's Guide.
To set up an active standby pair, complete the tasks in this section. See "Configuring an active standby pair with one subscriber" for an example.
If you intend to replicate read-only cache groups or asynchronous writethrough (AWT) cache groups, see Chapter 5, "Administering an Active Standby Pair with Cache Groups".
Create a database.
Create the replication scheme using the CREATE ACTIVE STANDBY PAIR
statement. See Chapter 3, "Defining an Active Standby Pair Replication Scheme".
Start the replication agent. See "Starting and stopping the replication agents".
Call ttRepStateSet
('ACTIVE')
on the active database.
Create a user on the active database and grant the ADMIN
privilege to the user.
Duplicate the active database to the standby database.
Start the replication agent on the standby database. See "Starting and stopping the replication agents".
Wait for the standby database to enter the STANDBY
state. Use the ttRepStateGet
procedure to check the state of the standby database.
Duplicate all of the subscribers from the standby database. See "Copying a master database to a subscriber".
Set up the replication agent policy and start the replication agent on each of the subscriber databases. See "Starting and stopping the replication agents".
This section includes the following topics:
This section describes how to recover the active database when the standby database is available and synchronized with the active database. It includes the following topics:
Complete the following tasks:
Stop the replication agent on the failed database if it has not already been stopped.
On the standby database, execute ttRepStateSet
('ACTIVE')
. This changes the role of the database from STANDBY
to ACTIVE
.
On the new active database, execute ttRepStateSave
('FAILED', '
failed_database
','
host_name
')
, where failed_database
is the former active database that failed. This step is necessary for the new active database to replicate directly to the subscriber databases. During normal operation, only the standby database replicates to the subscribers.
Destroy the failed database.
Duplicate the new active database to the new standby database.
Set up the replication agent policy and start the replication agent. See "Starting and stopping the replication agents".
The standby database contacts the active database. The active database stops sending updates to the subscribers. When the standby database is fully synchronized with the active database, then the standby database enters the STANDBY
state and starts sending updates to the subscribers. If you are replicating an AWT cache group, the new standby database takes over processing of the cache group automatically when it enters the STANDBY
state.
Note:
You can verify that the standby database has entered theSTANDBY
state by using the ttRepStateGet
built-in procedure.Complete the following tasks:
On the standby database, execute ttRepStateSet
('ACTIVE')
. This changes the role of the database from STANDBY
to ACTIVE
.
On the new active database, execute ttRepStateSave
('FAILED', '
failed_database
','
host_name
')
, where failed_database
is the former active database that failed. This step is necessary for the new active database to replicate directly to the subscriber databases. During normal operation, only the standby database replicates to the subscribers.
Connect to the failed database. This triggers recovery from the local transaction logs. If database recovery fails, you must continue from Step 5 of the procedure for recovering when replication is return receipt or asynchronous. See "When replication is return receipt or asynchronous".
Verify that the replication agent for the failed database has restarted. If it has not restarted, then start the replication agent. See "Starting and stopping the replication agents".
When the active database determines that it is fully synchronized with the standby database, then the standby database enters the STANDBY
state and starts sending updates to the subscribers. If you are replicating an AWT cache group, the new standby database takes over processing of the cache group automatically when it enters the STANDBY
state.
Note:
You can verify that the standby database has entered theSTANDBY
state by using the ttRepStateSet
built-in procedure.Consider the following scenarios:
The standby database fails. The active database fails before the standby comes back up or before the standby has been synchronized with the active database.
The active database fails. The standby database becomes ACTIVE
, and the rest of the recovery process begins. (See "Recovering from a failure of the active database".) The new active database fails before the new standby database is fully synchronized with it.
In both scenarios, the subscribers may have had more changes applied than the standby database.
When the active database fails and the standby database has not applied all of the changes that were last sent from the active database, there are two choices for recovery:
Recover the active database from the local transaction logs.
Recover the standby database from the local transaction logs.
The choice depends on which database is available and which is more up to date.
Connect to the failed active database. This triggers recovery from the local transaction logs.
Verify that the replication agent for the failed active database has restarted. If it has not restarted, then start the replication agent. See "Starting and stopping the replication agents".
Execute ttRepStateSet
('ACTIVE')
on the newly recovered database.
Continue with Step 6 in "Setting up an active standby pair with no cache groups".
Connect to the failed standby database. This triggers recovery from the local transaction logs.
If the replication agent for the standby database has automatically restarted, you must stop the replication agent. See "Starting and stopping the replication agents".
Drop the replication configuration using the DROP ACTIVE STANDBY PAIR
statement.
Re-create the replication configuration using the CREATE ACTIVE STANDBY PAIR
statement.
Set up the replication agent policy and start the replication agent. See "Starting and stopping the replication agents".
Execute ttRepStateSet
('ACTIVE')
on the master database, giving it the ACTIVE
role.
Continue from Step 6 in "Setting up an active standby pair with no cache groups".
After a successful failover, you may wish to fail back so that the active database and the standby database are on their original nodes. See "Reversing the roles of the active and standby databases" for instructions.
To recover from a failure of the standby database, complete the following tasks:
Detect the standby database failure.
If return twosafe service is enabled, the failure of the standby database may prevent a transaction in progress from being committed on the active database, resulting in error 8170, "Receipt or commit acknowledgement not returned in the specified timeout interval". If so, then call the ttRepSyncSet
procedure with a localAction
parameter of 2
(COMMIT
) and commit the transaction again. For example:
call ttRepSyncSet( null, null, 2); commit;
Execute ttRepStateSave
('FAILED','
standby_database
','
host_name
')
on the active database. After this, as long as the standby database is unavailable, updates to the active database are replicated directly to the subscriber databases. Subscriber databases may also be duplicated directly from the active.
If the replication agent for the standby database has automatically restarted, stop the replication agent. See "Starting and stopping the replication agents".
Recover the standby database in one of the following ways:
Connect to the standby database. This triggers recovery from the local transaction logs.
Duplicate the standby database from the active database.
The amount of time that the standby database has been down and the amount of transaction logs that need to be applied from the active database determine the method of recovery that you should use.
Set up the replication agent policy and start the replication agent. See "Starting and stopping the replication agents".
The standby database enters the STANDBY
state after the active database determines that the two master databases have been synchronized.
Note:
You can verify that the standby database has entered theSTANDBY
state by using the ttRepStateGet
procedure.If a subscriber database fails, then you can recover it by one of the following methods:
Connect to the failed subscriber. This triggers recovery from the local transaction logs. Start the replication agent and let the subscriber catch up.
Duplicate the subscriber from the standby database.
If the standby database is down or in recovery, then duplicate the subscriber from the active database.
After the subscriber database has been recovered, then set up the replication agent policy and start the replication agent. See "Starting and stopping the replication agents".
To change the role of the active database to standby and vice versa:
Pause any applications that are generating updates on the current active database.
Execute ttRepSubscriberWait
on the active database, with the DSN and host of the current standby database as input parameters. This ensures that all updates have been transmitted to the current standby database.
Stop the replication agent on the current active database. See "Starting and stopping the replication agents".
Execute ttRepDeactivate
on the current active database. This puts the database in the IDLE
state.
Execute ttRepStateSet
('ACTIVE')
on the current standby database. This database now acts as the active database in the active standby pair.
Set up the replication agent policy and start the replication agent on the old active database. Use the ttRepStateGet
procedure to determine when the database's state has changed from IDLE
to STANDBY
. The database now acts as the standby database in the active standby pair.
Resume any applications that were paused in Step 1.
Ordinarily, the designation of the active and standby databases in an active standby pair is explicitly controlled by the user. However, in some circumstances the user may not have the ability to modify both the active and standby databases when changing the role of the standby database to active.
For example, if network communication to the site of an active database is interrupted, the user may need the standby database at a different site to take over the role of the active, but cannot stop replication on the current active or change its role manually. Changing the standby database to active without first stopping replication on the active leads to a situation where both masters are in the ACTIVE
state and accepting transactions. In such a scenario, TimesTen can automatically negotiate the active/standby role of the master databases when network communication between the databases is restored.
If, during the initial handshake between the databases, TimesTen determines that the master databases in an active standby pair replication scheme are both in the ACTIVE
state, TimesTen performs the following operations automatically:
The database which was set to the ACTIVE
state most recently is left in the ACTIVE
state and may continue to be connected to and updated by applications.
The database which was set to the ACTIVE
state least recently is invalidated. All applications are disconnected.
When the invalidated database comes back up, TimesTen determines whether any transactions have occurred on the database that have not yet been replicated to the other master database. If such transactions have occurred, they are now trapped, and the database is left in the IDLE
state. The database needs to be duplicated from the active in order to become a standby. If there are no trapped transactions, the database is safe to use as a standby database and is automatically set to the STANDBY
state.