Oracle® High Availability Architecture and Best Practices
10g Release 1 (10.1)

Part Number B10726-01

10
Detailed Recovery Steps

This chapter describes the detailed recovery operations that are referred to in the outages and solutions tables in Chapter 9, "Recovering from Outages".

Summary of Recovery Operations

Table 10-1 summarizes the recovery operations that are described in this chapter.

Table 10-1 Recovery Operations

Complete or Partial Site Failover

In the complete site failover scenario, existing connections fail and new connections are routed to a secondary or failover site. This occurs when there is a true disaster and where the application stack is replicated.

In the partial site failover scenario, the primary site is intact, and the middle-tier applications need to be redirected after the database has failed over or switched over to a standby database on the secondary site. This configuration is not recommended if performance decreases dramatically because of the greater latency between the application servers and the database.

Database Failover

A Data Guard failover is invoked in the database tier because of an unscheduled outage. If complete recovery is attempted, then there is minimal or no data loss. A subsequent complete or partial site failover must occur. The previous production database can be converted to a standby database by flashing back the database.

Database Switchover

A Data Guard switchover occurs in the database tier. The previous production database becomes the new standby database while the previous standby database becomes the new production database. A subsequent complete or partial site failover must occur. This is a scheduled or planned outage.

RAC Recovery

RAC automatically handles instance and node failures at a given site to provide continued access to the backend database. The reconnection to available instances can be transparent from an application standpoint and occurs only on the primary site. The application service migrates to the available or designated instances.

Apply Instance Failover

When a standby node requires maintenance, you can switch to another standby instance to avoid any impact on the production database and to ensure that the standby recovery does not fall behind. When the standby cluster requires maintenance or the standby cluster fails, if the maximum protection mode is enabled, then the production database needs to be downgraded to maximum availability mode or maximum performance mode.

Application Failover

The client or application tier automatically fails over to one or more surviving RAC instances when an instance or node failure occurs at the primary site.

See Also: "Recommendations for Fast Application Failover"

Recovery Solutions for Data Failures

When data failure occurs due to media corruption or media damage, you can use different recovery options that use the flash recovery area or switch over to the standby database. Alternatively, you can rebuild tables by fast imports or rebuild indexes in parallel.

Recovering from User Error with Flashback Technology

When a user error causes transactional or logical data inconsistencies, you can resolve these problems by using flashback error correction technology at the row, transaction, table, or database levels.

RAC Rolling Upgrade

With RAC, you can apply patches for specific customer issues incrementally to one node or instance at a time, which enables continual application and database availability.

Upgrade with Logical Standby Database

Data Guard enables you to upgrade the database on the standby database and perform a switchover, which minimizes the overall scheduled outage time for applying patch sets or software upgrades for the database.

Online Object Reorganization

Many scheduled outages related to the data server involve some reorganization of the database objects, which must be accomplished while the database remains available. Oracle online object reorganization is used to manage these scheduled outages.

Complete or Partial Site Failover

This section describes the following client failover scenarios:

In the complete site failover scenario, existing connections fail and new connections are routed to a secondary or failover site. This occurs when there is a true disaster and where the application stack is replicated.

In the partial site failover scenario, the primary site is intact, and the middle-tier applications need to be redirected after the database has failed over or switched over to a standby database on the secondary site. This configuration is not recommended if performance decreases significantly because of the greater latency between the application servers and the database.

Complete Site Failover

A wide-area traffic manager is implemented on the primary and secondary sites to provide the site failover function. The wide-area traffic manager can redirect traffic automatically if the primary site or a specific application on the primary site is not accessible. It can also be triggered manually to switch to the secondary site for switchovers. Traffic is directed to the secondary site only when the primary site cannot provide service due to an outage or after a switchover. If the primary site fails, then user traffic is directed to the secondary site.

Figure 10-1 illustrates the network routes before site failover. Client requests enter the client tier of the primary site and travel through the WAN traffic manager. Client requests are sent through the firewall into the demilitarized zone (DMZ) to the application server tier. Requests are then forwarded through the active load balancer to the application servers. They are then sent through another firewall into the database server tier. The application requests, if required, are routed to a RAC instance. Responses are sent back to the application and clients by a similar path.

Figure 10-1 Network Routes Before Site Failover


Figure 10-2 illustrates the network routes after site failover. Client or application requests enter the secondary site at the client tier and follow exactly the same path on the secondary site that they followed on the primary site.

Figure 10-2 Network Routes After Site Failover


The following steps describe what happens to network traffic during a failover or switchover.

  1. The administrator has failed over or switched over the production database to the secondary site.
  2. The administrator starts the middle-tier application servers on the secondary site.
  3. Typically, a Domain Name Service (DNS) administrator changes the wide-area traffic manager selection to the secondary site. Alternatively, the selection can be made automatically for an entire site failure. The wide-area traffic manager at the secondary site returns the virtual IP address of a load balancer at the secondary site. In this scenario, the site failover is accomplished by a DNS failover. The following is an example of a manual DNS failover:
    1. Change the DNS to point to the secondary site load balancer.
    2. Set TTL (Time to Live) to a short interval for the DNS propagation.
    3. Disable DNS on the primary site.
    4. Execute a DNS "push" to point to the secondary site.
    5. Wait until all failover operations are complete.
    6. Change TTL back to its normal setting on the DNS server.
    7. The secondary site load balancer directs traffic to the secondary site middle-tier application server.
    8. The secondary site is ready to take client requests.

Failover also depends on the client's web browser. Most browser applications cache the DNS entry for a period of time. Consequently, sessions in progress during an outage may not fail over until the cache timeout expires. The only way to resume service to such clients is to close the browser and restart it.
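The TTL effect described above can be sketched with a toy caching resolver (a simplified model, not Oracle or DNS server code; the hostname, addresses, and TTL values are hypothetical). Clients that cached the primary-site answer keep using it until their copy of the TTL expires, which is why shortening the TTL before a planned switchover speeds up client redirection:

```python
import time

class CachingResolver:
    """Toy stub resolver: honors the TTL returned with each answer."""
    def __init__(self, backend):
        self.backend = backend          # callable: name -> (ip, ttl_seconds)
        self.cache = {}                 # name -> (ip, expires_at)

    def resolve(self, name, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(name)
        if hit and now < hit[1]:
            return hit[0]               # cached answer is still live
        ip, ttl = self.backend(name)
        self.cache[name] = (ip, now + ttl)
        return ip

# The authoritative answer changes at failover, but a client that cached
# the old record keeps the old IP until the TTL it was handed expires.
answers = {"app.example.com": ("10.0.0.1", 300)}    # primary site, 5-minute TTL
resolver = CachingResolver(lambda n: answers[n])

ip_before = resolver.resolve("app.example.com", now=0)
answers["app.example.com"] = ("10.1.0.1", 30)       # DNS now points at secondary
ip_cached = resolver.resolve("app.example.com", now=100)   # still inside old TTL
ip_after  = resolver.resolve("app.example.com", now=301)   # old TTL has expired
```

The same reasoning applies to browser DNS caches: a session started before the outage resolves to the primary-site address until its cached entry times out.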

Partial Site Failover: Middle-Tier Applications Connect to a Remote Database Server

This usually occurs after the database has been failed over or switched over to the secondary site and the middle-tier applications remain on the primary site. The following steps describe what happens to network traffic during a partial site failover:

  1. The production database is failed over or switched over to the secondary site.
  2. The middle-tier application servers reconnect to the database on the secondary site using configuration best practices described in "Recommendations for Fast Application Failover".

Figure 10-3 shows the network routes after partial site failover. Client and application requests enter the primary site at the client tier and follow the same path to the database server tier as in Figure 10-1. When the requests enter the database server tier, they are routed to the database tier of the secondary site through any additional switches, routers, and possible firewalls.
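The middle-tier reconnection in step 2 behaves like connect-time failover over an ordered address list: the client walks the list and takes the first listener that answers. The following is a minimal simulation (hypothetical host names; not Oracle Net code):

```python
def connect_with_failover(address_list, try_connect):
    """Walk an ordered address list and return the first reachable
    endpoint, as a connect-time failover descriptor would."""
    for addr in address_list:
        if try_connect(addr):
            return addr
    raise ConnectionError("no address in the list is reachable")

# Hypothetical endpoints: primary-site listener first, secondary second.
addresses = ["db-primary:1521", "db-secondary:1521"]
reachable = {"db-secondary:1521"}       # primary is down after the failover

chosen = connect_with_failover(addresses, lambda a: a in reachable)
```

With the primary unreachable, the middle tier transparently lands on the secondary-site database, at the cost of the added WAN latency discussed above.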

Figure 10-3 Network Routes After Partial Site Failover


Database Failover

Failover is the operation of taking the production database offline on one site and bringing one of the standby databases online as the new production database. A failover operation can be invoked when an unplanned catastrophic failure occurs on the production database, and there is no possibility of recovering the production database in a timely manner.

Data Guard enables you to fail over by issuing the SQL statements described in subsequent sections, by using Oracle Enterprise Manager, or by using the Oracle Data Guard broker command-line interface.

See Also:

Oracle Data Guard Broker for information about using Enterprise Manager or the Data Guard broker command-line for database failover

Data Guard failover is a series of steps to convert a standby database into a production database. The standby database essentially assumes the role of production. A Data Guard failover is accompanied by a site failover to fail over the users to the new site and database. After the failover, the secondary site contains the production database. The former production database needs to be re-created as a new standby database to restore resiliency. The standby database can be quickly re-created by using Flashback Database. See "Restoring the Standby Database After a Failover".

During a failover operation, little or no data loss may be experienced. The complete description of a failover can be found in Oracle Data Guard Concepts and Administration.


When to Use Data Guard Failover

Data Guard failover should be used only in the case of an emergency and should be initiated in response to an unplanned outage.

A failover requires that the initial production database be re-created as a standby database to restore fault tolerance to your environment. The standby database can be quickly re-created by using Flashback Database. See "Restoring the Standby Database After a Failover".

When Not to Use Data Guard Failover

Do not use Data Guard failover when the problem can be fixed locally in a timely manner or when Data Guard switchover can be used. For failover with complete recovery scenarios, either the production database is not accessible or cannot be restarted. Data Guard failover should not be used where object recovery or flashback technology solutions provide a faster and more efficient alternative.

Data Guard Failover Using SQL*Plus


Physical Standby Failover Using SQL*Plus

  1. Check for archive gaps.
        SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# 
        FROM V$ARCHIVE_GAP;
    
    See Also:

    Oracle Data Guard Concepts and Administration for more information about what to do if a gap exists

  2. Shut down other standby instances.
  3. Finish recovery.
        ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
    
    
  4. Check the database state. Query switchover status readiness by executing the following statement:
        SELECT SWITCHOVER_STATUS FROM V$DATABASE;
    
    
  5. Convert the physical standby database to the production role.
        ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
    
    
  6. Restart instance.

Logical Standby Failover Using SQL*Plus

  1. Stop the current SQL Apply session.
        ALTER DATABASE STOP LOGICAL STANDBY APPLY;
    
    
  2. If there are additional logs to be registered (for example, if you can still access the primary database or you are using LGWR to the standby destination), then register the log files.
        ALTER DATABASE REGISTER LOGICAL LOGFILE 'file_name';
    
    
  3. Start the SQL Apply session using the NODELAY and FINISH clauses.
        ALTER DATABASE START LOGICAL STANDBY APPLY NODELAY FINISH;
    
    
  4. Activate the logical standby database.
        ALTER DATABASE ACTIVATE LOGICAL STANDBY DATABASE;
    
    

Failover has completed, and the new production database is available to process transactions.

Database Switchover

A database switchover performed by Oracle Data Guard is a planned transition that includes a series of steps to switch roles between a standby database and a production database. Thus, following a successful switchover operation, the standby database assumes the production role and the production database becomes a standby database. In a RAC environment, a switchover requires that only one instance is active for each database, production and standby. At times the term "switchback" is also used within the scope of database role management. A switchback operation is a subsequent switchover operation to return the roles to their original state.

Data Guard enables you to change these roles dynamically by issuing the SQL statements described in subsequent sections, or by using Oracle Enterprise Manager, or by using the Oracle Data Guard broker command-line interface. Using Oracle Enterprise Manager or the Oracle Data Guard broker command-line interface is described in Oracle Data Guard Broker.


When to Use Data Guard Switchover

Switchover is a planned operation. Switchover is the capability to switch database roles between the production and standby databases without needing to instantiate any of the databases. Switchover can occur whenever a production database is started, the target standby database is available, and all the archived redo logs are available.

When Not to Use Data Guard Switchover

Switchover is not possible or practical in some circumstances.

Do not use Data Guard switchover when local recovery solutions provide a faster and more efficient alternative. The complete description of a switchover can be found in Oracle Data Guard Concepts and Administration.

Data Guard Switchover Using SQL*Plus

If you are not using Oracle Enterprise Manager, then the high-level steps in this section can be executed with SQL*Plus. These steps are described in detail in Oracle Data Guard Concepts and Administration.


Physical Standby Switchover Using SQL*Plus

  1. Shut down all production and standby instances except one for each site.
  2. Stop active sessions on the remaining active production instance.

    To identify active sessions, execute the following query:

        SELECT SID, PROCESS, PROGRAM 
        FROM V$SESSION 
        WHERE TYPE = 'USER' 
        AND SID <> (SELECT DISTINCT SID FROM V$MYSTAT);
    
    
  3. Check that the switchover status on the production database is 'TO STANDBY'.
        SELECT SWITCHOVER_STATUS FROM V$DATABASE;
    
    
  4. Switch over the current production database to the standby database.
        ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY [WITH SESSION SHUTDOWN];
    
    
  5. Start the new standby database.
        STARTUP MOUNT;
        RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
    
    
  6. Convert the former standby database to a production database.
        ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY [WITH SESSION SHUTDOWN];
    
    
  7. Restart all instances.

Logical Standby Switchover Using SQL*Plus

  1. Prepare the production database to become the logical standby database.
        ALTER DATABASE PREPARE TO SWITCHOVER TO LOGICAL STANDBY;
    
    
  2. Prepare the logical standby database to become the production database.

    Following this step, logs start to ship in both directions, although the current production database does not process the logs coming from the current logical standby database.

        ALTER DATABASE PREPARE TO SWITCHOVER TO PRIMARY;
    
    
  3. Commit the production database to become the logical standby database.

    This is the phase where current transactions on the production database are cancelled. All DML-related cursors are invalidated, preventing new records from being applied. The end of redo (EOR) marker is recorded in the online redo log and then shipped (immediately if using real-time apply) to the logical standby database and registered.

        ALTER DATABASE COMMIT TO SWITCHOVER TO LOGICAL STANDBY;
    
    
  4. Commit the logical standby database to become the production database.
        ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
    
    
  5. Start the logical standby apply engine on the new logical standby database.

    If real-time apply is required, execute the following statement:

        ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
    
    

    Otherwise execute the following statement:

        ALTER DATABASE START LOGICAL STANDBY APPLY;
    

RAC Recovery


RAC Recovery for Unscheduled Outages


Automatic Instance Recovery for Failed Instances

Instance failure occurs when software or hardware problems disable an instance. After instance failure, Oracle automatically uses the online redo log file to perform database recovery as described in this section.

Single Node Failure in Real Application Clusters

Instance recovery in RAC does not include restarting the failed instance or the recovery of applications that were running on the failed instance. Applications that were running continue by using failure recognition and recovery as described in Oracle Real Application Clusters Installation and Configuration Guide. This provides consistent and uninterrupted service in the event of hardware or software failures. When one instance performs recovery for another instance, the surviving instance reads redo log entries generated by the failed instance and uses that information to ensure that committed transactions are recorded in the database. Thus, data from committed transactions is not lost. The instance that is performing recovery rolls back uncommitted transactions that were active at the time of the failure and releases resources used by those transactions.

Multiple Node Failures in Real Application Clusters

When multiple node failures occur, as long as one instance survives, RAC performs instance recovery for any other instances that fail. If all instances of a RAC database fail, then Oracle automatically recovers the instances the next time one instance opens the database. The instance that is performing recovery can mount the database in either shared or exclusive mode from any node of a RAC database. This recovery procedure is the same for Oracle running in shared mode as it is for Oracle running in exclusive mode, except that one instance performs instance recovery for all the failed instances in exclusive mode.

Automatic Service Relocation

Service reliability is achieved by configuring and failing over among redundant instances. More instances are enabled to provide a service than would otherwise be needed. If a hardware failure occurs and adversely affects a RAC database instance, then RAC automatically moves any services on that instance to another available instance. Then Cluster Ready Services (CRS) attempts to restart the failed nodes and instances.

An installation can specify the "preferred" and "available" configuration for each service. This configuration describes the preferred way to run the system, and is used when the service first starts up. For example, the ERP service runs on instance1 and instance2, and the HR service runs on instance3 when the system first starts. instance2 is available to run HR in the event of a failure or planned outage, and instance3 and instance4 are available to run ERP. The service configuration can be designed several ways.
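The preferred/available model in the example above can be sketched as a small placement function (a simplified model of service placement, not CRS code; the service and instance names come from the example):

```python
# Service placement model: each service lists "preferred" instances
# (used at startup) and "available" instances (failover targets).
services = {
    "ERP": {"preferred": ["instance1", "instance2"],
            "available": ["instance3", "instance4"]},
    "HR":  {"preferred": ["instance3"],
            "available": ["instance2"]},
}

def place(service, up_instances):
    """Return the instances that should run the service right now."""
    cfg = services[service]
    running = [i for i in cfg["preferred"] if i in up_instances]
    if running:
        return running
    # No preferred instance survives: relocate to an available instance.
    return [i for i in cfg["available"] if i in up_instances][:1]

all_up = {"instance1", "instance2", "instance3", "instance4"}
hr_normal = place("HR", all_up)                     # runs on its preferred instance
hr_failed = place("HR", all_up - {"instance3"})     # relocated after instance3 fails
```

When instance3 fails, HR moves to instance2; when instance3 is restored and reintegrated, the service can be relocated back to its preferred instance.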

RAC recognizes when a failure affects a service and automatically fails over the service, redistributing the clients across the surviving instances that support the service. In parallel, CRS attempts to restart and integrate the failed instances and dependent resources back into the system. Notification of failures occurs at various levels, including notifying external parties through Enterprise Manager and callouts, recording the fault for tracking, event logging, and interrupting applications. Notification occurs from a surviving fault domain when the failed domain is out of service. The location and number of fault domains serving a service is transparent to the applications. Restart and recovery are automatic, including all the subsystems, not just the database.

RAC Recovery for Scheduled Outages

This section includes the following topics:

Disabling CRS-Managed Resources

When an outage occurs, RAC automatically restarts essential components. Components that are eligible for automatic restart include instances, listeners, and the database, as well as several subcomponents. Some scheduled administrative tasks require that you prevent components from automatically restarting. To perform scheduled maintenance that requires a CRS-managed component to be down during the operation, the resource must be disabled to prevent CRS from trying to restart the component automatically. For example, to take a node and all of its instances and services offline for maintenance purposes, disable the instance and its services using either Enterprise Manager or SRVCTL, and then perform the required maintenance. Otherwise, if the node fails and then restarts, CRS attempts to restart the instance during the administrative operation.

Planned Service Relocation

For a scheduled outage that requires an instance, node, or other component to be isolated, RAC provides the ability to relocate, disable, and enable services. Relocation migrates the service to another instance. The sessions can also be relocated. These interfaces also allow services, instances and databases to be selectively disabled while a repair, change, or upgrade is made and re-enabled after the change is complete. This ensures that the service is not started at the instance being repaired because of a dependency or a start operation on the service. The service is disabled on the instance at the beginning of the planned outage. It is then enabled at the end of the maintenance outage.

For example, to relocate the SALES service from instance1 to instance3 in order to perform scheduled maintenance on node1, the tasks can be performed using Enterprise Manager or SRVCTL commands. The following shows how to use SRVCTL commands:

  1. Relocate the SALES service to instance3.
        srvctl relocate service -d PROD -s SALES -i instance1 -t instance3
    
    
  2. Disable the SALES service on instance1 to prevent it from being relocated to instance1 while maintenance is performed.
        srvctl disable service -d PROD -s SALES -i instance1
    
    
  3. Stop instance1.
        srvctl stop instance -d PROD -i instance1
    
    
  4. Perform the scheduled maintenance.
  5. Start instance1.
        srvctl start instance -d PROD -i instance1
    
    
  6. Re-enable the SALES service on instance1.
        srvctl enable service -d PROD -s SALES -i instance1
    
    
  7. If desired, relocate the SALES service running on instance3 back to instance1.
        srvctl relocate service -d PROD -s SALES -i instance3 -t instance1
    
    See Also:

    Oracle Real Application Clusters Administrator's Guide

Apply Instance Failover

This section applies to MAA, with RAC and Data Guard on each site.

A standby database can have multiple standby instances. Only one instance can have the managed recovery process (MRP) or the logical standby apply process (LSP). The instance with the MRP or LSP is called the apply instance.

When you have a RAC-enabled standby database, you can fail over the apply instance of the standby RAC environment. Failing over to another apply instance may be necessary when incurring a planned or unplanned outage that affects the apply instance or node. Note the difference between apply instance failover, which utilizes multiple instances of the standby database at the secondary site, and Data Guard failover or Data Guard switchover, which converts the standby database into a production database.

For apply instance failover to work correctly, the configuration best practices described in "Configuration Best Practices for MAA" must be followed.

When you follow these configuration recommendations, apply instance failover is automatic for a scheduled or unscheduled outage on the primary instance, and all standby instances have access to archived redo logs. By definition, all RAC standby instances already have access to standby redo logs because they must reside on shared storage.

The method of restarting the physical standby managed recovery process (MRP) or the logical standby apply process (LSP) depends on whether Data Guard Broker is being used. If the Data Guard Broker is in use, then the MRP or LSP is automatically restarted on the first available standby instance if the primary standby instance fails. If the Data Guard Broker is not being used, then the MRP or LSP must be manually restarted on the new standby instance. Consider using a shared file system, such as a clustered file system or a global file system, for the archived redo logs. A shared file system enables you to avoid reshipment of any unapplied archived redo logs that were already shipped to the standby.
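The broker's behavior of restarting apply on the first surviving standby instance can be sketched as a simple selection function (an illustrative model, not broker code; instance names are hypothetical):

```python
def pick_apply_instance(instances, failed):
    """After the apply instance fails, restart MRP/LSP on the first
    surviving standby instance (what the broker does automatically)."""
    survivors = [i for i in instances if i not in failed]
    if not survivors:
        raise RuntimeError("no standby instance available to run apply")
    return survivors[0]

# stby1 was the apply instance; it just failed.
standby_instances = ["stby1", "stby2", "stby3"]
new_apply = pick_apply_instance(standby_instances, failed={"stby1"})
```

Without the broker, this choice and the MRP/LSP restart on the chosen instance are performed manually, as described in the next section.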

See Also:

Oracle Data Guard Concepts and Administration for details about setting up cross-instance archiving

Performing an Apply Instance Failover Using SQL*Plus

If apply instance failover does not happen automatically, then follow these steps to restart your production database, if necessary, and restart MRP or LSP following an unscheduled apply instance or node outage:

Step 1: Ensure That the Chosen Standby Instance is Mounted

From the targeted standby instance, run the following query.

SELECT OPEN_MODE, DATABASE_ROLE FROM V$DATABASE;
Type of Standby Database    Output for Mounted Standby Database    If Not Mounted, Choose a Different Target or Open Manually

Physical                    MOUNTED, PHYSICAL STANDBY              STARTUP MOUNT

Logical                     READ WRITE, LOGICAL STANDBY            STARTUP

Step 2: Verify Oracle Net Connection to the Chosen Standby Host

  1. Ensure that the standby listener is started.
        % lsnrctl status listener_name
    
    
  2. Validate the Oracle Net alias from all the production hosts in the cluster.
        % tnsping standby_database_connection_service_name
    
    

If the connection cannot be made, then consult Oracle Net Services Administrator's Guide for further troubleshooting.

Step 3: Start Recovery on the Chosen Standby Instance

Use the following statements for a physical standby database:

RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

Use the following statements for a logical standby database:

ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

Step 4: Copy Archived Redo Logs to the New Apply Host

Optionally, copy the archived redo logs to the new apply host.

The copy is not necessary for a physical standby database. For a physical standby database, when the managed recovery process detects an archive gap, it requests the production archived redo logs to be resent automatically.

For a logical standby database, archived redo logs that have already been registered but not yet applied are not resent automatically to the logical standby database. These archived redo logs must be sent manually to the same directory structure in the new apply host. You can identify the registered unapplied archived redo logs by executing a statement similar to the following:

SELECT LL.FILE_NAME, LL.THREAD#, LL.SEQUENCE#, LL.FIRST_CHANGE#, LL.NEXT_CHANGE#, 
LP.APPLIED_SCN, LP.READ_SCN
  FROM DBA_LOGSTDBY_LOG LL, DBA_LOGSTDBY_PROGRESS LP
  WHERE LEAST(LP.APPLIED_SCN, LP.READ_SCN) <= LL.NEXT_CHANGE#;

Compare the results of the statement to the contents of the STANDBY_ARCHIVE_DEST directory.
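That comparison amounts to a set difference between the registered unapplied logs reported by the query and the files already present in STANDBY_ARCHIVE_DEST on the new apply host. A minimal sketch (the file names are purely illustrative):

```python
# Logs the query reported as registered but not yet applied,
# versus what is already on disk on the new apply host.
registered_unapplied = {"arc_1_101.arc", "arc_1_102.arc", "arc_2_57.arc"}
present_on_new_host  = {"arc_1_101.arc"}

# Whatever is missing must be copied manually into the same
# directory structure on the new apply host.
missing = sorted(registered_unapplied - present_on_new_host)
```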

See Also:

"Oracle9i Data Guard: SQL Apply Best Practices" at http://otn.oracle.com/deploy/availability/htdocs/maa.htm

Step 5: Verify the New Configuration

  1. Verify that archived redo logs are being sent to the new apply host.

    Query V$ARCHIVE_DEST and V$ARCHIVE_DEST_STATUS.

        SELECT NAME_SPACE, STATUS, TARGET, LOG_SEQUENCE,
        TYPE, PROCESS, REGISTER, ERROR FROM V$ARCHIVE_DEST
        WHERE STATUS != 'INACTIVE';
    
        SELECT * FROM V$ARCHIVE_DEST_STATUS WHERE STATUS!='INACTIVE';
    
    
  2. Verify that the managed recovery or logical apply on the new apply host is progressing.

    Issue the following queries to ensure that the sequence number is advancing over time.

    Use the following statements for a physical standby database:

         SELECT MAX(SEQUENCE#), THREAD# FROM V$LOG_HISTORY GROUP BY THREAD#;
         SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, CLIENT_PROCESS FROM V$MANAGED_STANDBY;
    
    

    Use the following statements for a logical standby database:

        SELECT MAX(SEQUENCE#), THREAD# FROM DBA_LOGSTDBY_LOG GROUP BY THREAD#;
        SELECT APPLIED_SCN FROM DBA_LOGSTDBY_PROGRESS;
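"Advancing over time" means running the query above at intervals and confirming that each thread's maximum sequence number grows between polls. A small sketch of that check (the sequence numbers are hypothetical sample values):

```python
# Poll the max applied sequence per thread twice and confirm recovery
# is making progress; stands in for re-running the MAX(SEQUENCE#) query.
def is_advancing(samples):
    """samples: successive {thread#: max sequence#} snapshots per poll."""
    return all(later[t] > earlier[t]
               for earlier, later in zip(samples, samples[1:])
               for t in earlier)

polls = [
    {1: 482, 2: 297},   # first query of V$LOG_HISTORY / DBA_LOGSTDBY_LOG
    {1: 485, 2: 301},   # a few minutes later: both threads moved forward
]
progressing = is_advancing(polls)
```

If any thread's sequence number stalls between polls, investigate log transport and the apply process before concluding the new configuration is healthy.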
    

Recovery Solutions for Data Failures

Recovering from a data failure is an unscheduled outage scenario. A data failure is usually, but not always, caused by some activity or failure that occurs outside the database, even though the problem may be evident within the database.

Data failure can affect many types of database objects.

Data failure can be categorized as either datafile block corruption or media failure.

In all environments, you can resolve a data failure outage by using local recovery methods, such as recovery from the flash recovery area or rebuilding the affected objects.

In a Data Guard environment, you can also use a Data Guard switchover or failover to a standby database to recover from data failures.

Another category of related outages that result in database objects becoming unavailable or inconsistent are caused by user error, such as dropping a table or erroneously updating table data. Information about recovering from user error can be found in "Recovering from User Error with Flashback Technology".


Detecting and Recovering From Datafile Block Corruption

A corrupt datafile block can be accessed, but the contents within the block are invalid or inconsistent. The typical cause of datafile corruption is a faulty hardware or software component in the I/O stack, which includes, but is not limited to, the file system, volume manager, device driver, host bus adapter, storage controller, and disk drive.

The database usually remains available when corrupt blocks have been detected, but some corrupt blocks may cause widespread problems, such as corruption in a file header or with a data dictionary object, or corruption in a critical table that renders an application unusable.

The rest of this section includes the following topics:

  • Detecting Datafile Block Corruption
  • Recovering From Datafile Block Corruption

Detecting Datafile Block Corruption

A data failure is detected when it is recognized by the user, administrator, RMAN backup, or application because it has affected the availability of the application. For example:

Regularly monitor application logs (which may be distributed across the data server, middle-tier, and client machines), the alert log, and Oracle trace files for errors such as ORA-01578 and ORA-01110:

ORA-01578: ORACLE data block corrupted (file # 4, block # 26)
ORA-01110: data file 4: '/u01/oradata/objrs/obj_corr.dbf'

Recovering From Datafile Block Corruption

After you have identified datafile block corruption, follow these steps:

  1. Determine the Extent of the Corruption Problem
  2. Replace or Move Away From Faulty Hardware
  3. Determine Which Objects Are Affected
  4. Decide Which Recovery Method to Use
Determine the Extent of the Corruption Problem

Use the following methods to determine the extent of the corruption:

  • Review the alert log and Oracle trace files for the file and block numbers reported in errors such as ORA-01578
  • Use an Oracle block checking utility, such as DBVERIFY or the ANALYZE ... VALIDATE STRUCTURE statement, against the affected objects
  • Use the RMAN BACKUP VALIDATE command to scan the affected datafiles for corrupt blocks

Replace or Move Away From Faulty Hardware

Some corruption problems are caused by faulty hardware. If there is a hardware fault or a suspect component, then it is sensible to either repair the problem, or make disk space available on a separate disk subsystem before proceeding with a recovery option.

If there are multiple errors, if there are operating system-level errors against the affected file, or if the errors are transient and keep moving about, then there is little point in proceeding until the underlying problem has been addressed or space is available on alternative disks. Ask your hardware vendor to verify system integrity.

In a Data Guard environment, a switchover can be performed to bring up the application and restore service quickly while the corruption problem is handled offline.

Determine Which Objects Are Affected

Using the file ID (fid) and block ID (bid) gathered from error messages and the output from Oracle block checking utilities, determine which database objects are affected by the corruption by using a query similar to the following:

SELECT tablespace_name, partition_name, segment_type,
       owner, segment_name
FROM dba_extents 
WHERE file_id = fid
  AND bid BETWEEN block_id AND block_id + blocks - 1;

The following is an example of the query and its resulting output:

SQL> select tablespace_name, partition_name, segment_type,
  2  owner, segment_name from dba_extents 
  3  where file_id=4 and 11 between block_id and block_id + blocks -1;

TABLESPACE_NAME   PARTITION_NAME     SEGMENT_TYPE    OWNER   SEGMENT_NAME
---------------   --------------     ------------    -----   ------------
USERS                                TABLE           SCOTT   EMP

The integrity of a table or index can be determined by using the ANALYZE statement.
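For example, a statement such as the following validates the structure of the table identified above and, with the CASCADE option, its indexes (the schema and table name come from the earlier query output):

ANALYZE TABLE scott.emp VALIDATE STRUCTURE CASCADE;

If the statement completes without error, the table and its indexes are free of corruption.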

Decide Which Recovery Method to Use

The recommended recovery methods are summarized in Table 10-3 and Table 10-4. The recovery methods depend on whether Data Guard is being used.

Table 10-3 summarizes recovery methods for data failure when Data Guard is not used.

Table 10-3 Recovering From Data Failure Without Data Guard  
Object Affected Extent of Problem Action

Data dictionary or UNDO segment

N/A

Use RMAN Datafile Media Recovery

Application segment (user table, index, cluster)

Widespread or unknown

Use RMAN Datafile Media Recovery

or

Re-Create Objects Manually

Application segment (user table, index, cluster)

Localized

Use RMAN Block Media Recovery

or

Re-Create Objects Manually

TEMPORARY segment or temporary table

N/A

No impact to permanent objects. Re-create temporary tablespace if required.

Table 10-4 summarizes recovery methods for data failure when Data Guard is present.

Table 10-4 Recovering from Data Failure With Data Guard  
Object Affected Extent of Problem Impact to Application Cost of Local Recovery Action

Data dictionary or UNDO segment

N/A

N/A

N/A

Use Data Guard to Recover From Data Failure

Application segment (user table, index, cluster)

Widespread or unknown

Low

Low

Use RMAN Datafile Media Recovery

or

Re-Create Objects Manually

Application segment (user table, index, cluster)

N/A

High

N/A

Use Data Guard to Recover From Data Failure

Application segment (user table, index, cluster)

N/A

N/A

High

Use Data Guard to Recover From Data Failure

Application segment (user table, index, cluster)

Localized

Low

Low

Use RMAN Block Media Recovery

or

Re-Create Objects Manually

TEMPORARY segment or temporary table

N/A

N/A

N/A

No impact to permanent objects. Re-create temporary tablespace if required.

The proper recovery method to use depends on the following criteria, as indicated in Table 10-3 and Table 10-4:

  • The type of object affected
  • The extent of the problem (localized, or widespread and unknown)
  • The impact to the application
  • The cost of local recovery

Recovering From Media Failure

When media failure occurs, follow these steps:

  1. Determine the Extent of the Media Failure
  2. Replace or Move Away From Faulty Hardware
  3. Decide Which Recovery Action to Take
Determine the Extent of the Media Failure

Use the following methods to determine the extent of the media failure:

  • Review the alert log and Oracle trace files for errors against the affected files
  • Check for operating system-level and storage-level errors against the affected files

Replace or Move Away From Faulty Hardware

If there is a hardware fault or a suspect component, then it is sensible to either repair the problem or make disk space available on a separate disk subsystem before proceeding with a recovery option.

If there are multiple errors, if there are operating system-level errors against the affected file, or if the errors are transient and keep moving about, then there is little point in proceeding until the underlying problem has been addressed or space is available on alternative disks. Ask your hardware vendor to verify system integrity.

Decide Which Recovery Action to Take

The appropriate recovery action depends on what type of file is affected by the media failure. Table 10-5 shows the type of file and the appropriate recovery.

Table 10-5 Recovery Actions for Failure of Different Types of Files  
Type of File Recovery Action

Datafile

Media failure of a datafile is resolved in the same manner in which widespread datafile block corruption is handled.

See Also: "Recovering From Datafile Block Corruption"

Control file

Loss of a control file causes the primary database to shut down. The steps to recover from control file failure include making a copy of a good control file, restoring a backup copy of the control file, or manually creating the control file with the CREATE CONTROLFILE statement. The proper recovery method depends on the following:

  • Whether all current control files were lost or just a member of a multiplexed control file
  • Whether or not a backup control file is available

See Also: "Performing User-Managed Flashback and Recovery" in Oracle Database Backup and Recovery Advanced User's Guide

Standby control file

Loss of a standby control file causes the standby database to shut down. It may also, depending on the primary database protection mode, cause the primary database to shut down. To recover from a standby control file failure, a new standby control file must be created from the primary database and transferred to the standby system.

See Also: "Creating a Physical Standby Database" in Oracle Data Guard Concepts and Administration

Online redo log file

If a media failure has affected the online redo logs of a database, then the appropriate recovery procedure depends on the following:

  • The configuration of the online redo log: mirrored or non-mirrored
  • The type of media failure: temporary or permanent
  • The status of the online redo log files affected by the media failure: current, active, unarchived, or inactive

See Also: "Advanced User-Managed Recovery Scenarios" in Oracle Database Backup and Recovery Advanced User's Guide

If the online redo log failure causes the primary database to shut down and incomplete recovery must be used to make the database operational again, then Flashback Database can be used instead of restoring all datafiles. Use Flashback Database to take the database back to an SCN before the SCN of the lost online redo log group. The resetlogs operation that is done as part of the Flashback Database procedure reinitializes all online redo log files. Using Flashback Database is faster than restoring all datafiles.

If the online redo log failure causes the primary database to shut down in a Data Guard environment, it may be desirable to perform a Data Guard failover to reduce the time it takes to restore service to users and to reduce the amount of data loss incurred (when using the proper database protection mode). The decision to perform a failover (instead of recovering locally at the primary site with Flashback Database, for example) depends on the estimated time to recover from the outage at the primary site, the expected amount of data loss, and the impact the recovery procedures taken at the primary site may have on the standby database.

For example, if the decision is to recover at the primary site, then the recovery steps may require a Flashback Database and open resetlogs, which may incur a full redo log file of lost data. A standby database will have less data loss in most cases than recovering at the primary site because all redo data is available to the standby database. If recovery is done at the primary site and the standby database is ahead of the point to which the primary database is recovered, then the standby database must be re-created or flashed back to a point before the resetlogs SCN on the primary database.

See Also: "Creating a Physical Standby Database" in Oracle Data Guard Concepts and Administration

Standby redo log file

Standby redo log failure affects only the standby database in a Data Guard environment. Most standby redo log failures are handled automatically by the standby database without affecting the primary database. However, if a standby redo log file fails while being archived to, then the primary database treats it as a log archive destination failure.

See Also: "Determine the Data Protection Mode"

Archived redo log file

Loss of an archived redo log does not affect availability of the primary database directly, but it may significantly affect availability if another media failure occurs before the next scheduled backup, or if Data Guard is being used and the archived redo log had not been fully received by the standby system and applied to the standby database before losing the file.

See Also: "Advanced User-Managed Recovery Scenarios" in Oracle Database Backup and Recovery Advanced User's Guide

If an archived redo log is lost in a Data Guard environment and the log has already been applied to the standby database, then there is no impact. If there is no valid backup copy of the lost file, then a backup should be taken immediately of either the primary or standby database because the lost log will be unavailable for media recovery that may be required for some other outage.

If the lost archived redo log has not yet been applied to the standby database, then a backup copy of the file must be restored and made available to the standby database. If there is no valid backup copy of the lost archived redo log, then the standby database must be reinstantiated from a backup of the primary database taken after the NEXT_CHANGE# of the lost log (see V$ARCHIVED_LOG).

Server parameter file (SPFILE)

Loss of the server parameter file does not affect availability of a running database, but the SPFILE is necessary for database startup. With the flash recovery area and RMAN CONTROLFILE AUTOBACKUP features enabled, restoring a server parameter file from backup is a fast operation.

See Also: "Performing Recovery" of Oracle Database Backup and Recovery Basics

Oracle Cluster Registry (OCR)

Loss of the Oracle Cluster Registry file affects the availability of RAC and Cluster Ready Services. The OCR file can be restored from a physical backup that is automatically created or from an export file that is manually created by using the ocrconfig tool.

See Also: "Administering Storage in Real Application Clusters" in Oracle Real Application Clusters Administrator's Guide

Recovery Methods for Data Failures

The following recovery methods can be used in all environments:

  • RMAN datafile media recovery
  • RMAN block media recovery
  • Manual re-creation of objects

Always use local recovery methods when Data Guard is not being used. Local recovery methods may also be appropriate in a Data Guard environment. This section also includes the following topic:

  • Use Data Guard to Recover From Data Failure

Use RMAN Datafile Media Recovery

Datafile media recovery recovers an entire datafile or set of datafiles for a database by using the RMAN RECOVER command. When a large or unknown number of data blocks are marked media-corrupt and require media recovery, or when an entire file is lost, the affected datafiles must be restored and recovered.

Use RMAN datafile media recovery when the following conditions are true:

  • A large or unknown number of data blocks are marked media corrupt and require media recovery
  • An entire datafile or set of datafiles has been lost
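As a sketch, datafile media recovery of the file from the earlier ORA-01578 example (datafile 4; the file number is illustrative) could proceed as follows, taking only the affected file offline so the rest of the database remains available:

RMAN> SQL 'ALTER DATABASE DATAFILE 4 OFFLINE';
RMAN> RESTORE DATAFILE 4;
RMAN> RECOVER DATAFILE 4;
RMAN> SQL 'ALTER DATABASE DATAFILE 4 ONLINE';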

Use RMAN Block Media Recovery

Block media recovery (BMR) recovers one or a set of data blocks marked "media corrupt" within a datafile by using the RMAN BLOCKRECOVER command. When a small number of data blocks are marked media corrupt and require media recovery, you can selectively restore and recover damaged blocks rather than whole datafiles. This results in lower mean time to recovery (MTTR) because only blocks that need recovery are restored and only necessary corrupt blocks undergo recovery. Block media recovery minimizes redo application time and avoids I/O overhead during recovery. It also enables affected datafiles to remain online during recovery of the corrupt blocks. The corrupt blocks, however, remain unavailable until they are completely recovered.
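For example, the corrupt block reported by the earlier ORA-01578 error (file 4, block 26) could be repaired with a command such as the following; multiple blocks and datafiles can be listed in a single BLOCKRECOVER command:

RMAN> BLOCKRECOVER DATAFILE 4 BLOCK 26;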

Use block media recovery when:

  • A small number of data blocks are marked media corrupt and require media recovery
  • The affected datafile must remain online while the corrupt blocks are recovered

Block media recovery cannot be used to recover from the following:

  • Loss or deletion of an entire datafile; use RMAN datafile media recovery instead

The following are useful practices when using block media recovery:

Re-Create Objects Manually

Some database objects, such as small look-up tables or indexes, can be recovered quickly by manually re-creating the object instead of doing media recovery.

Use manual object re-creation when:

  • The object is small, such as a look-up table or an index, and can be re-created quickly
  • The data is static or can be reloaded from another source, such as a creation script or an export file

Use Data Guard to Recover From Data Failure

Failover is the operation of taking the production database offline on one site and bringing one of the standby databases online as the new production database. A database switchover is a planned transition in which a standby database and a production database switch roles.

Use Data Guard switchover or failover for data failure when:

  • The database is down, or the corruption is widespread enough that application availability is affected
  • Local recovery is estimated to take longer than restoring service by a role transition to the standby database

Recovering from User Error with Flashback Technology

Oracle flashback technology revolutionizes data recovery. In the past it took seconds to damage a database but hours to days to recover it. With flashback technology, the time to correct errors can be as short as the time it took to make the error. Fixing user errors that require rewinding the database, table, transaction, or row level changes to a previous point in time is easy and does not require any database or object restoration. Flashback technology provides fine-grained analysis and repair for localized damage such as erroneous row deletion. Flashback technology also enables correction of more widespread damage such as accidentally running the wrong application batch job. Furthermore, flashback technology is exponentially faster than a database restoration.

Flashback technologies are applicable only to repairing the following user errors:

  • Erroneous or malicious changes to rows or transactions
  • Erroneous or malicious changes to a table or a set of tables, including dropped tables
  • Erroneous or malicious database-wide changes, such as a bad batch job or a dropped tablespace (when the physical datafiles were not removed)

Flashback technologies cannot be used for media or data corruption such as block corruption, bad disks, or file deletions. See "Recovery Solutions for Data Failures" and "Database Failover" to repair these outages.

Table 10-6 summarizes the flashback solutions for each type of outage.

Table 10-6 Flashback Solutions for Different Outages  
Impact of Outage Examples of User Errors Flashback Solutions

Row or transaction

See Also: "Resolving Row and Transaction Inconsistencies"

  • Accidental deletion of row
  • Erroneous transaction

Use a combination of:

  • Flashback Query
  • Flashback Version Query
  • Flashback Transaction Query

See Also: "Flashback Query"

Table

See Also: "Resolving Table Inconsistencies"

  • Dropped table
  • Erroneous transactions affecting one table or a set of tables

Use Flashback Drop (for a dropped table) or Flashback Table (for erroneous transactions)

Tablespace or database

See Also: "Resolving Database-Wide Inconsistencies"

  • Erroneous batch job affecting many tables or an unknown set of tables
  • Series of database-wide malicious transactions
  • Drop tablespace without removing the physical datafiles

Use Flashback Database

Table 10-7 summarizes each flashback feature.

Table 10-7 Summary of Flashback Features  
Flashback Feature Description

Flashback Query

Flashback Query enables you to view data at a point in time in the past. It can be used to view and reconstruct lost data that was deleted or changed by accident. Developers can use this feature to build self-service error correction into their applications, empowering end users to undo and correct their errors.

Note: Changes are propagated to physical and logical standby databases.

Flashback Version Query

Flashback Version Query uses undo data stored in the database to view the changes to one or more rows along with all the metadata of the changes.

Note: Changes are propagated to physical and logical standby databases.

Flashback Transaction Query

Flashback Transaction Query enables you to examine changes to the database at the transaction level. As a result, you can diagnose problems, perform analysis, and audit transactions.

Note: Changes are propagated to physical and logical standby databases.

Flashback Drop

Flashback Drop provides a way to restore accidentally dropped tables.

Note: Changes are propagated to physical standby databases.

Flashback Table

Flashback Table enables you to quickly recover a table to a point in time in the past without restoring a backup.

Note: Changes are propagated to physical and logical standby databases.

Flashback Database

Flashback Database enables you to quickly return the database to an earlier point in time by undoing all of the changes that have taken place since that time. This operation is fast because you do not need to restore the backups.

Flashback Database uses the Oracle Database flashback logs, while all other flashback features use the database's undo data and multiversion read consistency capabilities. See "Configuration Best Practices for the Database" for configuring flashback technologies to ensure that the resources required by these solutions are available at the time of failure.

The rest of this section includes the following topics:

  • Resolving Row and Transaction Inconsistencies
  • Resolving Table Inconsistencies
  • Resolving Database-Wide Inconsistencies

Resolving Row and Transaction Inconsistencies

Resolving row and transaction inconsistencies may require a combination of Flashback Query, Flashback Version Query, Flashback Transaction Query, and the suggested undo statements to rectify the problem. The following sections describe a general approach using a human resources example to resolve row and transaction inconsistencies caused by erroneous or malicious user errors.

This section includes the following topics:

  • Flashback Query
  • Flashback Version Query
  • Flashback Transaction Query
  • Example: Using Flashback Technology to Investigate Salary Discrepancy

Flashback Query

Flashback Query, a feature introduced in the Oracle9i Database, enables an administrator or user to query any data at some point in time in the past. This powerful feature can be used to view and reconstruct data that may have been deleted or changed by accident. For example:

SELECT * FROM EMPLOYEES 
       AS OF TIMESTAMP 
       TO_DATE('28-Aug-03 14:00','DD-Mon-YY HH24:MI')
 WHERE ...

This partial statement displays rows from the EMPLOYEES table starting from 2 p.m. on August 28, 2003. Developers can use this feature to build self-service error correction into their applications, empowering end users to undo and correct their errors without delay, rather than burdening administrators to perform this task. Flashback Query is very simple to manage, because the database automatically keeps the necessary information to reconstruct data for a configurable time into the past.

Flashback Version Query

Flashback Version Query provides a way to view changes made to the database at the row level. It is an extension to SQL and enables the retrieval of all the different versions of a row across a specified time interval. For example:

SELECT * FROM EMPLOYEES
       VERSIONS BETWEEN TIMESTAMP
       TO_DATE('28-Aug-03 14:00','dd-Mon-YY hh24:mi') AND
       TO_DATE('28-Aug-03 15:00','dd-Mon-YY hh24:mi')
WHERE ...

This statement displays each version of the row, each entry changed by a different transaction, between 2 and 3 p.m. today. A DBA can use this to pinpoint when and how data is changed and trace it back to the user, application, or transaction. This enables the DBA to track down the source of a logical corruption in the database and correct it. It also enables application developers to debug their code.

Flashback Transaction Query

Flashback Transaction Query provides a way to view changes made to the database at the transaction level. It is an extension to SQL that enables you to see all changes made by a transaction. For example:

SELECT UNDO_SQL
FROM FLASHBACK_TRANSACTION_QUERY
WHERE XID = HEXTORAW('000200030000002D');

This statement shows all of the changes that resulted from this transaction. In addition, compensating SQL statements are returned and can be used to undo changes made to all rows by this transaction. Using a precision tool like this, the DBA and application developer can precisely diagnose and correct logical problems in the database or application.

Example: Using Flashback Technology to Investigate Salary Discrepancy

Consider a human resources (HR) example involving the SCOTT schema. The HR manager reports to the DBA that there is a potential discrepancy in Ward's salary. Sometime before 9:00 a.m., Ward's salary was increased to $1875. The HR manager is uncertain how this occurred and wishes to know when the employee's salary was increased. In addition, he has instructed his staff to reset the salary to the previous level of $1250, and this was completed around 9:15 a.m.

The following steps show how to approach the problem.

  1. Assess the problem.

    Fortunately, the HR manager has provided information about the time when the change occurred. We can query the information as it was at 9:00 a.m. with Flashback Query.

        SELECT EMPNO, ENAME, SAL
        FROM EMP
        AS OF TIMESTAMP TO_DATE('03-SEP-03 09:00','dd-Mon-yy hh24:mi')
        WHERE ENAME = 'WARD';
    
             EMPNO ENAME             SAL
        ---------- ---------- ----------
              7521 WARD             1875
    
    

    We can confirm we have the correct employee by the fact that Ward's salary was $1875 at 09:00 a.m. Rather than using Ward's name, we can now use the employee number for subsequent investigation.

  2. Query past rows or versions of the data to acquire transaction information.

    Although it is possible to restrict the row version information to a specific date or SCN range, we decide to query all the row information that we have available for the employee WARD using Flashback Version Query.

        SELECT EMPNO, ENAME, SAL, VERSIONS_STARTTIME, VERSIONS_ENDTIME
        FROM EMP
        VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
        WHERE EMPNO = 7521
        ORDER BY NVL(VERSIONS_STARTSCN,1);
    
            EMPNO ENAME             SAL VERSIONS_STARTTIME     VERSIONS_ENDTIME
        -------- ---------- ---------- ---------------------- ----------------------
             7521 WARD             1250 03-SEP-03 08.48.43 AM  03-SEP-03 08.54.49 AM
             7521 WARD             1875 03-SEP-03 08.54.49 AM  03-SEP-03 09.10.09 AM
             7521 WARD             1250 03-SEP-03 09.10.09 AM
    
    

    We can see that WARD's salary was increased from $1250 to $1875 at 08:54:49 the same morning and was subsequently reset to $1250 at approximately 09:10:09.

    Also, we can modify the query to determine the transaction information for each of the changes affecting WARD using a similar Flashback Version Query. This time we use the VERSIONS_XID pseudocolumn.

        SELECT EMPNO, ENAME, SAL, VERSIONS_XID
        FROM EMP
        VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
        WHERE EMPNO = 7521
        ORDER BY NVL(VERSIONS_STARTSCN,1);
    
             EMPNO ENAME             SAL VERSIONS_XID
        ---------- ---------- ---------- ----------------
              7521 WARD             1250 0006000800000086
              7521 WARD             1875 0009000500000089
              7521 WARD             1250 000800050000008B
    
    
  3. Query the erroneous transaction and the scope of its impact.

    With the transaction information (VERSIONS_XID pseudocolumn), we can now query the database to determine the scope of the transaction, using Flashback Transaction Query.

        SELECT UNDO_SQL
        FROM FLASHBACK_TRANSACTION_QUERY
        WHERE XID = HEXTORAW('0009000500000089');
    
        UNDO_SQL                                                                    
        ----------------------------------------------------------------------------
        update "SCOTT"."EMP" set "SAL" = '950' where ROWID = 'AAACV4AAFAAAAKtAAL';      
        update "SCOTT"."EMP" set "SAL" = '1500' where ROWID = 'AAACV4AAFAAAAKtAAJ';     
        update "SCOTT"."EMP" set "SAL" = '2850' where ROWID = 'AAACV4AAFAAAAKtAAF';      
        update "SCOTT"."EMP" set "SAL" = '1250' where ROWID = 'AAACV4AAFAAAAKtAAE';    
        update "SCOTT"."EMP" set "SAL" = '1600' where ROWID = 'AAACV4AAFAAAAKtAAB';     
                                                                                    
        5 rows selected.
    
    

    We can see that WARD's salary was not the only change that occurred in the transaction. The information that was changed for the other four employees at the same time as WARD can now be passed back to the HR manager for review.

  4. Determine if the corrective statements should be executed.

    If the HR manager decides that the corrective changes suggested by the UNDO_SQL column are correct, then the DBA can execute these statements individually.

Resolving Table Inconsistencies

Oracle provides Flashback Drop to recover from an accidental DROP TABLE statement, and a FLASHBACK TABLE statement to restore a table to a previous point in time.

This section includes the following topics:

  • Flashback Table
  • Flashback Drop

Flashback Table

Flashback Table provides the DBA the ability to recover a table, or a set of tables, to a specified point in time quickly and easily. In many cases, Flashback Table alleviates the need to perform more complicated point in time recovery operations. For example:

FLASHBACK TABLE orders, order_items 
      TO TIMESTAMP 
      TO_DATE('29-AUG-03 14.00.00','dd-Mon-yy hh24:mi:ss');

This statement rewinds any updates made to the ORDERS and ORDER_ITEMS tables between the current time and the specified timestamp in the past. Flashback Table performs this operation online and in place, and it maintains referential integrity constraints between the tables.

Flashback Drop

Dropping or deleting database objects by accident is a common mistake. Users soon realize the error, but by then it is too late; historically there has been no easy way to recover a dropped table and its indexes, constraints, and triggers. Objects once dropped were dropped forever. Loss of very important tables or other objects (such as indexes, partitions, or clusters) required DBAs to perform point-in-time recovery, which can be time-consuming and lead to loss of recent transactions.

Flashback Drop provides a safety net when dropping objects in Oracle Database 10g. When a user drops a table, Oracle places it in a recycle bin, a virtual container where all dropped objects reside. Objects remain in the recycle bin until the user decides to permanently remove them or until space pressure arises in the tablespace containing the table. Users can look in the recycle bin and undrop a dropped table and its dependent objects. For example, the employees table and all its dependent objects would be undropped by the following statement:

FLASHBACK TABLE employees TO BEFORE DROP;
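Before undropping, the contents of the recycle bin can be examined with a query such as the following:

SELECT OBJECT_NAME, ORIGINAL_NAME, TYPE, DROPTIME
FROM USER_RECYCLEBIN;

The ORIGINAL_NAME and DROPTIME columns identify which dropped object to restore when several candidates exist.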

Resolving Database-Wide Inconsistencies

Oracle provides Flashback Database to rewind the entire database to a previous point in time. This section includes the following topics:

  • Flashback Database
  • Using Flashback Database to Repair a Dropped Tablespace

Flashback Database

To bring an Oracle database to a previous point in time, the traditional method is point-in-time recovery. However, point-in-time recovery can take hours or even days, because it requires the whole database to be restored from backup and recovered to the point in time just before the error was introduced. With database sizes constantly growing, the restore step alone can dominate that time.

Flashback Database is a new strategy for performing point-in-time recovery. It quickly rewinds an Oracle database to a previous time to correct any problems caused by logical data corruption or user error. Flashback logs are used to capture old versions of changed blocks; one way to think of them is as a continuous backup or storage snapshot. When recovery must be performed, the flashback logs are quickly replayed to restore the database to a point in time before the error, and only the changed blocks are restored. It is extremely fast and reduces recovery time from hours to minutes. In addition, it is easy to use: a database can be recovered to 2:05 p.m. by issuing a single statement. Before the database can be recovered, all instances of the database must be shut down and one of the instances subsequently mounted. The following is an example of a FLASHBACK DATABASE statement.

FLASHBACK DATABASE TO TIMESTAMP TIMESTAMP'2002-11-05 14:00:00';

No restoration from tape, no lengthy downtime, and no complicated recovery procedures are required to use it. You can also use Flashback Database and then open the database in read-only mode and examine its contents. If you determine that you flashed back too far or not far enough, then you can reissue the FLASHBACK DATABASE statement or continue recovery to a later time to find the proper point in time before the database was damaged. Flashback Database works with a production database, a physical standby database, and a logical standby database.

These steps are recommended for using Flashback Database:

  1. Determine the time or the SCN to which to flash back the database.
  2. Verify that there is sufficient flashback log information.
         SELECT OLDEST_FLASHBACK_SCN, 
           TO_CHAR(OLDEST_FLASHBACK_TIME, 'mon-dd-yyyy HH:MI:SS') 
           FROM V$FLASHBACK_DATABASE_LOG;
    
    
  3. Flash back the database to a specific time or SCN. (The database must be mounted to perform a Flashback Database.)
        FLASHBACK DATABASE TO SCN scn;
    
    

    or

        FLASHBACK DATABASE TO TIMESTAMP TO_DATE(date);
    
    
  4. Open the database in read-only mode to verify that it is in the correct state.
        ALTER DATABASE OPEN READ ONLY;
    
    

    If more flashback data is required, then issue another FLASHBACK DATABASE statement. (The database must be mounted to perform a Flashback Database.)

    If you want to move forward in time, issue a statement similar to the following:

         RECOVER DATABASE UNTIL [TIME | CHANGE] date | scn;
    
    
  5. Open the database.


    Caution:

    After you open the database, you will not be able to flash back to an SCN before the resetlogs SCN, so ensure that the database is in the correct state before issuing the ALTER DATABASE OPEN RESETLOGS statement.


        ALTER DATABASE OPEN RESETLOGS;
    
    

Other considerations when using Flashback Database are as follows:

Using Flashback Database to Repair a Dropped Tablespace

Flashback Database does not automatically restore a dropped tablespace, but it can be used to dramatically reduce the downtime. You can flash back the production database to a point before the tablespace was dropped, and then restore a backup of the corresponding datafiles from the affected tablespace and recover to a time before the tablespace was dropped.

Follow these recommended steps to use Flashback Database to repair a dropped tablespace:

  1. Determine the SCN or time you dropped the tablespace.
  2. Flash back the database to a time before the tablespace was dropped. You can use a statement similar to the following:
        FLASHBACK DATABASE TO BEFORE SCN drop_scn;
    
    
  3. Restore, rename, and bring the datafiles online:
     a. Restore only the datafiles from the affected tablespace from a backup.
     b. Rename the unnamed files to the restored files.
              ALTER DATABASE RENAME FILE '.../UNNAMED00005' TO 'restored_file';
     c. Bring the datafiles online.
              ALTER DATABASE DATAFILE 'name' ONLINE;