8. Best Practices for Backup and Recovery with Recovery Manager

Managing database backup and recovery operations remains one of the primary responsibilities of an Oracle DBA. This task includes developing the robust backup and recovery strategies that comply with the service level agreement (SLA). In general, developing a single strategy for all databases is not recommended; you should develop backup and recovery strategies that align with the nature of the application business criticality and demand. Oracle provides various options to perform database backup and recovery operations and to optimize backup and recovery operations to minimize the backup duration. The objective of this chapter is not to provide general concepts, syntax, or examples on how to perform backup and recovery; rather, it proposes the most vital ingredients needed to develop the best backup and recovery strategies for the databases. The chapter also focuses on various database recovery scenarios.

A Perfect Backup and Recovery Plan

Data is a crucial element to any business; that’s why organizations invest huge sums of money in state-of-the-art architecture and technologies to ensure data is highly available and protected from all sorts of data losses. In addition, data growth continues to be phenomenal in most organizations, so organizations must carefully plan their backup and recovery plan. There are several options offered by different vendors at both the software and hardware levels to protect the data.

While designing backup and recovery solutions, keep in mind that it is not as simple as backing up the data and doing on-demand recovery. Most important, consequences that could impact the business must be addressed. You must ensure the backup operation doesn’t impact the overall performance of the database and the server during the backup and that it doesn’t consume huge system resources. You must also consider adherence to the recovery SLA, meet the recovery clauses defined by recovery point objective (RPO) and recovery time objective (RTO), and achieve almost zero data loss.

When you design and implement database backup strategies, it’s important to keep the following crucial recovery factors in mind:

Image How much data loss is acceptable?

Image How long can the business afford to have the applications that the database supports be inaccessible?

Image What is the nature and business criticality of the supported applications?

Image What are the possible data loss scenarios, and how can you overcome them?

Image How quickly can the database be brought back online, and what is the best and optimal recovery option to achieve that goal?

Image What are other possible solutions (i.e., hardware and/or software alternatives) in case of a prolonged application downtime?

Image What are the alternatives available in case your recovery strategy should fail due to unforeseen loss of components necessary for a complete restoration and recovery?

An Overview of RMAN

Oracle offers a variety of solutions to suit every modern business’s needs. For business continuity, you can configure Oracle’s Data Guard; for high availability, you can configure Oracle Real Application Clusters (RAC). When it comes to protecting the crucial data, Oracle provides multiple tools and utilities to database backups, both logical and physical.

Recovery Manager (RMAN) is a utility that simplifies and performs all sorts of Oracle database–related backup and recovery tasks. Unlike other Oracle advanced features that are separately licensed, RMAN comes at no additional cost and is installed by default. All that you need to do with RMAN is perform some careful planning to back up your databases, test your recovery scenarios, and perhaps do a little tuning to your RMAN configuration.

With RMAN, you can achieve the following goals:

Image Create physical backups at various levels: full database, tablespace, or individual datafiles.

Image Create compressed and encrypted backups.

Image Perform incremental (differential and cumulative) database backups at database and tablespace levels.

Image Back up to disk or tape.

Image Perform fast incremental backup using block change tracking (BCT) to accelerate the backup window.

Image Validate the database and its corresponding backups without actually performing recovery operation.

Image Clone an existing database for test environments.

Image Leverage recovery scenarios at various levels, including a complete database point-in-time recovery, a tablespace point-in-time recovery, and a table-level point-in-time recovery.

Tips for Database Backup Strategies

Each database has a different SLA based on its necessary availability, recoverability, and business criticality; therefore, an Oracle DBA who is responsible for managing variously sized and prioritized databases should design backup and recovery strategies that are appropriate and suits each database’s SLA. When you design and implement backup and recovery solutions, it is essential that you seriously consider several important factors, such as backup duration and the server resource consumption versus the impact on the database’s overall performance.

The following key elements should be kept in mind and addressed while designing an appropriate solution:

Image The business criticality of the data and its availability

Image The RPO and RTO

Image The frequency at which backups should be taken to guarantee the database’s recoverability

Image The size of the database and whether all of its components must be available all of the time

Image Potential recovery scenarios

Once you have addressed these issues, the next step is to develop an appropriate backup strategy. The subsequent sections offer tips and techniques for constructing the best strategy for your database.

Let’s take a close look at some of the best database backup approaches for mission-critical production databases. Once you thoroughly understand the benefits and drawbacks of each strategy, you can choose a strategy that best fits your database’s business needs.

Full Backups and Incremental Backups

Under most circumstances for just about any database, it is highly recommended that you perform a full database backup over the weekend and perform incremental backups subsequently through the week. Again, this strategy may not fit for every database, as it will depend on the size of the database; however, for most Oracle databases that are less than 1 TB in size, this strategy is commonly adopted.

For example, the following example shows how to create an RMAN full database backup as an incremental level 0 backup set, which backs up the database and all archive logs to tape, preferably run over the weekend to avoid any business impact.

RMAN> run
{
    allocate channel ch1 type 'SBT_TAPE';
    allocate channel ch2 type 'SBT_TAPE';
    BACKUP INCREMENTAL LEVEL=0
      FORMAT 'bk_u%u_s%s_p%p_t%t'
      DATABASE PLUS ARCHIVELOG;
    release channel ch1;
    release channel ch2;
}

Compressed Backups

Data compression offers great storage savings. Oracle RMAN supports multiple compression methods. Null block compression, which is the default compression method, is part of incremental level 0 backups, and it ignores all blocks in a tablespace datafile that were never used. In addition, unused block compression offers even more storage savings by skipping empty data blocks that currently hold no data, regardless of whether they previously held any data.

RMAN also offers binary compression of the backup files it creates. The basic and medium compression levels require licensing the Oracle Advanced Compression option; the higher settings offer greater and greater compression, but be warned that they also utilize significant  CPU during backup.

While null and unused block compression take place by default whenever an incremental level 0 backup of a datafile is performed, binary compression must be defined explicitly within the RMAN run script, as demonstrated in the following example:

--- RMAN binary compressed full database online backups
RMAN> run {
    BACKUP
    AS COMPRESSED BACKUPSET
    DATABASE PLUS ARCHIVELOG;
}

Decompression takes place automatically during the restore operation; however, the decompression operation will use a significant amount of the server’s CPU, which will affect the speed of any recovery operation. It’s therefore wise to use the compressed backups option when there is lack of sufficient space to keep the backup files or when backups are created at a remote location over a network with limited bandwidth.

Incremental Backups

Yet another serious concern for backup operations is to reduce the amount of data that needs to be backed up, decreasing the amount of time required to back up. The objective of an incremental level 1 backup is to back up less data and to minimize the time required to capture all changed database blocks during a regularly scheduled backup cycle.

RMAN has the ability to perform incremental backups at two different levels: differential and cumulative. A differential incremental backup is the default type, which backs up all data blocks that have been modified since the previous level 0 or level 1 backup. A cumulative incremental backup backs up all data blocks that have been modified since the last level 0 backup. In contrast to a differential incremental backup, a cumulative incremental backup consumes more backup space and will probably take additional time too.

On the other hand, when database restore and recovery is performed, cumulative backups require the level 0 backup and only the latest valid cumulative backup for restore, which will offer faster recovery in contrast to differential incremental backups. If the additional storage and a little extra backup time is not a concern for your business, cumulative incremental backups is the best option considering the recovery factor.

Faster Incremental Backups

Data growth these days can be quite unpredictable; it is not unusual for a database’s growth to suddenly accelerate and then continue at an exceptional rate. Backing up a multi-terabyte-sized database can be very time consuming and resource hungry. Although incremental backups help to overcome this situation, Oracle still has to visit every single data block in each datafile to verify whether it needs to be backed up, which could be time consuming. Oracle offers faster incremental backups through its BCT feature. When BCT is enabled, each modified data block is recorded in a binary file outside the database. RMAN refers to the file to identify the data blocks that are candidates for incremental backups, which improves the overall time required for incremental backups.

To enable BCT, use the following command to create and maintain the BCT file in the location specified via the DB_CREATE_FILE_DEST parameter in the target database:

SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING;

Alternatively, to create and maintain the BCT file in a specific location, use the following command:

SQL> ALTER DATABASE BLOCK CHANGE TRACKING USING FILE '<location/filename>';

Once BCT is enabled, initiate a full database backup before an incremental backup is performed. The following RMAN run block will create a cumulative incremental level 1 backup:

# RMAN daily cumulative incremental database online backups
RMAN> run
{
    allocate channel ch1 type 'SBT_TAPE';
    BACKUP
       INCREMENTAL LEVEL=1 CUMULATIVE
       FORMAT 'bk_u%u_s%s_p%p_t%t'
       DATABASE PLUS ARCHIVELOG;
    release channel ch1;
}

It is really important to have a valid full database level 0 backup before executing an incremental level 1 backup with BCT.

Rewinding in Oracle Flashback Technology

Oracle Flashback technology offers an additional layer of data protection and provides the ability to quickly “rewind” all or part of a database to a prior state. Oracle Flashback supports a wide range of features:

Image Perform an incomplete recovery of the entire database to a prior point in time without actually restoring any datafile with Flashback Database.

Image Recover a tablespace set to a prior point in time through tablespace point-in-time recovery (TSPITR).

Image Recover a single table to a prior point in time with table point-in-time recovery (new in Oracle Database 12c).

Image Rewind all or part of an individual transaction to a prior point in time with Flashback Transaction.

Flashback Database and the ability to set a guaranteed restore point is most useful when you plan to perform database upgrades and while testing different recovery scenarios. It will also help to recover from severe human error by rewinding the database to a consistent state before the error more quickly than an incomplete database recovery operation, because it doesn’t require the restoration of any datafile. (Of course, Flashback Database is not viable when a datafile has been a victim of media corruption or was physically deleted; those issues require regular media recovery techniques to complete the recovery.)

To enable Flashback Database features, the following needs to be done:

Image Configure the database’s fast recovery area (FRA) by setting appropriate values for its location and size via the DB_RECOVERY_FILE_DEST_SIZE and DB_RECOVERY_FILE_DEST initialization parameters.

Image Enable flashback logging by setting the DB_FLASHBACK_RETENTION_TARGET initialization parameter to an appropriate value (the default is 1440 minutes, or one full day).

Image Activate Flashback Database by issuing the ALTER DATABASE FLASHBACK ON; command.

Flashback features are quite flexible; in fact, it’s possible to exclude an entire tablespace via the ALTER TABLESPACE <tablespace_name> FLASHBACK OFF; command.

To disable flashback features at the database level, simply issue the ALTER DATABASE FLASHBACK OFF; command.

Disk-Based Backup Solutions

In general, disk-based backup and recovery operations are significantly faster than tape-based backup and recovery. Since disk storage is getting less expensive these days, organizations that can afford sufficient storage for their backups can adopt a disk-based backups solution to achieve better backup and recovery.

Recover Forward Forever

Many Oracle DBAs are not even aware that when they select Oracle’s suggested backup strategy through Oracle Enterprise Manager, they are actually choosing to implement its recommended recover forward forever (RFF) backup strategy. This strategy—sometimes called incrementally updatable image copies (IUIC)—makes it possible to perform an initial incremental level 0 image copy of all datafiles for your database just one time and subsequently gather incremental level 1 differential backups for those datafiles.

Once a database’s RPO has been satisfied, the oldest incremental level 1 backups are applied to the initial image copy, thus recovering it forward by just one day. This cycle is repeated for the remainder of the database’s life. The advantage of this process is obvious: the initial level 0 image copy will never need to be taken again, and only level 1 backups will need to be taken, thus significantly reducing the amount of time required to perform database backups.

During a complete media recovery operation, the required datafiles are restored from the most recent image copy of the datafiles, level 1 incremental backups are used to roll the datafiles forward as far as possible, and then archived redo and online redo logs are applied to bring the datafiles to the current SCN. For an incomplete recovery operation, this same technique can be leveraged except that recovery is terminated short of the current SCN. This strategy helps to minimize the overall database restore and recovery window and is much faster than traditional recovery methods.

Here is an example of an RMAN run block that implements the nightly backup processing for the RFF strategy:

RMAN> run
{
    RECOVER COPY OF DATABASE WITH TAG 'weekly_iud';
    BACKUP
       INCREMENTAL LEVEL 1
       FOR RECOVER OF COPY WITH TAG 'weekly_iud'
       DATABASE
       PLUS ARCHIVELOG;
}

One perceived disadvantage of this backup method is that the image copies of the datafiles require the same amount of storage as the database itself. However, this method does offer the fastest way to recover from loss of a datafile because it’s a simple matter to switch to the image copy stored in the FRA and perform a media recovery using that datafile.

The example that follows shows how RFF would be implemented over three separate backup cycles:

#####
# Results from 1st run:
# - Incremental Level 0 image copies of all datafiles now exist
#   because even though a Level 1 incremental backup was requested,
#   no Level 0 incremental backup has yet been created with the tag
#   of img_cpy_upd.
#####

RMAN> # Implement Incrementally-Updateable Image Copy (IUIC) strategy
2> # for RPO of 24 hours (i.e., RECOVERY WINDOW OF ONE DAY)
3> RUN {
4> RECOVER
5>    COPY OF DATABASE
6>    WITH TAG 'img_cpy_upd';
7> BACKUP
8>    INCREMENTAL LEVEL 1
9>    FOR RECOVER OF COPY WITH TAG 'img_cpy_upd'
10>    SECTION SIZE 100M
11>    DATABASE
12>    PLUS ARCHIVELOG DELETE INPUT;
13> }
Starting recover at 2014-04-07 13:17:07
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=496 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=13 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=38 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=486 device type=DISK
no copy of datafile 1 found to recover
no copy of datafile 2 found to recover
no copy of datafile 3 found to recover
no copy of datafile 4 found to recover
no copy of datafile 5 found to recover
no copy of datafile 6 found to recover
no copy of datafile 7 found to recover
no copy of datafile 8 found to recover
no copy of datafile 9 found to recover
no copy of datafile 10 found to recover
no copy of datafile 11 found to recover
no copy of datafile 12 found to recover
no copy of datafile 13 found to recover
no copy of datafile 14 found to recover
no copy of datafile 15 found to recover
Finished recover at 2014-04-07 13:17:08

Starting backup at 2014-04-07 13:17:09
current log archived
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=508 RECID=9 STAMP=844262229
channel ORA_DISK_1: starting piece 1 at 2014-04-07 13:17:09
channel ORA_DISK_1: finished piece 1 at 2014-04-07 13:17:10
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/annnf0_img_cpy_upd_0.307.844262231
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
channel ORA_DISK_1: deleting archived log(s)
archived log file name=+FRA/NCDB121/ARCHIVELOG/2014_04_07/thread_1_seq_508.272.844262229
    RECID=9 STAMP=844262229
Finished backup at 2014-04-07 13:17:11

Starting backup at 2014-04-07 13:17:11
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 12 found
no parent backup or copy of datafile 2 found
no parent backup or copy of datafile 10 found
no parent backup or copy of datafile 14 found
no parent backup or copy of datafile 15 found
no parent backup or copy of datafile 8 found
no parent backup or copy of datafile 9 found
no parent backup or copy of datafile 11 found
no parent backup or copy of datafile 13 found
no parent backup or copy of datafile 5 found
no parent backup or copy of datafile 7 found
no parent backup or copy of datafile 6 found
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=+DATA/NCDB121/DATAFILE/sysaux.283.836524477
channel ORA_DISK_2: starting datafile copy
input datafile file number=00001 name=+DATA/NCDB121/DATAFILE/system.282.836524551
channel ORA_DISK_3: starting datafile copy
input datafile file number=00004 name=+DATA/NCDB121/DATAFILE/undotbs1.284.836524619
channel ORA_DISK_4: starting datafile copy
input datafile file number=00012 name=+FRA/NCDB121/DATAFILE/ado_cold_data.283.841267371
output file name=+FRA/NCDB121/DATAFILE/ado_cold_data.274.844262235
    tag=IMG_CPY_UPD RECID=55 STAMP=844262264
channel ORA_DISK_4: datafile copy complete, elapsed time: 00:00:35
channel ORA_DISK_4: starting datafile copy
input datafile file number=00002 name=+DATA/NCDB121/DATAFILE/example.280.836524747
output file name=+FRA/NCDB121/DATAFILE/undotbs1.303.844262235
    tag=IMG_CPY_UPD RECID=56 STAMP=844262271
channel ORA_DISK_3: datafile copy complete, elapsed time: 00:00:41
...
{ omitted for sake of brevity }
...
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/nnsnn1_img_cpy_upd_0.295.844262321
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_4: backup set complete, elapsed time: 00:00:02
Finished backup at 2014-04-07 13:18:42

Starting backup at 2014-04-07 13:18:42
current log archived
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=509 RECID=10 STAMP=844262322
channel ORA_DISK_1: starting piece 1 at 2014-04-07 13:18:43
channel ORA_DISK_1: finished piece 1 at 2014-04-07 13:18:44
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/annnf0_img_cpy_upd_0.304.844262323
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_1: deleting archived log(s)
archived log file name=+FRA/NCDB121/ARCHIVELOG/2014_04_07/thread_1_seq_509.271.844262323
    RECID=10 STAMP=844262322
Finished backup at 2014-04-07 13:18:45

#####
# Results from 2nd run:
# - Two new tablespaces have been created to show how they will be wrapped
#   into the IUIC strategy automatically (as Incremental Level 0 image
#   copy backups).
# - Incremental Level 0 image copies of all datafiles now exist, but since
#   no Incremental Level 1 image copy backups yet exist with a tag of
#   img_cpy_upd, none will be applied to the existing Level 0 backups.
# - Incremental Level 1 backups of all datafiles will be taken because
#   Incremental Level 0 backups (their eventual "parents") now exist.
#####

# Create new tablespaces:
CREATE TABLESPACE rollon_staging_data
    DATAFILE '+DATA'
    SIZE 200M
    NOLOGGING;

CREATE TABLESPACE rollon_staging_idx
    DATAFILE '+DATA'
    SIZE 100M
    NOLOGGING;

RMAN> # Implement Incrementally-Updateable Image Copy (IUIC) strategy
2> # for RPO of 24 hours (i.e., RECOVERY WINDOW OF ONE DAY)
3> RUN {
4> RECOVER
5>    COPY OF DATABASE
6>    WITH TAG 'img_cpy_upd';
7> BACKUP
8>    INCREMENTAL LEVEL 1
9>    FOR RECOVER OF COPY WITH TAG 'img_cpy_upd'
10>    SECTION SIZE 100M
11>    DATABASE
12>    PLUS ARCHIVELOG DELETE INPUT;
13> }

Starting backup at 2014-04-07 13:29:50
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
no parent backup or copy of datafile 16 found
no parent backup or copy of datafile 17 found
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=+DATA/NCDB121/DATAFILE/sysaux.283.836524477
backing up blocks 1 through 12800
channel ORA_DISK_1: starting piece 1 at 2014-04-07 13:29:52
channel ORA_DISK_2: starting incremental level 1 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATA/NCDB121/DATAFILE/system.282.836524551
backing up blocks 1 through 12800
channel ORA_DISK_2: starting piece 1 at 2014-04-07 13:29:53
channel ORA_DISK_3: starting incremental level 1 datafile backup set
channel ORA_DISK_3: specifying datafile(s) in backup set
input datafile file number=00004 name=+DATA/NCDB121/DATAFILE/undotbs1.284.836524619
backing up blocks 1 through 12800
channel ORA_DISK_3: starting piece 1 at 2014-04-07 13:29:53
channel ORA_DISK_4: starting incremental level 1 datafile backup set
channel ORA_DISK_4: specifying datafile(s) in backup set
input datafile file number=00012 name=+FRA/NCDB121/DATAFILE/ado_cold_data.283.841267371
backing up blocks 1 through 12800
channel ORA_DISK_4: starting piece 1 at 2014-04-07 13:29:55
channel ORA_DISK_1: finished piece 1 at 2014-04-07 13:29:55
piece
. . .
{ omitted for sake of brevity }
. . .
handle=+FRA/NCDB121/BACKUPSET/2014_04_07/nnndn1_img_cpy_upd_0.279.844263037
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_4: finished piece 1 at 2014-04-07 13:30:38
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/ncnnn1_img_cpy_upd_0.356.844263037
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_4: backup set complete, elapsed time: 00:00:01
Finished backup at 2014-04-07 13:30:38

Starting backup at 2014-04-07 13:30:39
current log archived
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=515 RECID=16 STAMP=844263040
channel ORA_DISK_1: starting piece 1 at 2014-04-07 13:30:40
channel ORA_DISK_1: finished piece 1 at 2014-04-07 13:30:41
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/annnf0_img_cpy_upd_0.359.844263041
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_1: deleting archived log(s)
archived log file name=+FRA/NCDB121/ARCHIVELOG/2014_04_07/thread_1_seq_515.358.844263039
    RECID=16 STAMP=844263040
Finished backup at 2014-04-07 13:30:43


#####
# Results from 3rd run:
# - The Incremental Level 1 backups from the 1st night's backups will now
#   be applied to their Incremental Level 0 image copy "parents."
# - Incremental Level 1 backups of any changed datafiles will be taken.
# - The Incremental Level 0 image copy backup for the two new datafiles
#   created during the 2nd night's run will not yet be updated from
#   Incremental Level 1 backups until the next night's run.
#####

RMAN> # Implement Incrementally-Updateable Image Copy (IUIC) strategy
2> # for RPO of 24 hours (i.e., RECOVERY WINDOW OF ONE DAY)
3> RUN {
4> RECOVER
5>    COPY OF DATABASE
6>    WITH TAG 'img_cpy_upd';
7> BACKUP
8>    INCREMENTAL LEVEL 1
9>    FOR RECOVER OF COPY WITH TAG 'img_cpy_upd'
10>    SECTION SIZE 100M
11>    DATABASE
12>    PLUS ARCHIVELOG DELETE INPUT;
13> }
13> }
Starting recover at 2014-04-07 14:27:31
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=496 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=13 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=38 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=486 device type=DISK
no copy of datafile 16 found to recover
no copy of datafile 17 found to recover
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile copies to recover
recovering datafile copy file number=00003 name=+FRA/NCDB121/DATAFILE/sysaux.353.844262767
channel ORA_DISK_1: restoring section 1 of 15
channel ORA_DISK_1: reading from backup piece
    +FRA/NCDB121/BACKUPSET/2014_04_07/nnndn1_img_cpy_upd_0.338.844262993
channel ORA_DISK_2: starting incremental datafile backup set restore
channel ORA_DISK_2: specifying datafile copies to recover
recovering datafile copy file number=00001 name=+FRA/NCDB121/DATAFILE/system.357.844262767
channel ORA_DISK_2: restoring section 1 of 8
channel ORA_DISK_2: reading from backup piece
    +FRA/NCDB121/BACKUPSET/2014_04_07/nnndn1_img_cpy_upd_0.267.844262993
channel ORA_DISK_3: starting incremental datafile backup set restore
channel ORA_DISK_3: specifying datafile copies to recover
recovering datafile copy file
    number=00004 name=+FRA/NCDB121/DATAFILE/undotbs1.354.844262767
channel ORA_DISK_3: restoring section 1 of 6
channel ORA_DISK_3: reading from backup piece
    +FRA/NCDB121/BACKUPSET/2014_04_07/nnndn1_img_cpy_upd_0.268.844262993
...
{ omitted for sake of brevity }
...
{ omitted for sake of brevity }
...
channel ORA_DISK_4: piece handle=
    +FRA/NCDB121/BACKUPSET/2014_04_07/nnndn1_img_cpy_upd_0.303.844263035 tag=IMG_CPY_UPD
channel ORA_DISK_4: restored backup piece 1
channel ORA_DISK_4: restore complete, elapsed time: 00:00:01
channel ORA_DISK_2: piece handle=
    +FRA/NCDB121/BACKUPSET/2014_04_07/nnndn1_img_cpy_upd_0.279.844263037 tag=IMG_CPY_UPD
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: restore complete, elapsed time: 00:00:00
Finished recover at 2014-04-07 14:27:49

Starting backup at 2014-04-07 14:27:50
current log archived
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=516 RECID=17 STAMP=844266471
channel ORA_DISK_1: starting piece 1 at 2014-04-07 14:27:51
channel ORA_DISK_1: finished piece 1 at 2014-04-07 14:27:52
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/annnf0_img_cpy_upd_0.360.844266471
    tag=IMG_CPY_UPD comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_1: deleting archived log(s)
archived log file name=
    +FRA/NCDB121/ARCHIVELOG/2014_04_07/thread_1_seq_516.358.844266471
    RECID=17 STAMP=844266471
Finished backup at 2014-04-07 14:27:52

Starting backup at 2014-04-07 14:27:52
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 12 found
no parent backup or copy of datafile 2 found
no parent backup or copy of datafile 10 found
no parent backup or copy of datafile 14 found
no parent backup or copy of datafile 15 found
no parent backup or copy of datafile 8 found
no parent backup or copy of datafile 9 found
no parent backup or copy of datafile 11 found
no parent backup or copy of datafile 13 found
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=+DATA/NCDB121/DATAFILE/sysaux.283.836524477
channel ORA_DISK_2: starting datafile copy
input datafile file number=00001 name=+DATA/NCDB121/DATAFILE/system.282.836524551
channel ORA_DISK_3: starting datafile copy
input datafile file number=00004 name=+DATA/NCDB121/DATAFILE/undotbs1.284.836524619
channel ORA_DISK_4: starting datafile copy
input datafile file number=00012 name=+FRA/NCDB121/DATAFILE/ado_cold_data.283.841267371
output file name=+FRA/NCDB121/DATAFILE/ado_cold_data.363.844266475
    tag=IMG_CPY_UPD RECID=114 STAMP=844266520
channel ORA_DISK_4: datafile copy complete, elapsed time: 00:00:56
...
{ omitted for sake of brevity }
...
{ omitted for sake of brevity }
...
handle=+FRA/NCDB121/BACKUPSET/2014_04_07/ncnnn1_img_cpy_upd_0.293.844266629 tag=IMG_CPY_UPD
    comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:01
Finished backup at 2014-04-07 14:30:31

Starting backup at 2014-04-07 14:30:31
current log archived
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=517 RECID=18 STAMP=844266631
channel ORA_DISK_1: starting piece 1 at 2014-04-07 14:30:31
channel ORA_DISK_1: finished piece 1 at 2014-04-07 14:30:32
piece handle=+FRA/NCDB121/BACKUPSET/2014_04_07/annnf0_img_cpy_upd_0.301.844266633 tag=IMG_CPY_UPD
    comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_1: deleting archived log(s)
archived log file name=+FRA/NCDB121/ARCHIVELOG/2014_04_07/thread_1_seq_517.270.844266631
    RECID=18 STAMP=844266631
Finished backup at 2014-04-07 14:30:33

Validating RMAN Backups

You’ve developed an appropriate backup strategy for your database using RMAN, and you’ve performed sufficient backups to guarantee your database’s RPO. However, you could still encounter a serious situation in which recovery can’t be performed as you intended due to issues with your existing backup copies. For example, what if a crucial image copy backup or backup set were corrupted or damaged after their successful creation? To avoid any such situation, it is highly recommended that you test your backups frequently. In an ideal situation, testing is accomplished by restoring the backups to another host and performing a full database recovery scenario; however, this method of testing is sometimes infeasible due to the lack of facilities.

RMAN is completely capable of validating that your backups as well as your existing Oracle database are free of any possible logical or physical block corruption. The validation can be performed without actually performing the restoration of the backups and/or recovery of the database, and you will be assured of having reliable backups that can be used for database restoration and recovery.

To validate your database’s control files, all datafiles, and SPFILE (if one exists), use the following command in RMAN:

RMAN> VALIDATE DATABASE;

The following RMAN command will validate just one datafile—in this case, one of the datafiles for the SYSTEM tablespace:

RMAN> VALIDATE DATAFILE 1;

Use VALIDATE BACKUPSET when you suspect that one or more backup pieces in a backup set are missing or have been damaged. The following command checks every block in a backup set to ensure that the backup can be restored:

RMAN> run
{
    allocate channel ch1 type 'SBT_TAPE';
    VALIDATE BACKUPSET <backupset_number>;
    release channel ch1;
}

The following RMAN command script will validate that datafiles or archived redo logs that have been backed up contain no logical or physical corruption:

RMAN> run {
    BACKUP VALIDATE
    CHECK LOGICAL
    DATABASE AND ARCHIVELOG ALL;
}

Backup Optimization and Tuning

Having a good backup and recovery strategy to protect an organization’s precious data is undoubtedly one of the key goals of an Oracle DBA. At the same time, it is very important to minimize the performance impact of your backup strategy. Here are some basic optimization and tuning principles for database backup and restore/recovery operations that will help in reducing the required backup window and improve the performance of recovery operations:

Image Split backups for extremely large datafiles into multiple sections by specifying an appropriate value for the SECTION SIZE parameter. For example, this RMAN command leverages multisection backups to divide a datafile backup into multiple sections of 2 GB, which will boost the speed of both backup and restore operations:

RMAN> BACKUP SECTION SIZE 2G TABLESPACE <tablespace_name>;

Image Use the BACKUP OPTIMIZATION setting in RMAN to skip any datafile during backup or restore operations when certain conditions are met. For example, during backup, if the datafile has not changed its SCN from its last backup, RMAN won’t back up the file. Similarly, during datafile restoration, if a datafile exists in its location with a matching SCN of the datafile being restored from backup, RMAN won’t bother to attempt to restore the datafile.

Image To restrict the backup window to avoid any performance pitfalls on the server for long-running backups, use BACKUP DURATION clause.

Image Review dynamic views V$BACKUP_ASYNC_IO and V$BACKUP_SYNC_IO to identify any performance bottlenecks and learn about the status of ongoing and recently performed backup and recovery operations.

Image If your operating system doesn’t support asynchronous I/O, set a nonzero value for the DBWR_IO_SLAVE initialization parameter.

Image Adjust the tape I/O buffer size if you are creating backups on tape media or using a virtual tape library.

Image Ensure you have set the LARGE_POOL_SIZE initialization parameter to a sufficient size, especially when backing up directly to tape. Use the formula LARGE_POOL_SIZE = number_of_allocated_channels * (16MB + (4 * size_of_buffer_tape)) to determine the optimal size.

Image Allocate multiple channels to improve the database backup performance.

Image Generate an Automatic Workload Repository (AWR) report and verify the top wait events to identify any RMAN backup related wait events—for example, sbtwrite2, sbtbackup, and so on.

Image Finally, when all else fails during diagnosis of performance issues, consider using the DEBUG and TRACE parameters when invoking RMAN to get detailed information about the issue to analyze the bottleneck:

RMAN> CONFIGURE CHANNEL DEVICE TYPE DISK DEBUG=5 TRACE 1;

Tuning Disk-Based Backup Performance

To improve the performance of disk-based backups, you need to fine tune the RMAN memory-related hidden initialization parameters: _backup_disk_bufsz, _backup_disk_bufcnt, _backup_file_bufcnt, and _backup_file_bufsz. The values can be set dynamically in the RMAN’s run script or set at database level.

The following query displays the list of hidden parameters pertaining to RMAN:

SQL> select
   ksppinm,
   ksppdesc
from
   x$ksppi
where
   substr(ksppinm,1,5) = '_back'
order by
   1,2;

KSPPINM                        KSPPDESC
------------------------------ ------------------------------------------------------
_backup_align_write_io         align backup write I/Os
_backup_disk_bufcnt            number of buffers used for DISK channels
_backup_disk_bufsz             size of buffers used for DISK channels
_backup_disk_io_slaves         BACKUP Disk I/O slaves
_backup_dynamic_buffers        dynamically compute backup/restore buffer sizes
_backup_encrypt_opt_mode       specifies encryption block optimization mode
_backup_file_bufcnt            number of buffers used for file access
_backup_file_bufsz             size of buffers used for file access
_backup_io_pool_size           memory to reserve from the large pool
_backup_kgc_blksiz             specifies buffer size to be used by HIGH compression
_backup_kgc_bufsz              specifies buffer size to be used by BASIC compression
_backup_kgc_memlevel           specifies memory level for MEDIUM compression
_backup_kgc_niters             specifies number of iterations done by BASIC compression
_backup_kgc_perflevel          specifies compression (performance) level for MEDIUM compression
_backup_kgc_scheme             specifies compression scheme
_backup_kgc_type               specifies compression type used by kgc BASIC compression
_backup_kgc_windowbits         specifies window size for MEDIUM compression
_backup_ksfq_bufcnt            number of buffers used for backup/restore
_backup_ksfq_bufcnt_max        maximum number of buffers used for backup/restore
_backup_ksfq_bufsz             size of buffers used for backup/restore
_backup_lzo_size               specifies buffer size for LOW compression
_backup_max_gap_size           largest gap in an incremental/optimized backup buffer, in bytes
_backup_seq_bufcnt             number of buffers used for non-DISK channels
_backup_seq_bufsz              size of buffers used for non-DISK channels

Using RMAN for RAC Databases

Managing Oracle database backup and recovery operations through RMAN is identical for RAC and non-RAC database environments, with just a few exceptions. In a RAC environment, RMAN backup workloads can be distributed evenly across multiple instances of the RAC database to utilize node resources efficiently.

Before configuring load-balance settings for a RAC database, enable parallelism as follows:

RMAN> configure device type [sbt|disk] parallelism 2;

The following example demonstrates automatic load balancing of RMAN database backups in a RAC database with two instances. First, create a database service for load balancing:

$ srvctl add service –d RDB –s rdb_main –r rdb1,rdb2
$ srvctl start service –d RDB –s rdb_main

Next, add entries to each node’s tnsnames.ora network configuration file to leverage the new service name for load balancing:

RDB_MAIN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rdbscan)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = RDB_MAIN)
    )
  )

The following example demonstrates an RMAN database backup script that uses a dynamic channel allocation for load balancing:

RMAN> run
{
    allocate channel ch1 connect 'sys/password@rdb_main';
    backup database;
}

Alternatively, you can allocate dedicated channels to each instance. The following example entries for tnsnames.ora permits dedicated channel allocation:

RDB1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = prdn1)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = RDB)
    )
  )

RDB2 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = prdn22)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = RDB)
    )
  )

And here is an RMAN database backup script that demonstrates how to allocate specific channels to specific RAC instances for load balancing:

RMAN> run
{
    allocate channel ch1 connect 'sys/password@rdb1';
    allocate channel ch2 connect 'sys/password@rdb2';
    backup database;
}

Load balancing RMAN database backup workload across multiple instances can help prevent resource bottlenecks on the nodes while providing efficient use of resources on the node.

Finally, when using these techniques within multiple instances of a RAC database, don’t forget that it is essential to store the snapshot control file in a shared location—preferably in an Automatic Storage Management (ASM) disk group—to prevent an “ORA-00245 control file backup operation failed” error and backup failures:

RMAN> configure snapshot controlfile name to '+DG_FRAsnapcf_RDB';

Retaining Data in a Recovery Catalog

By default, RMAN stores all backup and recovery operational history in the database’s control file and retains it for seven days. To retain the backup and restore operational information for longer than the default timeframe, you can increase the CONTROL_FILE_RECORD_KEEP_TIME parameter value to a longer period, but no more than 365 days. If you are planning to retain RMAN backups for a period longer than 365 days and keep them readily available for restoration and recovery purposes, the only alternative is to create an RMAN recovery catalog. This catalog is essentially a copy of the RMAN information retained in the database’s control file, and it stores the metadata of database backup and recovery operations for a much longer period of time.

Another advantage of the recovery catalog is that if you were to lose all copies of your database’s control files, you can still restore and recover the database using the information stored in the catalog, including the control file. Additionally, you can develop RMAN scripts and store them in the catalog to use with different databases. Finally, if you are using Oracle Data Guard as your disaster recovery solution and plan to offload your backup processing to run against a corresponding physical standby database, a recovery catalog is required to capture all backup information to enable the primary database to leverage those backups.

Having a Robust Recovery Strategy

You’ve prepared a robust backup strategy, you’ve tuned your backup’s performance so that it’s optimal, and perhaps you’ve even simulated a disaster recovery via Oracle Data Guard. But if you can’t handle the recovery scenario that is thrown at you, then all of this preparation is virtually useless. It’s therefore incumbent on you to be alert and ready for any recovery situation that could arise from any vector. Following are some of the most common recovery scenarios that you might encounter and that you must be prepared to handle:

Image Incomplete recovery of an entire database

Image Complete database recovery with zero data loss

Image Recovery of corrupted or missing control files due to loss of all control file copies

Image Datafile-level and tablespace-level recovery

Image Tablespace point-in-time recovery

Image Recovery of a single table or table partition

Image Recovery of one or more database blocks

Image Recovery of a lost, damaged, or missing SPFILE

Image Loss of one or more online redo log members

Image Loss of an entire online redo log group (especially the current group!)

Image Complete disaster recovery when the entire database—all datafiles, control file copies, and SPFILE—is lost

Image Hardware, firmware, or software failures (including partial or complete loss of an Oracle Clusterware, Grid Infrastructure, or database home and related binary files)

RMAN provides you the ability to test your recovery scenario without actually performing the recovery operation. The following command verifies existing backups and confirms the smooth recovery:

RMAN> RESTORE DATABASE VALIDATE;
RMAN> RESTORE ARCHIVELOG ALL VALIDATE;

The following provides best practices for a faster database recovery:

Image Use a proper value for FILESPERSET, MAXOPENFILES, and MAXPIECESIZE.

Image Ensure that you turned on the BACKUP OTPIMIZATION setting to avoid restoring datafiles that don’t need to be restored.

Image Set the LARGE_POOL_SIZE initialization parameter value as per the recommendation previously mentioned to speed up restoration of all database files.

Image Enable channels parallelism to quickly restore many database files. For example, use the following CONFIGURE setting to specify the use of four parallel channels for all disk-based backup and restore operations:

RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 8;

Image Leverage parallelism to quickly recover datafiles. For example, use the following RMAN command to specify the use of eight degrees of parallelism during application of redo entries during recovery:

RMAN> recover database parallel 8;

Image Use the following query to monitor the progress of the recovery operation:

SQL> SELECT name,item,sofar,total FROM v$recovery_progress;

Leveraging the Data Recovery Advisor

Not only is an Oracle DBA often tasked with managing multiple databases at varying release levels, she may also encounter a complex recovery situation that she has never encountered before but that must be resolved immediately to restore application access to a crucial database. This pressure may lead her to choose a recovery solution that takes an inordinate amount of time, or even worse, choose a totally incorrect solution.

The good news is that starting in Oracle Database 11g, the Data Recovery Advisor (DRA) tool assists in determining, detecting, and suggesting appropriate data repair options. The DRA tool is configured automatically and is quite simple to use. For example, the LIST FAILURE command shows the most recently detected failures:

RMAN> LIST FAILURE;

<example output>?

Once the DBA has decided which failure to repair, she can simply request RMAN to diagnose the failure and provide cogent advice:

RMAN> ADVISE FAILURE;

She can ask DRA to preview exactly what actions it will take to repair the detected problems:

RMAN> REPAIR FAILURE PREVIEW;

Finally, she can even ask DRA to execute the recommended repair instructions:

RMAN> REPAIR FAILURE;


Note

As of this writing, DRA supports only single-instance databases, so if you attempt to use it against a RAC database, you will encounter the following error:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of list command at 04/13/2015 13:58:43
RMAN-05533: LIST FAILURE is not supported on RAC database


Summary

This chapter explained the importance of adhering to the recommended best practices for Oracle database backup, restoration, and recovery, as well as how to design and develop appropriate strategies to meet your organization’s business demands based on the database’s RPO and RTO. We also discussed how to test and validate your backups to ensure smooth recoverability. The chapter also offered some simple backup and recovery performance tuning principles, including leveraging DRA for quick and accurate diagnosis of database recovery issues.

Here is a quick wrap-up of the top Oracle backup and recovery recommendations to fine-tune your database backup and recovery operations:

Image Configure an appropriately-sized FRA for each database.

Image Turn on BCT for faster incremental backups.

Image Validate your backups with RESTORE DATABASE VALIDATE and RESTORE ARCHIVE LOG ALL commands.

Image If no RMAN catalog is configured, increase the default for CONTROL_FILE_RECORD_KEEP_TIME to a higher value.

Image Always turn on control file auto backups with the CONFIGURE CONTROLFILE AUTOBACKUP ON; command.

Image Turn on CONFIGURE BACKUP OPTIMIZATION to avoid any unnecessary backup and restore operations.

Image Before performing a database recovery, review and evaluate RMAN’s planned recovery via the RECOVER DATABASE TEST command.

Image Validate recovery procedure and requirements with RECOVER DATABASE <DBNAME>.

Image Exclude read-only tablespaces once you back up.

Image When backing up a RAC database, remember to store the snapshot control file in a shared location.

Image Distribute the backup workload across RAC database instances.

Image Manage the length of the backup window with the BACKUP DURATION clause.

Image Consider increasing the _BACKUP_KSFQ_BUFCNT and _BACKUP_KSFQ_BUFSZ parameters when restoring datafiles to ASM disk groups.

Image Review the V$RMAN_STATUS dynamic view to monitor the backup status and backup start/end times.

Image On a regular basis, test your backup strategy via restore and recovery operations to a different database server.

Image Delete obsolete and expired backup copies either manually or automatically by configuring recovery window and FRA policies.

Image Minimize the number of datafiles per backup set.

Image Limit the RMAN backupset size to avoid having a huge backup set size.

Image Don’t rely on the default RMAN configuration; try to tune the configuration according to need.

Image Finally, the most important recommendation is to test, retest, and retest again your backup strategy against all possible recovery scenarios!

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset