10.1. Understanding and Configuring Recovery Components

A number of structures and events in the database directly support backup and recovery operations. The control files maintain the list of physical files that make up the database, along with a record of the most recent database backups (if you are using RMAN for your backups). The checkpoint (CKPT) background process works in concert with the database writer (DBWn) process to manage the amount of time required for instance recovery; during instance recovery, the redo log files are used to synchronize the datafiles. For more serious types of failures, such as media failures, archived redo log files are applied to a restored backup copy of a datafile to synchronize the datafiles and ensure that no committed transactions are lost. Finally, the Flash Recovery area, new to Oracle 10g, is a common area for all recovery-related files that makes your job much easier when backing up or recovering your database.

To maximize your database's availability, it almost goes without saying that you want to perform regularly scheduled backups. Most media failures require some kind of restoration of a datafile from a disk or tape backup before you can initiate media recovery.

In addition to regularly scheduled backups (see the section "Automating Backups" near the end of this chapter), you can configure a number of other features to maximize your database's availability and minimize recovery time: multiplexing control files, multiplexing redo log files, configuring the database in ARCHIVELOG mode, and using a Flash Recovery area.

10.1.1. Control Files

The control file is one of the smallest, yet one of the most critical, files in your database. Recovering from the loss of one copy of a control file is relatively straightforward; recovering from the loss of your only control file or all control files is more of a challenge and requires more advanced recovery techniques.

NOTE

Recovering from the loss of a control file is covered in Chapter 11.

In the following sections, we will give you an overview of the control file architecture as well as show you how to maximize the recoverability of the control file in the section "Multiplexing Control Files."

10.1.1.1. Control File Architecture

The control file is a relatively small (in the megabyte range) binary file that contains information about the structure of the database. You can think of the control file as a metadata repository for the physical database: it records the structure of the database, that is, the datafiles and redo log files that constitute the database. The control file is created when the database is created and is updated whenever the physical structure changes, for example, when you add or rename a datafile.

The control file is updated continuously and should be available at all times. Don't edit the contents of the control file; only Oracle processes should update its contents. When you start up the database, Oracle uses the control file to identify the datafiles and redo log files and opens them. Control files play a major role when recovering a database.

The contents of the control file include the following (a query sketch for inspecting these records appears after the list):

  • The database name to which the control file belongs. A control file can belong to only one database.

  • The database creation time stamp.

  • The name, location, and online/offline status information of the datafiles.

  • The name and location of the redo log files.

  • Redo log archive information.

  • Tablespace names.

  • The current log sequence number, which is a unique identifier that is incremented and recorded when an online redo log file is switched.

  • The most recent checkpoint information.

NOTE

Checkpoints are discussed in more detail later in this chapter in the section "Understanding Checkpoints."

  • The beginning and ending of undo segments.

  • Recovery Manager's backup information. Recovery Manager (RMAN) is the Oracle utility you use to back up and recover databases.
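As a quick way to inspect what the control file is tracking, you can query the dynamic performance views V$CONTROLFILE (the control file locations) and V$CONTROLFILE_RECORD_SECTION (the record sections summarized above). This is only an illustrative sketch; the record types and counts returned will vary from database to database.

SQL> SELECT name FROM v$controlfile;

SQL> SELECT type, records_total, records_used
  2    FROM v$controlfile_record_section
  3   ORDER BY type;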

The control file size is determined by the MAX clauses you provide when you create the database:

  • MAXLOGFILES

  • MAXLOGMEMBERS

  • MAXLOGHISTORY

  • MAXDATAFILES

  • MAXINSTANCES

Oracle preallocates space for these maximums in the control file. Therefore, when you add or rename a file in the database, the control file size does not change.

When you add a new file to the database or relocate a file, an Oracle server process immediately updates the information in the control file. Back up the control file after any structural changes. The log writer (LGWR) process updates the control file with the current log sequence number. CKPT updates the control file with the recent checkpoint information. When the database is in ARCHIVELOG mode, the archiver (ARCn) processes update the control file with information such as the archived log filename and log sequence number.

The control file contains two types of record sections: reusable and not reusable. RMAN information is kept in the reusable section. Items such as the names of the backup datafiles are kept in this section, and once this section fills up, the oldest entries are reused in a circular fashion, but only after they are older than the number of days specified by the initialization parameter CONTROL_FILE_RECORD_KEEP_TIME. Therefore, the control file can continue to grow if new RMAN backup information must be recorded before the existing entries are old enough to be reused.
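You can check the retention period for the reusable section and, if necessary, adjust it, because CONTROL_FILE_RECORD_KEEP_TIME is a dynamic parameter. The following sketch uses an illustrative value of 14 days and assumes the instance was started with an SPFILE (so SCOPE=BOTH is valid); pick a value that matches your backup retention policy.

SQL> SHOW PARAMETER control_file_record_keep_time

SQL> ALTER SYSTEM SET control_file_record_keep_time = 14 SCOPE=BOTH;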

10.1.1.2. Multiplexing Control Files

Because the control file is critical for database operation, at a minimum have two copies of the control file; Oracle recommends a minimum of three copies. You duplicate the control file on different disks either by using the multiplexing feature of Oracle or by using the mirroring feature of your operating system. If you have multiple disk controllers on your server, at least one copy of the control file should reside on a disk managed by a different disk controller.

If you use the Database Configuration Assistant (DBCA) to create your database, three copies of the control files are multiplexed by default. In Figure 10.1, on the DBCA Database Storage screen, you can see that DBCA allows you to specify additional copies of the control file or change the location of the control files.

The next two sections discuss the two ways that you can implement the multiplexing feature: using a client- or server-side init.ora (available before Oracle9i) and using the server-side SPFILE (available with Oracle9i and later).

Figure 10.1. DBCA Storage control files

10.1.1.2.1. Multiplexing Control Files Using init.ora

Multiplexing means keeping a copy of the same control file (or other file) on different disk drives, ideally attached to different controllers. To multiplex the control file, copy it to multiple locations and change the CONTROL_FILES parameter in the text-based initialization file init.ora to list all the control file names. The following syntax shows three multiplexed control files.

CONTROL_FILES = ('/ora01/oradata/MYDB/ctrlMYDB01.ctl',
                  '/ora02/oradata/MYDB/ctrlMYDB02.ctl',
                  '/ora03/oradata/MYDB/ctrlMYDB03.ctl')

By storing the control file on multiple disks, you avoid the risk of a single point of failure. When multiplexing control files, updates to the control file can take a little longer, but that is insignificant when compared with the benefits. If you lose one control file, you can restart the database after copying one of the other control files or after changing the CONTROL_FILES parameter in the initialization file.

When multiplexing control files, Oracle updates all the control files at the same time, but uses only the first control file listed in the CONTROL_FILES parameter for reading.

When creating a database, you can list the control file names in the CONTROL_FILES parameter, and Oracle creates as many control files as are listed. You can have a maximum of eight multiplexed control file copies.

If you need to add more control file copies, follow these steps:

  1. Shut down the database.

    SQL> SHUTDOWN NORMAL

  2. Copy the control file to more locations by using an operating system command:

    $ cp /u02/oradata/ord/control01.ctl /u05/oradata/ord/control04.ctl

  3. Change the initialization parameter file to include the new control file name(s) in the CONTROL_FILES parameter, changing this:

    CONTROL_FILES='/u02/oradata/ord/control01.ctl', 
    '/u03/oradata/ord/control02.ctl', 
    '/u04/oradata/ord/control03.ctl'

    to this:

    CONTROL_FILES='/u02/oradata/ord/control01.ctl', 
    '/u03/oradata/ord/control02.ctl', 
    '/u04/oradata/ord/control03.ctl', 
    '/u05/oradata/ord/control04.ctl'

  4. Start up the instance.

    SQL> STARTUP

These procedures are somewhat similar to the procedures for recovering from the loss of a control file.

NOTE

We will provide examples of control file recovery in Chapter 11.

After creating the database, you can change the location of the control files, rename the control files, or drop certain control files. You must have at least one control file for each database. To add, rename, or delete control files, you follow the preceding steps. Basically, you shut down the database; copy, rename, or delete the control files at the operating system level as appropriate; modify the CONTROL_FILES parameter in the init.ora parameter file; and start up the database.

10.1.1.2.2. Multiplexing Control Files Using an SPFILE

Multiplexing using a binary SPFILE is similar to multiplexing using init.ora. The major difference is in how the CONTROL_FILES parameter is changed. Follow these steps:

  1. Alter the SPFILE while the database is still open:

    SQL> ALTER SYSTEM SET CONTROL_FILES =
          '/ora01/oradata/MYDB/ctrlMYDB01.ctl',
          '/ora02/oradata/MYDB/ctrlMYDB02.ctl',
          '/ora03/oradata/MYDB/ctrlMYDB03.ctl',
          '/ora04/oradata/MYDB/ctrlMYDB04.ctl' SCOPE=SPFILE;

    Because of the SCOPE=SPFILE qualifier, this parameter change takes effect only after the next instance restart. The contents of the binary SPFILE are changed immediately, but the old specification of CONTROL_FILES is used until the instance is restarted.

  2. Shut down the database:

    SQL> SHUTDOWN NORMAL

  3. Copy an existing control file to the new location:

    $ cp /ora01/oradata/MYDB/ctrlMYDB01.ctl /ora04/oradata/MYDB/ctrlMYDB04.ctl

  4. Start the instance:

    SQL> STARTUP

10.1.2. Understanding Checkpoints

The checkpoint background process controls the amount of time required for instance recovery. During a checkpoint, CKPT updates the control file and the headers of the datafiles to reflect the last successful transaction by recording the last system change number (SCN). The SCN, a number sequentially assigned to each transaction in the database, is also recorded in the control file against the name of any datafile that is taken offline or made read-only.

A checkpoint occurs automatically every time a redo log file switch occurs, either when the current redo log file fills up or when you manually switch redo log files. The DBWn processes, in conjunction with CKPT, routinely write new and changed buffers to advance the checkpoint from which instance recovery can begin, thus reducing the mean time to recovery (MTTR).
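If you want to see the checkpoint information that CKPT records, you can compare the checkpoint SCN stored in the control file with the checkpoint SCNs stored in the datafile headers. The following queries are a minimal sketch; after a clean checkpoint the values should match.

SQL> SELECT checkpoint_change# FROM v$database;

SQL> SELECT file#, checkpoint_change#, checkpoint_time
  2    FROM v$datafile_header;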

NOTE

More information on tuning the MTTR and how often checkpointing occurs can be found in Chapter 11.

10.1.3. Redo Log Files

A redo log file records all changes to the database, in most cases before the changes are written to the datafiles.

To recover from an instance or a media failure, redo log information is required to roll the datafiles forward to the last committed transaction. Ensuring that you have at least two members for each redo log group dramatically reduces the likelihood of data loss, because the database continues to operate if one member of a redo log group is lost.

NOTE

How to recover from the loss of a single redo log group member is covered in Chapter 11; recovery from the loss of an entire log group is covered in OCP: Oracle 10g Administration II Study Guide (Sybex, 2005).

In this section, we will give you an architectural overview of redo log files, as well as show you how to add redo log groups, add or remove redo log group members, and clear a redo log group in case one of the redo log group's members becomes corrupted.

10.1.3.1. Redo Log File Architecture

Online redo log files are filled with redo records. A redo record, also called a redo entry, is made up of a group of change vectors, each of which describes a change made to a single block in the database. Redo entries record data that you can use to reconstruct all changes made to the database, including the undo segments. When you recover the database by using redo log files, Oracle reads the change vectors in the redo records and applies the changes to the relevant blocks.

The LGWR process writes redo information from the redo log buffer to the online redo log files under a variety of circumstances:

  • When a user commits a transaction, even if this is the only transaction in the log buffer.

  • When the redo log buffer becomes one-third full.

  • When the buffer contains approximately 1MB of changed records. This total does not include deleted or inserted records.

NOTE

LGWR always writes its records to the online redo log file before DBWn writes new or modified database buffer cache records to the datafiles.

Each database has its own online redo log groups. A redo log group can have one or more redo log members (each member is a single operating system file). In a Real Application Clusters (RAC) configuration, in which multiple instances mount one database, each instance has its own online redo thread, and Oracle uses the thread number to keep track of which instance the database changes come from. A single-instance configuration has only one thread, and that thread number is 1. The redo log file contains both committed and uncommitted transactions. Whenever a transaction is committed, a system change number is assigned to the redo records to identify the committed transaction.
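You can see the thread number, group number, current log sequence number, and status of each online redo log group by querying V$LOG, and the individual member files by querying V$LOGFILE. A minimal sketch follows; the output depends on your redo log configuration.

SQL> SELECT group#, thread#, sequence#, members, status
  2    FROM v$log;

SQL> SELECT group#, member FROM v$logfile ORDER BY group#;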

The redo log group is referenced by an integer; you can specify the group number when you create the redo log files, either when you create the database or when you create a redo log group after you create the database. You can also change the redo log configuration (add/drop/rename files) by using database commands. The following example shows a CREATE DATABASE command.

CREATE DATABASE "MYDB01"
. . .
LOGFILE '/ora02/oradata/MYDB01/redo01.log' SIZE 10M,
        '/ora03/oradata/MYDB01/redo02.log' SIZE 10M;

Two log file groups are created here; the first file is assigned to group 1, and the second file is assigned to group 2. You can have more files in each group; this practice is known as the multiplexing of redo log files, which we'll discuss later in this chapter in the section "Multiplexing Redo Log Files." You can specify any group number—the range will be between 1 and the parameter MAXLOGFILES. Oracle recommends that all redo log groups be the same size. The following is an example of creating the log files by specifying the group number:

CREATE DATABASE "MYDB01"
. . .
LOGFILE GROUP 1 '/ora02/oradata/MYDB01/redo01.log' SIZE 10M,
        GROUP 2 '/ora03/oradata/MYDB01/redo02.log' SIZE 10M;

10.1.3.2. Log Switch Operations

The LGWR process writes to only one redo log file group at any time. The file that is actively being written to is known as the current log file. The log files that are required for instance recovery are known as the active log files. The other log files are known as inactive. Oracle automatically recovers an instance when starting up the instance by using the online redo log files. Instance recovery can be needed if you do not shut down the database cleanly or if your database server crashes.

NOTE

How instance recovery works is discussed in more detail in Chapter 11.

The log files are written in a circular fashion. A log switch occurs when Oracle finishes writing to one log group and starts writing to the next. A log switch always occurs when the current redo log group is completely full and log writing must continue. You can also force a log switch by using the ALTER SYSTEM SWITCH LOGFILE command, which can be necessary when performing maintenance on the redo log files. Figure 10.2 shows how LGWR writes to the redo log groups in a circular fashion.

Whenever a log switch occurs, Oracle assigns a log sequence number to the new redo log group before writing to it. If there are lots of transactions or changes to the database, the log switches can occur too frequently. Size the redo log files appropriately to avoid frequent log switches. Oracle writes to the alert log file whenever a log switch occurs.

Redo log files are written sequentially on the disk, so the I/O will be fast if there is no other activity on the disk. (The disk head is always properly positioned.) Keep the redo log files on a separate disk for better performance. If you have to store a datafile on the same disk as the redo log file, do not put the SYSTEM, UNDOTBS, SYSAUX, or any very active data or index tablespace file on this disk. A commit cannot complete until a transaction's information has been written to the redo logs, so maximizing the throughput of the redo log files is a top priority.


Figure 10.2. Redo log file usage

Database checkpoints are closely tied to redo log file switches. We introduced checkpoints earlier in the chapter in the section "Understanding Checkpoints." A checkpoint is an event that flushes the modified data from the buffer cache to the disk and updates the control file and datafiles. The CKPT process updates the headers of datafiles and control files; the actual blocks are written to the file by the DBW n process. A checkpoint is initiated when the redo log file is filled and a log switch occurs, when the instance is shut down with NORMAL, TRANSACTIONAL, or IMMEDIATE, when a tablespace status is changed to read-only or put into BACKUP mode, or when other values specified by certain parameters (discussed later in this section) are reached.

You can force a checkpoint if needed, as shown here:

ALTER SYSTEM CHECKPOINT;

Forcing a checkpoint ensures that all changes to the database buffers are written to the datafiles on disk.

Another way to force a checkpoint is by forcing a log file switch:

ALTER SYSTEM SWITCH LOGFILE;

The size of the redo log affects checkpoint performance. If the redo log files are small relative to the volume of transactions, a log switch occurs often, and so does a checkpoint. The DBWn process writes the dirty buffer blocks whenever a checkpoint occurs. This situation might reduce the time required for instance recovery, but it might also affect runtime performance. You can adjust checkpoints primarily by using the initialization parameter FAST_START_MTTR_TARGET, which replaces the deprecated parameters FAST_START_IO_TARGET and LOG_CHECKPOINT_TIMEOUT used in previous versions of the Oracle database. It is used to ensure that recovery time at instance startup (if recovery is required) will not exceed a certain number of seconds.
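As a hedged sketch, you can set a recovery time target and then check how the instance is tracking against it by querying V$INSTANCE_RECOVERY; the 60-second target shown here is purely illustrative.

SQL> ALTER SYSTEM SET fast_start_mttr_target = 60 SCOPE=BOTH;

SQL> SELECT target_mttr, estimated_mttr
  2    FROM v$instance_recovery;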

Real World Scenario: Redo Log Troubleshooting

In the case of redo log groups, it's best to be generous with the number of groups and the number of members for each group. After estimating the number of groups that would be appropriate for your installation, add one more. I can remember many database installations in which I was overly cautious about disk space usage and failed to put things into perspective: the slight additional work involved in maintaining additional or larger redo logs is small in relation to the time needed to fix a problem once the number of users and concurrent active transactions increases.

The space needed for additional log file groups is minimal and is well worth the effort up front to avoid the undesirable situation in which writes to the redo log file are waiting on the completion of writes to the database files or the archived log file destination.


NOTE

More information on adjusting FAST_START_MTTR_TARGET can be found in Chapter 11.

10.1.3.3. Multiplexing Redo Log Files

You can keep multiple copies of the online redo log file to safeguard against damage to these files. When multiplexing online redo log files, LGWR concurrently writes the same redo log information to multiple identical online redo log files, thereby eliminating a single point of redo log failure. All copies of the redo log file are the same size and are collectively known as a group, which is identified by an integer. Each redo log file in the group is known as a member. You must have at least two redo log groups for normal database operation.

When multiplexing redo log files, keeping the members of a group on different disks is preferable so that one disk failure will not affect the continuing operation of the database. If LGWR can write to at least one member of the group, database operation proceeds as normal, and an entry is written to the alert log file. If no members of a redo log group are available for writing, Oracle hangs, crashes, or shuts down. Instance recovery or media recovery can be needed to bring up the database, and you can lose committed transactions.

You can create multiple copies of the online redo log files when you create the database. For example, the following statement creates two redo log file groups with two members in each:

CREATE DATABASE "MYDB01"
. . .
LOGFILE
  GROUP 1 ('/ora02/oradata/MYDB01/redo0101.log',
           '/ora03/oradata/MYDB01/redo0102.log') SIZE 10M,
  GROUP 2 ('/ora02/oradata/MYDB01/redo0201.log',
           '/ora03/oradata/MYDB01/redo0202.log') SIZE 10M;

The maximum number of log file groups is specified in the clause MAXLOGFILES, and the maximum number of members is specified in the clause MAXLOGMEMBERS. You can separate the filenames (members) by using a space or a comma.

In the following sections, we will show you how to create a new redo log group, add a new member to an existing group, rename a member, and drop a member from an existing group. In addition, we'll show you how to drop a group and clear all members of a group in certain circumstances.

10.1.3.3.1. Creating New Groups

You can create and add more redo log groups to the database by using the ALTER DATABASE command. The following statement creates a new log file group with two members:

ALTER DATABASE ADD LOGFILE
  GROUP 3 ('/ora02/oradata/MYDB01/redo0301.log',
           '/ora03/oradata/MYDB01/redo0302.log') SIZE 10M;

If you omit the GROUP clause, Oracle assigns the next available number. For example, the following statement also creates a multiplexed group:

ALTER DATABASE ADD LOGFILE
      ('/ora02/oradata/MYDB01/redo0301.log',
       '/ora03/oradata/MYDB01/redo0302.log') SIZE 10M;

To create a new group without multiplexing, use the following statement:

ALTER DATABASE ADD LOGFILE
     '/ora02/oradata/MYDB01/redo0301.log' REUSE;

You can add more than one redo log group by using the ALTER DATABASE command—just use a comma to separate the groups.
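For example, the following sketch adds two multiplexed groups in a single statement; the group numbers and file paths are illustrative and assume the directories already exist.

ALTER DATABASE ADD LOGFILE
  GROUP 4 ('/ora02/oradata/MYDB01/redo0401.log',
           '/ora03/oradata/MYDB01/redo0402.log') SIZE 10M,
  GROUP 5 ('/ora02/oradata/MYDB01/redo0501.log',
           '/ora03/oradata/MYDB01/redo0502.log') SIZE 10M;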

If the redo log files you create already exist, use the REUSE option, and don't specify the size. The new redo log size will be the same as that of the existing file.


Adding a new redo log group is straightforward using the EM Database Control interface. To do so, click the Administration tab on the database home page, and then click the Redo Log Groups link. You can view and add another redo log group, as you can see in Figure 10.3 on the Redo Log Groups screen.

Figure 10.3. The Redo Log Groups maintenance screen

10.1.3.3.2. Adding New Members

If you forgot to multiplex the redo log files when creating the database (multiplexing redo log files is the default when you use DBCA) or if you need to add more redo log members, you can do so by using the ALTER DATABASE command. When adding new members, you do not specify the file size, because all group members will have the same size.

If you know the group number, use the following statement to add a member to group 2:

ALTER DATABASE ADD LOGFILE MEMBER
 '/ora04/oradata/MYDB01/redo0203.log' TO GROUP 2;

You can also add group members by specifying the names of other members in the group, instead of specifying the group number. Specify all the existing group members with this syntax:

ALTER DATABASE ADD LOGFILE MEMBER
 '/ora04/oradata/MYDB01/redo0203.log' TO
('/ora02/oradata/MYDB01/redo0201.log',
 '/ora03/oradata/MYDB01/redo0202.log');

You can add a new member to a group in the EM Database Control by clicking the Edit button in Figure 10.3 and then clicking Add.

10.1.3.3.3. Renaming Log Members

If you want to move a log file member from one disk to another, or you simply want a more meaningful name, you can rename a redo log member. Before renaming an online redo log member, the new (target) online redo log file must exist. The SQL commands in Oracle change only the internal pointer in the control file to the new log file; they do not change or rename the operating system file. You must use an operating system command to rename or move the file. Follow these steps to rename a log member (a SQL*Plus sketch of steps 3 through 5 follows the list):

  1. Shut down the database.

  2. Copy/rename the redo log file member to the new location by using an operating system command.

  3. Start up the instance and mount the database (STARTUP MOUNT).

  4. Rename the log file member in the control file. Use ALTER DATABASE RENAME FILE 'old_redo_file_name' TO 'new_redo_file_name';.

  5. Open the database (ALTER DATABASE OPEN).

  6. Back up the control file.
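Here is a sketch of steps 3 through 5 in SQL*Plus, assuming the member has already been copied at the operating system level to the illustrative new location shown:

SQL> STARTUP MOUNT

SQL> ALTER DATABASE RENAME FILE
  2    '/ora02/oradata/MYDB01/redo0201.log'
  3    TO '/ora05/oradata/MYDB01/redo0201.log';

SQL> ALTER DATABASE OPEN;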

Another way to achieve the same result is to add a new member to the group and then drop the old member from the group, as discussed in the "Adding New Members" section earlier in this chapter and in the "Dropping Redo Log Members" section later in this chapter.

You can rename a log group member in the EM Database Control by clicking the Edit button in Figure 10.3, clicking Edit again, and then changing the file name in the File Name box.

10.1.3.3.4. Dropping Redo Log Groups

You can drop a redo log group and its members by using the ALTER DATABASE command. Remember that you should have at least two redo log groups for the database to function normally. The group to be dropped should not be the active group or the current group; that is, you can drop only an inactive log file group. If the group to be dropped is not inactive, first use the ALTER SYSTEM SWITCH LOGFILE command.
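Before dropping a group, it is worth confirming that it is INACTIVE (and, if the database is in ARCHIVELOG mode, that it has been archived). A minimal check, using group 3 as the example:

SQL> SELECT group#, status, archived FROM v$log WHERE group# = 3;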

To drop the log file group 3, use the following SQL statement:

ALTER DATABASE DROP LOGFILE GROUP 3;

When an online redo log group is dropped from the database, the operating system files are not deleted from disk. The control files of the associated database are updated to drop the members of the group from the database structure. After dropping an online redo log group, make sure that the drop is completed successfully, and then use the appropriate operating system command to delete the dropped online redo log files.

You can delete an entire redo log group in the EM Database Control by clicking the Edit button (see Figure 10.3, shown earlier) and then clicking the Delete button.

10.1.3.3.5. Dropping Redo Log Members

In much the same way that you drop a redo log group, you can drop only the members of an inactive redo log group. Also, if there are only two groups, the member to be dropped should not be the last member of its group. Each redo log group can have a different number of members, though this is not advised. For example, say you have three log groups, each with two members. If you drop a member from group 2 and a failure then occurs to the sole remaining member of group 2, the instance will hang or crash and can cause the loss of committed transactions when attempts are made to write to the missing redo log group, as we discussed earlier in this chapter. Even if you drop a member for maintenance reasons, ensure that all redo log groups have the same number of members.

To drop a redo log member, use the DROP LOGFILE MEMBER clause of the ALTER DATABASE command:

ALTER DATABASE DROP LOGFILE MEMBER
 '/ora04/oradata/MYDB01/redo0203.log';

The operating system file is not removed from the disk; only the control file is updated. Use an operating system command to delete the redo log file member from disk.

If a database is running in ARCHIVELOG mode, redo log members cannot be dropped unless the redo log group has been archived.


You can drop a member of a redo log group in the EM Database Control by clicking the Edit button (see Figure 10.3, shown earlier), selecting the member to be dropped, and then clicking the Remove button.

10.1.3.3.6. Clearing Online Redo Log Files

Under certain circumstances, a redo log group member (or all members of a log group) can become corrupted. To solve this problem, you can drop and re-add the log file group or group member. It is much easier, however, to use the ALTER DATABASE CLEAR LOGFILE command. The following example clears the contents of redo log group 3 in the database:

ALTER DATABASE CLEAR LOGFILE GROUP 3;

Another distinct advantage of this command is that you can clear a log group even if the database has only two log groups and only one member in each group. You can also clear a log group even if it has not been archived by using the UNARCHIVED keyword. In this case, it is advisable to do a full database backup at the earliest convenience, because the unarchived redo log file is no longer usable for database recovery.
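If the group has not yet been archived, the following sketch shows the UNARCHIVED form of the command for the same illustrative group; take a full backup afterward, because the redo in that group is no longer available for recovery.

ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 3;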

10.1.4. Archived Redo Log Files

If you use only online redo log files, your database is protected against instance failure but not media failure. Although saving the redo log files before they are overwritten takes additional disk space and management, the increased recoverability of the database outweighs the slight additional overhead and maintenance costs.

In this section, we will present an overview of how archived redo log files work, how to set the location for saving the archived redo log files, and how to enable archiving in the database.

10.1.4.1. Archived Redo Log File Architecture

An archived redo log file is a copy of a redo log file before it is overwritten by new redo information. Because the online redo log files are reused in a circular fashion, you have no way of bringing a backup of a datafile up to the latest committed transaction unless you configure the database in ARCHIVELOG mode.

The process of copying is called archiving, and the archiver process (ARCn) performs it. By archiving the redo log files, you can use them later to recover a database, update a standby database, or audit database activity with the LogMiner utility.

NOTE

More information on how to use archived redo log files in a recovery scenario can be found in Chapter 11.

When an online redo log file is full and LGWR starts writing to the next redo log file, ARCn copies the completed redo log file to the archive destination. It is possible to specify more than one archive destination. The LGWR process waits for the ARCn process to complete the copy operation before overwriting any online redo log file. As with LGWR, the failure of one of the ARCn processes will cause instance failure, but no committed transactions will be lost because the "Commit complete" message is not returned to the user or calling program until LGWR successfully records the transaction in the online redo log file group.

Real World Scenario: Archive Logging Space Issues

After you configure the database for ARCHIVELOG mode, your job is only half complete. You need to continually make sure that there is enough room for the archived log files; otherwise, the database will hang. At least once in every DBA's career, he or she will get a phone call from some users saying that the database has "crashed," while other users are still using the database. It's not until you check the alert log that you discover that the archiving process cannot find disk space for a newly filled log file in any of the archiving destinations.

Keep enough archived redo log files online to recover and roll forward from the most recent full backup of each datafile that is also kept online; older archived logs and any previous datafile backups can be moved to another disk or to tape.

Remember your zero-transaction-loss strategy (which should be every DBA's strategy): make sure that you do not misplace or delete an archived log file before it is backed up to tape; otherwise, you will not be able to perform a complete recovery after a media failure.

If you use RMAN and the Flash Recovery area for all of your backup files, then you can further automate this process by directing RMAN to maintain enough backups to satisfy a recovery window policy (number of days) or a redundancy policy (multiple copies of each backup). Once an archived log or other backup file is no longer needed for the policy, the files are automatically deleted from the Flash Recovery area.


When the archiver process is copying the filled redo log files to another destination, the database is said to be in ARCHIVELOG mode. If archiving is not enabled, the database is said to be in NOARCHIVELOG mode. In production systems, you cannot afford to lose data, so you should run the database in ARCHIVELOG mode; in the event of a failure, you can then recover the database to the time of failure or to an earlier point in time. You achieve this by restoring the database backup and applying the database changes recorded in the archived log files.
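You can quickly confirm the current mode from SQL*Plus with the ARCHIVE LOG LIST command. The output reports the database log mode, the automatic archival setting, the archive destination, and the oldest, next-to-archive, and current log sequence numbers; the values depend entirely on your configuration.

SQL> ARCHIVE LOG LIST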

10.1.4.2. Setting the Archive Destination

You specify the archive destination in the initialization parameter file. To change the archive destination parameters during normal database operation, you use the ALTER SYSTEM command. Here are some of the parameters associated with archive log destinations and the archiver process:

LOG_ARCHIVE_DEST_n Using this parameter, you can specify at most 10 archiving destinations. These locations can be on the local machine or on a remote machine where the standby database is located. The syntax for specifying this parameter in the initialization file is as follows:

LOG_ARCHIVE_DEST_n = "null_string" |
  ((SERVICE = tnsnames_name | LOCATION = 'directory_name')
   [MANDATORY | OPTIONAL]
   [REOPEN [= integer]])

For example:

LOG_ARCHIVE_DEST_1 = ((LOCATION='/archive/MYDB01') MANDATORY
  REOPEN = 60)

specifies a location for the archived log files on the local machine at /archive/MYDB01. The MANDATORY clause specifies that writing to this location must succeed. The REOPEN clause specifies how long ARCn waits (in seconds) before trying to write to this location again after a failed attempt. The default value is 300 seconds.

Here is another example, which applies the archive logs to a standby database on a remote computer.

LOG_ARCHIVE_DEST_2 = (SERVICE=STDBY01) OPTIONAL REOPEN;

Here STDBY01 is the Oracle Net connect string used to connect to the remote database. Because this destination is OPTIONAL, database activity continues even if ARCn cannot write the archived log file there. Because the REOPEN clause is specified, ARCn retries the write operation later.
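Because LOG_ARCHIVE_DEST_n is dynamic, you can also set it with the ALTER SYSTEM command while the database is open. The following sketch uses the quoted-string form of the attribute list; the path is illustrative, and SCOPE=BOTH assumes an SPFILE.

SQL> ALTER SYSTEM SET log_archive_dest_1 =
  2    'LOCATION=/archive/MYDB01 MANDATORY REOPEN=60' SCOPE=BOTH;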

You can also use the EM Database Control web pages to configure the log filenaming and destinations by clicking Configure Recovery Settings in the Maintenance tab. The first destination is on the file system at /u09/oradata/arch01, as you can see in Figure 10.4.

Destination number 10 is the Flash Recovery area using the string USE_DB_RECOVERY_FILE_DEST.

Figure 10.4. The log archive destinations

NOTE

We will tell you about the Flash Recovery area in the section "The Flash Recovery Area."

LOG_ARCHIVE_MIN_SUCCEED_DEST This parameter specifies the minimum number of destinations to which the ARCn process must successfully write before the online redo log files can be overwritten. The default value of this parameter is 1. This parameter cannot exceed the total number of enabled destinations. If this parameter value is less than the number of MANDATORY destinations, the parameter is ignored.

LOG_ARCHIVE_FORMAT This parameter specifies the format in which to write the filename of the archived redo log files. To ensure that the log files are not overwritten, you use predefined substitution variables to construct the name of each archived redo log file. You can provide a text string and any of the predefined substitution variables. The variables are as follows:

%s   Log sequence number
%t   Thread number
%r   Resetlogs ID; ensures uniqueness even after using advanced recovery techniques that reset the log sequence numbers
%d   Database ID

The format you provide must include at least %s, %t, and %r. If you use the same archived redo log location for multiple databases, you must also use %d. In Figure 10.4 (shown earlier), the log archive filename format is defined as %t_%s_%r.dbf.
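LOG_ARCHIVE_FORMAT is a static parameter, so when you are using an SPFILE a change takes effect only after the next instance restart. A sketch that matches the format shown in Figure 10.4:

SQL> ALTER SYSTEM SET log_archive_format = '%t_%s_%r.dbf' SCOPE=SPFILE;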

10.1.4.3. Setting ARCHIVELOG

Specifying these parameters does not by itself start the writing of archived log files. To enable archiving of the redo log files, you must place the database in ARCHIVELOG mode. You can specify the ARCHIVELOG clause while creating the database. However, you might prefer to create the database first and then enable ARCHIVELOG mode. To enable ARCHIVELOG mode, follow these steps (a SQL*Plus sketch of the full sequence appears after the list):

  1. Shut down the database. Set up the appropriate initialization parameters.

  2. Start up and mount the database; ARCHIVELOG mode can be changed only when the database is in the MOUNT state.

    NOTE

    Details on instance startup can be found in Chapter 11.

  3. Enable ARCHIVELOG mode by using the command ALTER DATABASE ARCHIVELOG.

  4. Open the database by using ALTER DATABASE OPEN.
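Putting the steps together, the SQL*Plus sequence looks like the following sketch. SHUTDOWN IMMEDIATE is shown here; any clean shutdown mode (not ABORT) will do.

SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;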

To disable ARCHIVELOG mode, follow these steps:

  1. Shut down the database.

  2. Start up and mount the database.

  3. Disable ARCHIVELOG mode by using the command ALTER DATABASE NOARCHIVELOG.

  4. Open the database by using ALTER DATABASE OPEN.

The dynamic performance view V$DATABASE tells you whether you are in ARCHIVELOG mode, as you can see in this query:

SQL> select dbid, name, created, log_mode
  2      from v$database;

      DBID NAME      CREATED   LOG_MODE
---------- --------- --------- ------------
1387044942 ORD       03-MAR-04 ARCHIVELOG

1 row selected.

10.1.5. The Flash Recovery Area

As the price of disk space drops, its cost relative to tape is offset by the advantages of using disk as the primary backup medium: even a slow disk can be accessed randomly an order of magnitude faster than a tape drive. This rapid access means that a database recovery operation may take only minutes instead of hours.

Using disk space as the primary medium for all database recovery operations is the key component of Oracle 10g's Flash Recovery area. The Flash Recovery area is a single, unified storage area for all recovery-related files and recovery activities in an Oracle database.

The Flash Recovery area can be a single directory, an entire file system, or an Automatic Storage Management (ASM) disk group. To further optimize the use of disk space for recovery operations, a Flash Recovery area can be shared by more than one database.

In the following sections, we will cover all major aspects of a Flash Recovery area: what can and should be kept in the Flash Recovery area and how to set up a Flash Recovery area using initialization parameters and SQL commands. Also, as with other aspects of Oracle 10g, we will show how you can manage most parts of the Flash Recovery area using the Enterprise Manager (EM) Database Control, and we'll introduce some of the more advanced management techniques.

10.1.5.1. Flash Recovery Area Occupants

All the files needed to recover a database from a media failure or a logical error are contained in the Flash Recovery area. The Flash Recovery area can contain the following:

Control files A copy of the control file is created in the Flash Recovery area when the database is created. This copy of the control file can be used as one of the mirrored copies of the control file to ensure that at least one copy of the control file is available after a media failure.

Archived log files When the Flash Recovery area is configured, the initialization parameter LOG_ARCHIVE_DEST_10 is automatically set to the Flash Recovery area location. The corresponding ARCn processes create archived log files in the Flash Recovery area and in any other defined LOG_ARCHIVE_DEST_n locations.

Flashback logs If flashback database is enabled, its flashback logs are stored in the Flash Recovery area.

NOTE

You can find more information about configuring and using flashback logs with flashback database in OCP: Oracle 10g Administration II Study Guide (Sybex, 2005).

Control file and SPFILE autobackups The Flash Recovery area holds control file and SPFILE autobackups generated by RMAN if RMAN is configured for control file autobackup. When RMAN backs up datafile #1, which is part of the SYSTEM tablespace, the control file is automatically included in the RMAN backup.

Datafile copies For RMAN BACKUP AS COPY image files, the default destination for the datafile copies is the Flash Recovery area.

NOTE

You can find more information on RMAN image copy functions later in this chapter in the section "Using RMAN to Create Backups."

RMAN backup sets By default, RMAN uses the Flash Recovery area for both backup sets and image copies. In addition, RMAN puts restored archived log files from tape into the Flash Recovery area in preparation for a recovery operation.

10.1.5.2. The Flash Recovery Area and SQL Commands

You must define two initialization parameters to set up the Flash Recovery area: DB_RECOVERY_FILE_DEST_SIZE and DB_RECOVERY_FILE_DEST. Because both of these are dynamic parameters, the instance need not be shut down and restarted for the Flash Recovery area to be usable.

DB_RECOVERY_FILE_DEST_SIZE, which must be defined before DB_RECOVERY_FILE_DEST, defines the size of the Flash Recovery area. To maximize the benefits of the Flash Recovery area, it should be large enough to hold a copy of all datafiles, all incremental backups, online redo logs, archived redo logs not yet backed up to tape, control files, and control file autobackups. At a bare minimum, you need enough space to hold the archived log files not yet copied to tape.

Here is an example of configuring DB_RECOVERY_FILE_DEST_SIZE:

SQL> alter system
  2   set db_recovery_file_dest_size = 8g scope=both;

The size of the Flash Recovery area will be 8GB, and because we used the SCOPE=BOTH parameter in the ALTER SYSTEM command, the initialization parameter takes effect immediately and stays in effect even after a database restart.

NOTE

All instances in a RAC database must have the same values for DB_RECOVERY_FILE_DEST_SIZE and DB_RECOVERY_FILE_DEST.

The parameter DB_RECOVERY_FILE_DEST specifies the physical location where all Flash Recovery files are stored. The ASM disk group or file system must have at least as much space as the amount specified with DB_RECOVERY_FILE_DEST_SIZE, and it can have significantly more. DB_RECOVERY_FILE_DEST_SIZE, however, can be increased on the fly if more space is needed and the file system where the Flash Recovery area resides has the space available.

In the following example, we use the directory /OraFlash for the Flash Recovery area, like so:

SQL> alter system
  2     set db_recovery_file_dest = '/OraFlash' scope=both;

Clearing the value of DB_RECOVERY_FILE_DEST disables the Flash Recovery area; the parameter DB_RECOVERY_FILE_DEST_SIZE cannot be cleared until the DB_RECOVERY_FILE_DEST parameter has been cleared.
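Once the Flash Recovery area is configured, you can monitor its location, quota, and space usage from SQL by querying V$RECOVERY_FILE_DEST, as in the following sketch; the space figures returned will, of course, vary.

SQL> SELECT name, space_limit, space_used,
  2         space_reclaimable, number_of_files
  3    FROM v$recovery_file_dest;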

10.1.5.3. The Flash Recovery Area and the EM Database Control

You can create and maintain the Flash Recovery area using the EM Database Control. Click the Maintenance tab, and then click the Configure Recovery Settings link to display the Configure Recovery Settings screen, as shown in Figure 10.5.

Figure 10.5. The Configure Recovery Settings screen

In the Flash Recovery Area section, the Flash Recovery area has been configured for this database in the file system /OraFlash, with a maximum size of 15,000MB (15GB). Just more than 3GB of space is currently used in the Flash Recovery area. Flashback logging has not yet been enabled for this database.

10.1.5.4. Flash Recovery Area Management

Because the space in the Flash Recovery area is limited by the initialization parameter DB_RECOVERY_FILE_DEST_SIZE, the Oracle database keeps track of which files are no longer needed on disk so that they can be deleted when there is not enough free space for new files. Each time a file is deleted from the Flash Recovery area, a message is written to the alert log.

A message is also written to the alert log in other circumstances. If no files can be deleted and the recovery area's used space is at 85 percent, a warning message is issued. When the space used reaches 97 percent, a critical warning is issued. These warnings are recorded in the alert log file, can be viewed in the data dictionary view DBA_OUTSTANDING_ALERTS, and are available on the main screen of the EM Database Control.
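To see these space warnings from SQL rather than from the EM Database Control, you can query DBA_OUTSTANDING_ALERTS, as in this sketch; the view lists all outstanding server-generated alerts, not just those for the Flash Recovery area.

SQL> SELECT object_type, reason, suggested_action
  2    FROM dba_outstanding_alerts;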

When you receive these alerts, you have a number of options. If your retention policy can be adjusted to keep fewer copies of datafiles or reduce the number of days in the recovery window, this can help alleviate the space problems in the Flash Recovery area. Assuming that your retention policy is sound, you should instead add more disk space or back up some of the files in the Flash Recovery area to another destination such as another disk or a tape device.
