When the database is started (at the startup) Oracle automatically carries out automatic recovery operation internally.
It is important to understand how an Oracle instance starts up and what kinds of failures can occur at each startup phase. Understanding the startup phases is important because some types of recovery operations must occur in a particular phase. Once a database is started, the instance will fail under a number of conditions that we will describe in detail.
Starting up a database involves several phases, from being shut down to being open and available to users. If certain prerequisites are not present, database startup halts, and you must take some kind of remedial action to permit startup to proceed. In the following list are the four basic database states along with their prerequisites after you type the STARTUP command at the SQL*Plus prompt.
No background processes are active. A STARTUP command is used when the database is in this state; the STARTUP command fails if you are in any other state unless you are using STARTUP FORCE to restart an instance.
Also known as the STARTED state, the instance must be able to access the initialization parameter file, either as a text-based init.ora file or an SPFILE.
In this state, the instance checks that all control files listed in the initialization parameter file are present and identical. Even if one of the multiplexed control files is unavailable or corrupted, the instance does not enter the MOUNT state and stays in the NOMOUNT state.
Most of the time spent in instance startup occurs during this phase. All redo log groups must have at least one member available, and all datafiles that are marked as online must be available.
You are notified in a number of ways that a redo log group member is missing or a datafile is missing. If a datafile is missing or corrupted, you will get a message while you are running the STARTUP command, as in this example:
ORACLE instance started.
Total System Global Area 197132288 bytes
Fixed Size 778076 bytes
Variable Size 162537636 bytes
Database Buffers 33554432 bytes
Redo Buffers 262144 bytes
ORA-01157: cannot identify/lock data file 4 see DBWR trace file
ORA-01110: data file 4: '/u05/oradata/ord/users01.dbf'
The message in SQL*Plus shows only the first datafile that needs attention. You will
have to use the dynamic performance view V$RECOVER_FILE to display a list of all files
that need attention. Here is a query against the view V$RECOVER_FILE and a second
query joining V$RECOVER_ FILE and V$DATAFILE given the previous STARTUP command:
SQL> select file#, error from v$recover_file;
4 FILE NOT FOUND
11 FILE NOT FOUND
SQL> select file#, name from
2 v$datafile join v$recover_file using (file#);
If a datafile is offline or taken offline, the instance can still start as long as the datafile
does not belong to the SYSTEM or UNDO tablespace. Once the instance is started, you
can proceed to recover the missing or corrupted datafile and subsequently bring it
online. If all files are available, but out of synch, automatic instance recovery is
performed as long as the online redo log files can bring all datafiles to the same SCN.
Otherwise, media recovery is required using archived redo log files.
If a redo log group member is missing, a message is generated in the alert log, but the database will still open.
Media failures are not always critical, depending on which type of datafile is lost. If any of the multiplexed copies of the control file are lost, an entire redo log group, or any datafile from the SYSTEM or UNDO tablespace, the instance will fail.
In some cases, the instance becomes unavailable to users but will not shut down; in this case, you can use SHUTDOWN ABORT to force the instance to shut down without resynchronizing the datafiles with the control file. The next time the instance is started, instance recovery will be performed. If you plan on starting up the instance right after using SHUTDOWN ABORT, you can instead use STARTUP FORCE as shorthand for a SHUTDOWN ABORT and a STARTUP.
Later in this chapter, we will show you how to recover from the loss of a control file, a redo log file member, or one or more datafiles.
As we discussed earlier in this chapter in the section "Instance Failures," an instance failure is any kind of failure that prevents the synchronization of the database’s datafiles and control file before the instance is shut down.
Oracle automatically recovers from instance failure during instance recovery. Instance recovery is initiated by simply starting up the database with the STARTUP command.
Instance recovery is also known as crash recovery.
During a STARTUP operation, Oracle first attempts to read the initialization file, and then it mounts the control file and attempts to open the datafiles referenced in the control files. If the data files are not synchronized, instance recovery is initiated.
We discussed instance startup phases in the section "Understanding Instance Startup" earlier in this chapter.
Instance recovery occurs in two distinct phases: the first phase uses the online redo log files to restore the datafiles to the state before instance failure in a roll forward operation; after this step is completed, Oracle uses the undo tablespace to roll back any uncommitted transactions. The roll forward operation includes data in the undo tablespace; without a consistent undo tablespace, the roll back operation cannot succeed. Once the roll forward operation completes, the database is open to users while the roll back operation completes. After the roll back phase, the datafiles contain only committed data.
Before a user receives a "Commit complete" message, the new or changed data must first be successfully written to a redo log file. At some point in the future, the same information must be used to update the datafiles; this operation usually lags behind the redo log file write because sequential writes to the redo log file are by nature faster than random writes to one or more datafiles on disk.
As we discussed in Chapter 10, checkpoints keep track of what still needs to be written from the redo log files to the datafiles. Any transactions not yet written to the datafiles are at an SCN after the last checkpoint.
The amount of time required for instance recovery depends on how long it takes to bring the datafiles up-to-date from the last checkpoint position to the latest SCN in the control file. To prevent performance problems, the distance between the checkpoint position and the end of the redo log group cannot be more than 90 percent of the size of the redo log group.
You can tune instance recovery by setting an MTTR target, in seconds, using the initialization parameter FAST_START_MTTR_TARGET. The default value for this parameter is zero; the maximum is 3,600 seconds (1 hour).
With a setting of zero, which disables the target, the likelihood that writes to the redo logs wait for writes to the datafiles is reduced. However, if FAST_START_MTTR_TARGET is set to a low nonzero value, writes to the redo logs most likely wait for writes to the datafiles. Although this reduces the amount of time it takes to recover the instance in the case of an instance failure, it affects performance and response time. Setting this value too high can result in an unacceptable amount of time needed to recover the instance after an instance failure.
LOG_CHECKPOINT_TIMEOUT This is the maximum number of seconds that any new or modified block in the buffer cache waits until it is written to disk.
FAST_START_IO_TARGET This is similar to FAST_START_MTTR_TARGET, except that the recovery operation is specified as the number of I/Os instead of the number of seconds to finish instance recovery.
Setting either of these parameters overrides FAST_START_MTTR_TARGET. As part of the enhanced manageability features introduced with Oracle9i, setting FAST_START_MTTR_TARGET is the easiest and most straightforward way to define your database’s recovery time given the time-based constraints included in most typical service-level agreements (SLAs).
FAST_START_MTTR_TARGET From the Advisor Central screen, accessible at the bottom of the Database Control home page, click MTTR Advisor. In the example you adjust the desired MTTR value to 60 seconds on the Configure Recovery Settings screen.
When you click the Apply button, the new value for FAST_START_MTTR_TARGET goes into effect immediately and stays in effect when the instance is restarted.
Using the SQL*Plus command line, you can accomplish this task by using the ALTER SYSTEM command, as in this example: SQL> alter system set fast_start_mttr_target=60 scope=both; System altered.
Using SCOPE=BOTH, the new value of the parameter takes effect immediately and stays in effect the next time the instance is restarted.
Earlier in this chapter, in the section "User Error Failures," we presented a number of scenarios in which a user’s data was inadvertently changed or deleted or a table was dropped. In the following sections, we’ll show you how to use Flashback Query to retrieve selected rows from a previous state of a table, how to recover a table using Flashback Drop and a tablespace’s recycle bin, how to bring back an entire table and its dependent objects (such as indexes) back to a point of time in the past using Flashback Table, and query previous transactions in the online and archived redo logs using the LogMiner utility.