To understand Database reovery, a little understanding of different database failures are to be understood. Not all failures involve loss of database data. Based on the failures, the recovery operations carried out.
|Statement||A single database operation fails, such as a DML (Data Manipulation Language) statement - INSERT, UPDATE, and so on.|
|User process||A single database connection fails.|
|Network||A network component between the client and the database server fails, and the session is disconnected from the database.|
|User error||An error message is not generated, but the operation’s result, such as dropping a table, is not what the user intended.|
|Instance||The database instance fails unexpectedly.|
|Media||One or more of the database files is lost, deleted, or corrupted.|
When a database fails to run or a media failure occurs, or any of database or schema objects are lost or corrupted, a recovery process is needed. For this, an understanding of various types of database failures is essential.
There are six general categories for database-related failures. Understanding what category a failure belongs in will help you to more quickly understand the nature of the recovery effort you need to use to reverse the effects of the failure and maintain a high level of availability and performance in your database. The six general categories of failures are as follows:
Statement failures occur when a single database operation fails, such as a single INSERT statement or the creation of a table. In the list that follows are a few of the most common problems and their solutions when a statement fails.
Although granting user privileges or additional quotas within a tablespace solves many of these problems, also consider whether there are any gaps in the user education process that might lead to some of these problems in the first place.
A user closes their SQL*Plus window without logging out.
The workstation reboots suddenly before the application can be closed.
The application program causes an exception and closes before the application can be terminated normally.
A user process times out and Oracle disconnects the session.
A small percentage of user process failures is generally no cause for concern unless it becomes chronic; it may be a sign that user education is lacking—for example, training users to terminate the application gracefully before shutting down their workstation.
Depending on the locations of your workstation and your server, getting from your workstation to the server over the network might involve a number of hops: you might traverse several local switches and WAN routers to get to the database. From a network perspective, this configuration provides a number of points where failure can occur. These types of failures are called network failures.
In addition to hardware failures between the server and client, a listener process on the Oracle server can fail or the network card on the server itself can fail. To guard against these kinds of failures, you can provide redundant network paths from your clients to the server, as well as additional listener connections on the Oracle server and redundant network cards on the server.
Even if all your redundant hardware is at peak performance, and your users have been trained to disconnect from their Oracle sessions properly, users can still inadvertently delete or modify data in tables or drop an index. This is known as a user error failure. Although these operations succeed from a statement point of view, they might not be logically correct: the DROP TABLE command worked fine, but you really didn’t want to drop that table!
If data was inadvertently deleted from a table, and not yet committed, a ROLLBACK statement will undo the damage. If a COMMIT has already been performed, you have a number of options at your disposal, such as using data in the undo tablespace for a Flashback Query or using data in the archived and online redo logs with the LogMiner utility, available as a command-line or GUI interface.
You can recover a dropped table using Oracle’s recycle bin functionality: a dropped table is stored in a special structure in the tablespace and is available for retrieval as long as the space occupied by the table in the tablespace is not needed for new objects. Even if the table is no longer in the tablespace’s recycle bin, depending on the criticality of the dropped table, you can use either tablespace point in time recovery (TSPITR) or Flashback Database Recovery to recover the table, taking into consideration the potential data loss for other objects stored in the same tablespace for TSPITR or in the database if you use Flashback Database Recovery.
If the inadvertent changes are limited to a small number of tables that have few or no interdependencies with other database objects, Flashback Table functionality is most likely the right tool to bring back the table to a point of time in the past.
A few causes for instance failure:
A power outage
A server hardware failure
Failure of an Oracle background process
Emergency shutdown procedures (intentional power outage or SHUTDOWN ABORT)
In all these scenarios, the solution is easy: run the STARTUP command, and let Oracle automatically perform instance recovery using the online redo logs and undo data in the undo tablespace. If the cause of the instance failure is related to an Oracle background process failure, you can use the alert log and process-specific trace files to debug the problem. The EM Database Control makes it easy to review the contents of the alert log and any other alerts generated right before the point of failure.
Another type of failure that is somewhat out of your control is media failure. A media failure is any type of failure that results in the loss of one or more database files: datafiles, control files, or redo log files. Although the loss of other database-related files such as an init.ora file or a server parameter file (SPFILE) is of great concern, Oracle Corporation does not consider it a media failure. The database file can be lost or corrupted for a number of reasons:
Failure of a disk drive
Failure of a disk controller
Inadvertent deletion or corruption of a database file
Following the best practices by adequately mirroring control files, redo log files, and ensuring that full backups and their subsequent archived redo log files are available will keep you prepared for any type of media failure.
|Attempts to access tables without the appropriate privileges||Provide the appropriate privileges or create views on the tables and grant privileges on the view.|
|Running out of space||Add space to the tablespace, increase the user’s quota on the tablespace, or enable resumable space allocation.|
|Logic errors in applications||Work with developers to correct program errors or provide additional logic in the application to recover gracefully from unavoidable errors.|