The Data Recovery Advisor (DRA), often referred to as the Database Recovery Advisor, is a feature introduced in Oracle 11g that helps us diagnose persistent failures in the database.
The main advantage of this feature is that it helps us fix the issue, taking away some of the stress usually associated with this type of operation.
Database Recovery Advisor is useful only if we use RMAN for our backups.
The main benefits of DRA are listed below.
1. It simplifies the diagnosis, analysis and recovery steps we perform.
2. It allows us to easily detect and check the failures that have occurred.
3. It reduces the recovery time.
4. It reduces human errors.
5. It provides us with suggestions, steps and advice to repair the failures that occur.
6. We can generate a repair script and, when required, repair the failure using the generated script.
DRA can be used via OEM or through the RMAN command line interface. Below are some of the commands used in the RMAN command line interface.
1. LIST FAILURE : Lists all failures that the database has encountered.
2. ADVISE FAILURE : Analyzes all the backups and provides a recovery script to repair the failures.
3. CHANGE FAILURE : Changes the priority or status of the failures.
4. REPAIR FAILURE : Repairs the failures using the script previously generated by the ADVISE FAILURE command.
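CHANGE FAILURE is the only one of these commands not exercised in the scenario below, so here is a minimal sketch of how such a command could be composed. The helper name change_failure_cmd is hypothetical, and the sketch assumes the command takes either CLOSED or PRIORITY HIGH/LOW for a given failure number. It only builds the command text; it does not connect to RMAN.

```python
def change_failure_cmd(failure_id, priority=None, closed=False):
    """Build the text of an RMAN CHANGE FAILURE command (hypothetical helper)."""
    # Exactly one action is allowed: close the failure or change its priority.
    if closed == (priority is not None):
        raise ValueError("specify exactly one of priority or closed")
    if closed:
        return f"CHANGE FAILURE {failure_id} CLOSED;"
    priority = priority.upper()
    # Assumption: only HIGH and LOW can be assigned by hand.
    if priority not in ("HIGH", "LOW"):
        raise ValueError("priority must be HIGH or LOW")
    return f"CHANGE FAILURE {failure_id} PRIORITY {priority};"


# Failure 594 is the "tablespace is offline" entry from the scenario in this post.
print(change_failure_cmd(594, priority="low"))
print(change_failure_cmd(594, closed=True))
```

The generated text can be pasted at the RMAN prompt, which then asks for confirmation before applying the change.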
As I mentioned earlier, Database Recovery Advisor is useful only if we use RMAN for our backups. I have taken full backups before working on the scenarios.
SCENARIO - Recovery of DATAFILE:
RMAN> report need backup;
RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of files with less than 1 redundant backups
File #bkps Name
---- ----- -----------------------------------------------------
As I have already taken the backups, RMAN reports that no further backup is required.
I have taken the USERS tablespace offline and removed its datafile as below.
RMAN> sql 'alter tablespace users offline';
sql statement: alter tablespace users offline
starting full resync of recovery catalog
full resync complete
[oracle@localhost ~]$ asmcmd
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 1048576 10236 7784 0 7784 0 N DATA_GRP/
MOUNTED EXTERN N 512 4096 1048576 10236 9855 0 9855 0 N FRA_GRP/
ASMCMD> cd DATA_GRP/PRODDB/DATAFILE
ASMCMD> ls
EXAMPLE.265.848660933
SYSAUX.257.848660835
SYSTEM.256.848660835
TBSRMAN.267.848663633
UNDOTBS1.258.848660835
USERS.259.848660835
ASMCMD> rm USERS.259.848660835
ASMCMD> ls
EXAMPLE.265.848660933
SYSAUX.257.848660835
SYSTEM.256.848660835
TBSRMAN.267.848663633
UNDOTBS1.258.848660835
From the above list we can clearly see that the USERS datafile has been removed.
Now we are going to recover this datafile using the DATABASE RECOVERY ADVISOR.
RMAN> list failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
594 HIGH OPEN 03-JUL-14 Tablespace 4: 'USERS' is offline
588 HIGH OPEN 03-JUL-14 One or more non-system datafiles are missing
582 HIGH OPEN 03-JUL-14 One or more non-system datafiles are offline
We can see that the LIST FAILURE command lists all the failures detected by the DRA. We can see the details of a failure by issuing the command LIST FAILURE <failure_id> DETAIL, as below.
RMAN> list failure 588 detail;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
588 HIGH OPEN 03-JUL-14 One or more non-system datafiles are missing
Impact: See impact for individual child failures
List of child failures for parent failure ID 588
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
591 HIGH OPEN 03-JUL-14 Datafile 4: '+DATA_GRP/proddb/datafile/users.259.848660835' is missing
Impact: Some objects in tablespace USERS might be unavailable
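If we want to feed the LIST FAILURE output into a monitoring script, the rows can be picked out with a regular expression. This is only a sketch: the function name parse_failures and the assumption that every failure row follows the "ID, priority, status, date, summary" layout seen above are mine, not part of RMAN.

```python
import re

# One failure row: ID, priority, status, detection date, then the free-text summary.
ROW = re.compile(r"^\s*(\d+)\s+(CRITICAL|HIGH|LOW)\s+(OPEN|CLOSED)\s+(\S+)\s+(.+)$")


def parse_failures(text):
    """Return one dict per failure row found in LIST FAILURE output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            fid, priority, status, detected, summary = m.groups()
            rows.append({"id": int(fid), "priority": priority, "status": status,
                         "detected": detected, "summary": summary.strip()})
    return rows


# Sample text copied from the LIST FAILURE output in this post.
sample = """\
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
594 HIGH OPEN 03-JUL-14 Tablespace 4: 'USERS' is offline
588 HIGH OPEN 03-JUL-14 One or more non-system datafiles are missing
582 HIGH OPEN 03-JUL-14 One or more non-system datafiles are offline
"""

for failure in parse_failures(sample):
    print(failure["id"], failure["priority"], failure["summary"])
```

Header and separator lines are skipped because they do not start with a numeric failure ID.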
Now we will issue the ADVISE FAILURE command, so that the DRA suggests how to fix all the current failures.
RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
588 HIGH OPEN 03-JUL-14 One or more non-system datafiles are missing
Impact: See impact for individual child failures
List of child failures for parent failure ID 588
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
591 HIGH OPEN 03-JUL-14 Datafile 4: '+DATA_GRP/proddb/datafile/users.259.848660835' is missing
Impact: Some objects in tablespace USERS might be unavailable
analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=38 device type=DISK
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
1. If file +DATA_GRP/proddb/datafile/users.259.848660835 was unintentionally renamed or moved, restore it
Automated Repair Options
========================
Option Repair Description
------ ------------------
1 Restore and recover datafile 4
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm
From the above details, we can clearly see that the ADVISE FAILURE command automatically generated a repair script and stored it at /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm
If needed, we can use the command REPAIR FAILURE PREVIEW to see the details of the script generated by the ADVISE FAILURE command before we ask the DRA to repair the failures.
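When the ADVISE FAILURE output is captured to a log, the script location can be pulled out of it for later inspection. The sketch below assumes the "Repair script:" label appears exactly as in the output above; the function name repair_script_path is hypothetical.

```python
import re


def repair_script_path(advise_output):
    """Return the path that follows 'Repair script:' in ADVISE FAILURE output, or None."""
    m = re.search(r"^\s*Repair script:\s*(\S+)\s*$", advise_output, re.MULTILINE)
    return m.group(1) if m else None


# Sample lines copied from the ADVISE FAILURE output in this post.
sample = """\
1 Restore and recover datafile 4
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm
"""

print(repair_script_path(sample))
```

The file at that path is plain text, so it can be opened and reviewed before any repair is run.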
RMAN> repair failure preview;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm
contents of repair script:
# restore and recover datafile
restore datafile 4;
recover datafile 4;
Now issue the REPAIR FAILURE command to repair the failures.
RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm
contents of repair script:
# restore and recover datafile
restore datafile 4;
recover datafile 4;
Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script
Starting restore at 03-JUL-14
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to +DATA_GRP/proddb/datafile/users.259.848660835
channel ORA_DISK_1: reading from backup piece /rman/bkup_0kpbok52_1_1
channel ORA_DISK_1: piece handle=/rman/bkup_0kpbok52_1_1 tag=TAG20140625T210329
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 03-JUL-14
Starting recover at 03-JUL-14
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 14 is already on disk as file /arch/proddb/1_14_848660904.dbf
archived log for thread 1 with sequence 15 is already on disk as file /arch/proddb/1_15_848660904.dbf
archived log for thread 1 with sequence 16 is already on disk as file /arch/proddb/1_16_848660904.dbf
archived log file name=/arch/proddb/1_14_848660904.dbf thread=1 sequence=14
media recovery complete, elapsed time: 00:00:01
Finished recover at 03-JUL-14
repair failure complete
starting full resync of recovery catalog
full resync complete
From the above, we can clearly see that the USERS datafile is recovered.
ASMCMD> ls -l
Type Redund Striped Time Sys Name
DATAFILE UNPROT COARSE JUL 03 12:00:00 Y EXAMPLE.265.848660933
DATAFILE UNPROT COARSE JUL 03 12:00:00 Y SYSAUX.257.848660835
DATAFILE UNPROT COARSE JUL 03 12:00:00 Y SYSTEM.256.848660835
DATAFILE UNPROT COARSE JUL 03 12:00:00 Y TBSRMAN.267.848663633
DATAFILE UNPROT COARSE JUL 03 12:00:00 Y UNDOTBS1.258.848660835
DATAFILE UNPROT COARSE JUL 03 12:00:00 Y USERS.259.851949575
ASMCMD>
RMAN> sql 'alter tablespace USERS online';
sql statement: alter tablespace USERS online
starting full resync of recovery catalog
full resync complete
SQL> select file_name,tablespace_name,status from dba_data_files;
FILE_NAME                                               TABLESPACE_NAME      STATUS
------------------------------------------------------- -------------------- ---------
+DATA_GRP/proddb/datafile/users.259.851949575           USERS                AVAILABLE
+DATA_GRP/proddb/datafile/undotbs1.258.848660835        UNDOTBS1             AVAILABLE
+DATA_GRP/proddb/datafile/sysaux.257.848660835          SYSAUX               AVAILABLE
+DATA_GRP/proddb/datafile/system.256.848660835          SYSTEM               AVAILABLE
+DATA_GRP/proddb/datafile/example.265.848660933         EXAMPLE              AVAILABLE
+DATA_GRP/proddb/datafile/tbsrman.267.848663633         TBSRMAN              AVAILABLE
6 rows selected.
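The interactive sequence used in this scenario (LIST FAILURE, ADVISE FAILURE, REPAIR FAILURE PREVIEW, REPAIR FAILURE) could also be run from an RMAN command file. The sketch below only builds the file contents and the rman argument list. The helper names, the file names dra.rman and dra.log, and the use of OS authentication (target /) without a recovery catalog are assumptions for illustration. ADVISE FAILURE is kept in the same file because REPAIR FAILURE relies on advice generated in the same RMAN session.

```python
def dra_command_file(apply_repair=False):
    """Return RMAN command-file text for a DRA session (hypothetical helper)."""
    commands = ["LIST FAILURE;", "ADVISE FAILURE;", "REPAIR FAILURE PREVIEW;"]
    if apply_repair:
        # NOPROMPT skips the YES/NO confirmation seen in the interactive run.
        commands.append("REPAIR FAILURE NOPROMPT;")
    return "\n".join(commands) + "\n"


def rman_argv(cmdfile, logfile):
    """Argument list for invoking the rman client with a command file."""
    # Assumption: OS authentication and no recovery catalog connection.
    return ["rman", "target", "/", f"cmdfile={cmdfile}", f"log={logfile}"]


print(dra_command_file(apply_repair=False))
print(" ".join(rman_argv("dra.rman", "dra.log")))
```

Leaving apply_repair off by default keeps the run read-only, so the preview can be reviewed before anything is restored.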
An important thing to consider with DRA: if we are not using the controlfile autobackup option when taking our backups and recovery of the controlfile is required, the DRA will not be able to perform a fully automatic recovery. In that situation, recovering from the failure requires a combination of automatic and manual steps.
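One way to guard against this is to check the RMAN configuration before relying on DRA. The sketch below scans captured SHOW ALL output for the controlfile autobackup setting. The function name is hypothetical, and the sample lines are my assumption of what the relevant part of the output looks like.

```python
def controlfile_autobackup_enabled(show_output):
    """Return True/False for the ON/OFF setting, or None if no such line is found."""
    for line in show_output.splitlines():
        # Drop trailing comments such as "# default", then the closing semicolon.
        words = line.split("#")[0].strip().rstrip(";").upper().split()
        if len(words) == 4 and words[:3] == ["CONFIGURE", "CONTROLFILE", "AUTOBACKUP"]:
            return words[3] == "ON"
    return None


# Assumed shape of the relevant SHOW ALL lines; real output has many more lines.
sample = """\
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
"""

if controlfile_autobackup_enabled(sample) is False:
    # The RMAN command that turns the option on.
    print("Run in RMAN: CONFIGURE CONTROLFILE AUTOBACKUP ON;")
```

The FORMAT line is ignored on purpose, since it describes where autobackups go rather than whether they are taken.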