
Thursday, July 3, 2014

Restoring With RMAN and Using the DATA RECOVERY ADVISOR (DRA)


The Data Recovery Advisor (DRA) is a feature introduced in Oracle 11g that helps us diagnose persistent data failures in the database.
Its main advantage is that it also helps us fix the problem, which takes away some of the stress usually associated with a recovery.

The DRA is useful only if we use RMAN for our backups.

The main benefits of the DRA are:

1. It simplifies the diagnosis, analysis and recovery steps we perform.
2. It makes it easy to detect and review the failures that have occurred.
3. It reduces recovery time.
4. It reduces human error.
5. It provides suggestions, steps and advice for repairing the failures.
6. It can generate a repair script, which we can run whenever required to repair the failure.

The Data Recovery Advisor can be used through OEM or through the RMAN command-line interface. These are the main RMAN commands:

1. LIST FAILURE : Lists the failures that the database has encountered.

2. ADVISE FAILURE : Analyzes the failures and the available backups, and generates a script to repair the failures.

3. CHANGE FAILURE : Changes the priority or status of failures.

4. REPAIR FAILURE : Repairs the failures using the script previously generated by the ADVISE FAILURE command.
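
As a rough illustration of the command forms above, the sketch below shows a few variations that are not used later in this post. The failure number 101 is hypothetical; real numbers come from the LIST FAILURE output.

```rman
# With no options, LIST FAILURE shows open failures of priority CRITICAL or HIGH
LIST FAILURE;

# Include open failures of every priority
LIST FAILURE ALL;

# Show only low-priority failures
LIST FAILURE LOW;

# Lower the priority of one failure (101 is a hypothetical failure number);
# RMAN asks for confirmation before applying the change
CHANGE FAILURE 101 PRIORITY LOW;
```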

As mentioned earlier, the DRA is useful only if we use RMAN for our backups, so I took full backups before working through the scenario.
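
The exact backup command used is not shown in this post. A minimal sketch of one way to take such a backup, assuming the database runs in ARCHIVELOG mode and the default RMAN device and format settings are acceptable:

```rman
# Full backup of all datafiles plus the archived redo logs
BACKUP DATABASE PLUS ARCHIVELOG;

# Confirm which backups RMAN now knows about
LIST BACKUP SUMMARY;
```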

SCENARIO - Recovery of DATAFILE:

RMAN> report need backup;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of files with less than 1 redundant backups
File #bkps Name
---- ----- -----------------------------------------------------



Because I took the backups beforehand, RMAN reports that no files need a backup.
I then took the USERS tablespace offline and removed its datafile, as shown below.

RMAN> sql 'alter tablespace users offline';

sql statement: alter tablespace users offline
starting full resync of recovery catalog
full resync complete

[oracle@localhost ~]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     10236     7784                0            7784              0             N  DATA_GRP/
MOUNTED  EXTERN  N         512   4096  1048576     10236     9855                0            9855              0             N  FRA_GRP/

ASMCMD> cd DATA_GRP/PRODDB/DATAFILE

ASMCMD> ls

EXAMPLE.265.848660933
SYSAUX.257.848660835
SYSTEM.256.848660835
TBSRMAN.267.848663633
UNDOTBS1.258.848660835
USERS.259.848660835

ASMCMD> rm USERS.259.848660835

ASMCMD> ls

EXAMPLE.265.848660933
SYSAUX.257.848660835
SYSTEM.256.848660835
TBSRMAN.267.848663633
UNDOTBS1.258.848660835

The second listing shows that the USERS datafile has been removed.

Now we will recover this datafile using the Data Recovery Advisor.

RMAN> list failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
594        HIGH     OPEN      03-JUL-14     Tablespace 4: 'USERS' is offline
588        HIGH     OPEN      03-JUL-14     One or more non-system datafiles are missing
582        HIGH     OPEN      03-JUL-14     One or more non-system datafiles are offline

The LIST FAILURE command lists the failures detected by the DRA. We can see the details of a failure by issuing LIST FAILURE <failure_id> DETAIL, as below:

RMAN> list failure 588 detail;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
588        HIGH     OPEN      03-JUL-14     One or more non-system datafiles are missing
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 588
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  591        HIGH     OPEN      03-JUL-14     Datafile 4: '+DATA_GRP/proddb/datafile/users.259.848660835' is missing
    Impact: Some objects in tablespace USERS might be unavailable
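
Failures normally reach this list when the database runs into an error. As a hedged aside, RMAN's VALIDATE command can also be run proactively, so that problems such as missing datafiles or corrupt blocks are recorded before LIST FAILURE is issued. A minimal sketch (datafile number 4 is taken from the output above):

```rman
# Check all datafiles for missing files and corrupt blocks;
# problems found are recorded as failures that LIST FAILURE can show
VALIDATE DATABASE;

# Or limit the check to a single datafile
VALIDATE DATAFILE 4;

LIST FAILURE;
```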



Now we issue the ADVISE FAILURE command, so that the DRA advises us on how to fix the current failures.

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
588        HIGH     OPEN      03-JUL-14     One or more non-system datafiles are missing
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 588
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  591        HIGH     OPEN      03-JUL-14     Datafile 4: '+DATA_GRP/proddb/datafile/users.259.848660835' is missing
    Impact: Some objects in tablespace USERS might be unavailable

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=38 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
1. If file +DATA_GRP/proddb/datafile/users.259.848660835 was unintentionally renamed or moved, restore it

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 4
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm


From the above output, we can see that the ADVISE FAILURE command automatically generated a repair script and stored it at /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm

If needed, we can use REPAIR FAILURE PREVIEW to view the contents of the script generated by ADVISE FAILURE before we ask the DRA to repair the failures.

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 4;
   recover datafile 4;
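
REPAIR FAILURE works on the advice produced by the most recent ADVISE FAILURE in the same RMAN session. The variations below are a sketch of alternative forms that are not used in this post; option number 1 matches the single automated repair option listed by ADVISE FAILURE above.

```rman
# Preview the script for a specific repair option from the ADVISE FAILURE output
REPAIR FAILURE USING ADVISE OPTION 1 PREVIEW;

# Run the repair without the YES/NO confirmation prompt
REPAIR FAILURE NOPROMPT;
```

NOPROMPT is mainly of interest for scripted runs; for interactive work the confirmation prompt is a useful last check.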


Now issue the REPAIR FAILURE command to repair the failures.

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/proddb/proddb/hm/reco_8097687.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 4;
   recover datafile 4;

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 03-JUL-14
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to +DATA_GRP/proddb/datafile/users.259.848660835
channel ORA_DISK_1: reading from backup piece /rman/bkup_0kpbok52_1_1
channel ORA_DISK_1: piece handle=/rman/bkup_0kpbok52_1_1 tag=TAG20140625T210329
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 03-JUL-14

Starting recover at 03-JUL-14
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 14 is already on disk as file /arch/proddb/1_14_848660904.dbf
archived log for thread 1 with sequence 15 is already on disk as file /arch/proddb/1_15_848660904.dbf
archived log for thread 1 with sequence 16 is already on disk as file /arch/proddb/1_16_848660904.dbf
archived log file name=/arch/proddb/1_14_848660904.dbf thread=1 sequence=14
media recovery complete, elapsed time: 00:00:01
Finished recover at 03-JUL-14
repair failure complete
starting full resync of recovery catalog
full resync complete


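The output above shows that the restore and recovery completed. Listing the ASM directory again confirms that the USERS datafile is back, now under a new system-generated name (USERS.259.851949575):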

ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
DATAFILE  UNPROT  COARSE   JUL 03 12:00:00  Y    EXAMPLE.265.848660933
DATAFILE  UNPROT  COARSE   JUL 03 12:00:00  Y    SYSAUX.257.848660835
DATAFILE  UNPROT  COARSE   JUL 03 12:00:00  Y    SYSTEM.256.848660835
DATAFILE  UNPROT  COARSE   JUL 03 12:00:00  Y    TBSRMAN.267.848663633
DATAFILE  UNPROT  COARSE   JUL 03 12:00:00  Y    UNDOTBS1.258.848660835
DATAFILE  UNPROT  COARSE   JUL 03 12:00:00  Y    USERS.259.851949575
ASMCMD>

RMAN> sql 'alter tablespace USERS online';

sql statement: alter tablespace USERS online
starting full resync of recovery catalog
full resync complete
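
To confirm that the DRA no longer reports the problem, the failure list can be checked again. A short sketch; the output is omitted because it was not captured in the original run:

```rman
# Open failures; LIST FAILURE re-checks existing failures and closes those already fixed
LIST FAILURE;

# Failures that have been closed, including the ones repaired above
LIST FAILURE CLOSED;
```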



SQL> select file_name,tablespace_name,status from dba_data_files;

FILE_NAME                                               TABLESPACE_NAME      STATUS
------------------------------------------------------- -------------------- ---------
+DATA_GRP/proddb/datafile/users.259.851949575           USERS                AVAILABLE
+DATA_GRP/proddb/datafile/undotbs1.258.848660835        UNDOTBS1             AVAILABLE
+DATA_GRP/proddb/datafile/sysaux.257.848660835          SYSAUX               AVAILABLE
+DATA_GRP/proddb/datafile/system.256.848660835          SYSTEM               AVAILABLE
+DATA_GRP/proddb/datafile/example.265.848660933         EXAMPLE              AVAILABLE
+DATA_GRP/proddb/datafile/tbsrman.267.848663633         TBSRMAN              AVAILABLE

6 rows selected.


One important caveat with the DRA: if controlfile autobackup is not enabled for our backups and the controlfile itself has to be recovered, the DRA cannot perform a fully automatic recovery. In that situation the repair becomes a combination of automatic and manual steps.
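
Controlfile autobackup is an RMAN configuration setting. A minimal sketch of checking and enabling it follows; the format path reuses the /rman directory seen in the restore output and is only an assumption about where the autobackups should go.

```rman
# Show the current setting
SHOW CONTROLFILE AUTOBACKUP;

# Enable automatic backup of the controlfile and server parameter file
CONFIGURE CONTROLFILE AUTOBACKUP ON;

# Optional: choose the autobackup location (%F is required in the format string)
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/rman/cf_%F';
```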
