Sunday, September 20, 2009

Database duplication using Recovery Manager

Introduction
This document provides you with a brief description on how to do refresh a Database (duplicate a database) from the Production Database backup taken using RMAN to tapes to any other environments.

I. Introduction
---------------
This document describes the process of refreshing the Test Databases from the RMAN Production database backups taken to the Tapes / Disks.

II. Initial Preparation Steps
-----------------------------
We first need to make sure that the database is not running within Fail Safe environment, and that the disk space used by the old database is released, so we can fit the new database. Here are a few steps that need to be taken cars of before starting the Database Refresh:

1. If there is requirement to preserve some data or accounts (Schema’s and other necessary things) from the old environment, export that data first before starting the refreshing from Production.

2. If the databases are running in Fail Safe environment, shut them down through Fail Safe Manager. Also shut down the Listener and the Intelligent Agent that is running for those particular databases.

3. Modify the TNSNAMES.ORA, the INIT.ORA (and listener.ora if required) files to make sure the database can be started independently, using the local listener. Try this out by starting the listener service and database service manually through the Services screen on the Win 2000 machine, and starting database through SQL*Plus.

4. Shut down the database using the FAILSAFE Manager and remove all database files except ones from the Admin directories (e.g. init.ora). This is required to clean space on disk to fit the new database. If we have enough disk space for the restore to happen, then we move the existing files to a different directory or mount point.

III. Preparing RMAN Duplication Script
--------------------------------------
Once we are done with the above steps, we can proceed with the next step of creating the scripts for restoring the Database. An example of this script is given below:
connect catalog rman/password@;
connect target sys/password@;
connect auxiliary sys/password@;
run {
allocate auxiliary channel ch1 type 'sbt_tape' parms
'ENV=(TDPO_OPTFILE=c:\cluster\tdpo.opt)';
set until scn or ;

set newname for datafile 1 to ‘new path for restore’;
. . . . . .
. . . . . .
...
duplicate target database to
logfile group 1 (‘’,
(‘’) SIZE 100M,
group 2 ((‘’,
(‘’) SIZE 100M,
group 3 ((‘’,
(‘’) SIZE 100M;
}

The description of the above script is as follows :

The first part deals with connecting to required databases:

1. catalog database where RMAN catalog is stored,
2. target database which is the database we want to clone, and
3. auxiliary database which is the one that we are attempting to create.
4. Note that when running this script later on, both catalog and target databases need to be open during the process, while auxiliary database is normally in NOMOUNT state.

Next in the script is allocating channel used to access file system through TSM. Note that to do that we will need to change TSM configuration (dsm.opt file, nodename parameter) in order for the node to appear as the production node.
Next in the script is set until SCN / DATE command that specifies until which point the database will be duplicated. If the UNTIL SCN / DATE is not mentioned, RMAN will attempt to recover until the last archived log, which can cause failure if that log is not available on the tape drive (e.g. it is still on the production server disk).
Next is the list of set newname for datafile commands, which are required when new disk structure is different from production disk structure (which is case on all our systems). All database files should be specified in this list (nothing is required for tempfiles). The list of datafile’s can be obtained by querying the DBA_DATA_FILES data dictionary view.
Finally, the duplicate command is there to do the actual database duplication.

IV. Running Database Duplication
--------------------------------
To run database duplication we can prepare a batch script, or run a command to start it up. It would look something like this:

rman cmdfile (the rman script) msglog (a file name for the logs)

Before starting the RMAN script, the following things need to be taken care of :
1. Verify that the RMAN catalog database is open. Make sure this database will be open during complete duplication process, e.g. if it normally goes down for backup turn off the backup procedures. If the connection to the database is lost during the duplication, the process will fail and will need recovery.

2. Verify that the target database is open. Make sure this database will be open during complete duplication process, same as for RMAN database.

3. Verify that the Oracle services for auxiliary database are running and the database is in nomount state.

If the RMAN script is successful, it will get all the files from the file system, place them in appropriate locations as specified in the script, and recover the database. It will also change the Database ID, and start the database. This is the best case scenario, however, if duplication script fails you might need to try and recover from failure.

V. Recovering from Failure
--------------------------
If the RMAN duplication process fails, We might need to recover the database using the RMAN backup. The Database supplication or the restore can fail because of some reasons like :

1. RMAN catalog database going down for backup
2. Archived logs not available on the file system (when set until SCN was not specified in the script). In those cases you might try following steps to recover, first run the switch clone command through RMAN (After CONNECTING to the TARGET, CATALOG and AUXILIARY Databases) :
run{
switch clone datafile all;
}
Afterwards, try recreating the control file. RMAN first creates a control file but does not have all data files specified in there (it creates that one later). Best way is to backup control file to trace on the target database, and modify that script to run in auxiliary database. Changes to the script are typically: use new filenames as the location might have changed, set new database name, and use RESETLOGS clause.
Once control file is created and executed, complete recovery of the database until specified SCN, the RMAN script can look something like this :
run {
allocate auxiliary channel ch1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=c:\cluster\tdpo.opt)';
set until scn 6899135273;
recover
clone database
check readonly;
release channel ch1;
}
This step will obtain all required archived logs from the file system and apply them to the database. After recovery is completed you can open the database:
alter database open resetlogs;
That would complete the recovery. Note that when recovered this way, the database Id is still the same as for production, thus you cannot use RMAN to backup that database (unless you are using different catalog). Consequently, one should always strive to have the database duplicated properly through RMAN without failures.

VI. Post Refresh Steps
----------------------
After the database is duplicated, there are few steps that might be required:
1. In some environments , it may be required to change the Mode to noarchivelog mode as the Production is mainly run in Archivelog mode.

2. Add files to temporary tablespaces. When the database is restored all files and tablespaces will exist, however, none of the temp files will be created. One needs to add tempfiles to temporary tablespaces.

3. Drop all database links, and recreate them to point to proper environment. After duplication, new database will have same database links as the production, thus pointing to the production database. All the database links should therefore be dropped, and new ones created to point to the new environment.

4. If the new database is running in the Fail Safe environment, one will need to rebuild the password file on the other node (the one that was not used when duplicating the database). If this is not done, the database will not start on that node and the whole Fail Safe group will be moved to other node.

5. Revert back changes to tnsnames.ora (and listener.ora if applicable) to make sure the database can start within Fail Safe.

6. Revert back changes done to the TSM configuration files (dsm.opt).

7. Shut the database, stop local listener and database services, and start the listener and database within Fail Safe.

8. Make sure the database can fall-over correctly to another node, by moving Fail Safe group manually.

No comments: