Thursday, August 30, 2007

Recovering from media failure in Oracle 10g

Recovering from media failure in Oracle 10g

Learning objective

After completing this topic, you should be able to recover from the loss of control files, redo log files, and datafiles.

1. Recovering from loss of a control file

The Oracle database has three copies of its control files by default, and you should maintain this number, or a minimum of two, at all times.

If a control file is lost, for example due to media failure, you should shut down the database using the SHUTDOWN ABORT command.

When the database has been shut down, you can restore the missing control file by copying one of the remaining control files to the expected location of the missing file.

If you can't create a new copy in the old location, perhaps because that disk has failed, then you should create a copy in a new location, and update the initialization parameter file with the new address.

Note

You could just delete the reference to the lost control file from the initialization parameter file, and restart the database. This is not recommended, however, because of the danger that the remaining control files will be lost.

Suppose that you receive user complaints saying that the database is down. You try to restart the database in Enterprise Manager, but the attempt fails. You then open SQL*Plus to investigate the problem.

First, you check the instance status, with the command

SELECT status FROM v$instance;

The status returned is STARTED, so you know that the instance is in the NOMOUNT stage.

You then attempt to mount the database, with the command

ALTER DATABASE MOUNT;

but the command fails. It returns an error that says a control file can't be identified, and advises you to check the alert log for more details.

The alert log tells you which of the control files is missing. You can then restore the missing control file by copying one of the remaining files. The database can then be reopened.

Question

Your system had two control files but one has been lost. Which of the following are recommended practices for recovering from control file loss?

Options:

  1. Change the CONTROL_FILES parameter so that it doesn't point at the missing file
  2. Check the alert logs
  3. Check instance status in V$INSTANCE view
  4. Make a copy of the remaining control file

Answer

You should check the instance status, read the alert log, and then make a copy of the remaining control file.

Option 1 is incorrect. If you only had the very minimum of two control files to start with, you shouldn't simply change the CONTROL_FILES parameter when a file is lost.

Option 2 is correct. The alert logs tell you which copy of the control file has been lost.

Option 3 is correct. The database is returned to the NOMOUNT state when a control file is lost. The instance status tells you what state the database is in.

Option 4 is correct. You should always have three copies of the control file, or at the very least two, in case one is lost. If you only have two and one is lost, you should immediately copy the other so that you have two copies again.

2. Recovering from loss of redo log files

The redo log files are organized into redo log groups. If one of these files is lost, a message will be sent to the alert log, though the instance can continue as long as at least one member of the redo log group is available.

If you lose a redo log file you should restore it as soon as possible in case the other group members are lost.

Suppose you see a message in the alert log telling you that a redo log file is missing.

ORA-00313: open failed for members of log group 1 of thread 1


ORA-00312: online log 1 thread 1:


'/oracle/oradata/orcl/redo01.log' ORA-27037: unable to obtain


file status Linux Error: 2: No such file or directory

You can restore the missing file by copying one of the remaining log files in the command line.

$ cp < remaining log file>< missing log file>

Question

Which of these are characteristics of redo log files?

Options:

  1. Redo log files can only be restored when the database is in MOUNT state
  2. The instance can continue with one member of each redo log group
  3. The instance will shut down if a redo log file is lost
  4. You can replace log files by copying existing files

Answer

The instance can continue with one member of each redo log group, and you can replace redo log files by copying existing files.

Option 1 is incorrect. You can restore redo log files while the database is open.

Option 2 is correct. You must have at least one member of each redo log group to continue. You should make sure that there are at least two members, however, to provide redundancy.

Option 3 is incorrect. If a redo log file is lost it will be recorded in the alert log. The instance will only shut down if all of the files in a redo log group are lost.

Option 4 is correct. If you lose a redo log file you can replace it by copying one of the other log files from the group.

3. Recovering from the loss of a datafile

A media failure could lead to the database losing datafiles. The steps you take to restore the datafile will depend on the tablespace the datafile belongs to, and the database mode:

  • NOARCHIVELOG mode
  • ARCHIVELOG mode
  • datafile from SYSTEM or UNDO tablespaces
NOARCHIVELOG mode
If the database is running in NOARCHIVELOG mode, you must perform a full recovery of the database to restore any datafile. This recovery will only restore the database up to the last backup.
ARCHIVELOG mode
If the database is in ARCHIVELOG mode you can recover a single datafile as long as it is not from the SYSTEM or UNDO tablespaces, leaving the rest of the database open for users to work with. The restored datafile will contain all changes made up to the last committed change.
datafile from SYSTEM or UNDO tablespaces
The datafiles in the SYSTEM and UNDO tablespaces are system-critical. You must shut down the database instance and place the database in the MOUNT state while these datafiles are being recovered.

Suppose you receive a message from a user saying that they can't access data from the EMPLOYEES table. The database is in NOARCHIVELOG mode. You check the table to see if there is a problem, with the command

SELECT count (*) FROM hr.employees

This tells you that one of the datafiles can't be found.

You open Enterprise Manager to try to recover the missing file.

You click the Maintenance tab.

You then select the Perform Recovery link.

Because the database is in NOARCHIVELOG mode, you must perform a complete recovery. The Whole Database option is automatically selected in the Object Type drop-down list, and you click Next to continue.

This will shut down the instance before directing you to a recovery wizard that you use to restore the whole database. The database will be recovered to the time of the last backup, so any changes made after that time must be re-entered.

Suppose the database was in ARCHIVELOG mode. After discovering the datafile that was missing, you could choose to recover only that file.

You open Enterprise Manager, and select the Perform Recovery link on the Maintenance tabbed page as before.

In this case you are only recovering a single file, so you accept the default Datafiles setting in the Object Type drop-down list.

By default, the datafile will be restored to the current time, though you can choose to restore it to a previous state. You accept the default option Recover to current time or a previous point-in-time.

You must enter your credentials, to prove that you are authorized to perform the recovery, and then click Next.

You then select the datafiles to be recovered.

You click Add to display the list of datafiles.

You click the Select checkbox for the users01.dbf file.

You click Next to accept the selected file.

And then click Next again to proceed.

You click Submit to perform the recovery.

If a datafile from the SYSTEM or UNDO tablespaces is lost, the database will shut down. Any attempts to restart will fail, and the database will be left in the MOUNT state.

You click Perform Recovery and follow the same sequence of steps as before.

When the recovery has succeeded, you can restart the database.

Question

How does recovering a SYSTEM datafile or a datafile belonging to an UNDO tablespace differ from recovering other datafiles?

Options:

  1. You can replace SYSTEM and UNDO tablespace files by copying remaining files
  2. You can only recover SYSTEM and UNDO datafiles if the database is in ARCHIVELOG mode
  3. SYSTEM or UNDO datafiles must be recovered from the MOUNT state

Answer

If any datafiles from the SYSTEM or UNDO tablespaces are lost, the database shuts down. The files must be recovered while the database is in MOUNT mode.

Option 1 is incorrect. You must recover the datafiles from backups with the database shut down.

Option 2 is incorrect. These datafiles must be recovered while the database is shut down, but can be recovered whether the database was in ARCHIVELOG or NOARCHIVELOG mode. With the database in NOARCHIVELOG mode, recovery is possible only up to the time of the last backup.

Option 3 is correct. The instance shuts down immediately if datafiles are lost from the SYSTEM or UNDO tablespaces. You must place the instance in MOUNT mode to recover the lost files.

Question

Your database is running in NOARCHIVELOG mode when a media failure causes you to lose the employees01.dbf datafile.

Select the option that recovers the lost file.

Options:

  1. Click the Maintenance tab, and select Perform Recovery. Then click Next.
  2. Click the Maintenance tab, and select Perform Recovery. Select Datafiles from the Recovery Type drop down list, and click Next, then select the employees01.dbf, click Next again, and click Submit.

Answer

You click the Maintenance link, and select Perform Recovery. Then you click Next.

Summary

The Oracle database has three copies of its control files by default, and you should maintain this number, or a minimum of two, at all times. When a control file is lost, the database instance stops and the database goes to the NOMOUNT state. You can reopen the database once you have restored the missing control file from a surviving copy.

Redo log files are organized into redo log file groups. There must be at least one member of each group available in the database, otherwise the instance will stop. A message is sent to the alert log whenever a redo log file is lost. When this happens, you should create a copy from one of the remaining members of the group.

If datafiles are lost with the database in NOARCHIVELOG mode, you must restore the entire database to recover them. In ARCHIVELOG mode you can recover just the missing datafiles. If any SYSTEM or UNDO tablespace datafiles are lost, the instance will shut down.

No comments: