- Subject: TECH: Internals of Recovery
- Type: REFERENCE
- Creation Date: 13-SEP-1996
- Oracle7 v7.2 Recovery Outline
- Authors: Andrea Borr & Bill Bridge
- Version: 1 May 3, 1995
- Abstract
- This document gives an overview of how database recovery works
- in Oracle7 version 7.2. It is assumed that the reader is familiar
- with the Database Administrator's Guide for Oracle7 version 7.2.
- The intention of this document is to describe the recovery
- algorithms and data structures, providing more details than the
- Administrator's Guide.
- Table of Contents
- 1 Introduction
- 1.1 Instance Recovery and Media Recovery: Common Mechanisms
- 1.2 Instance Failure and Recovery, Crash Failure and Recovery
- 1.3 Media Failure and Recovery
- 2 Fundamental Data Structures
- 2.1 Controlfile
- 2.1.1 Database Info Record (Controlfile)
- 2.1.2 Datafile Record (Controlfile)
- 2.1.3 Thread Record (Controlfile)
- 2.1.4 Logfile Record (Controlfile)
- 2.1.5 Filename Record (Controlfile)
- 2.1.6 Log-History Record (Controlfile)
- 2.2 Datafile Header
- 2.3 Logfile Header
- 2.4 Change Vector
- 2.5 Redo Record
- 2.6 System Change Number (SCN)
- 2.7 Redo Logs
- 2.8 Thread of Redo
- 2.9 Redo Byte Address (RBA)
- 2.10 Checkpoint Structure
- 2.11 Log History
- 2.12 Thread Checkpoint Structure
- 2.13 Database Checkpoint Structure
- 2.14 Datafile Checkpoint Structure
- 2.15 Stop SCN
- 2.16 Checkpoint Counter
- 2.17 Tablespace-Clean-Stop SCN
- 2.18 Datafile Offline Range
- 3 Redo Generation
- 3.1 Atomic Changes
- 3.2 Write-Ahead Log
- 3.3 Transaction Commit
- 3.4 Thread Checkpoint
- 3.5 Online-Fuzzy Bit
- 3.6 Datafile Checkpoint
- 3.7 Log Switch
- 3.8 Archiving Log Switches
- 3.9 Thread Open
- 3.10 Thread Close
- 3.11 Thread Enable
- 3.12 Thread Disable
- 4 Hot Backup
- 4.1 BEGIN BACKUP
- 4.2 File Copy
- 4.3 END BACKUP
- 4.4 "Crashed" Hot Backup
- 5 Instance Recovery
- 5.1 Detection of the Need for Instance Recovery
- 5.2 Thread-at-a-Time Redo Application
- 5.3 Current Online Datafiles Only
- 5.4 Checkpoints
- 5.5 Crash Recovery Completion
- 6 Media Recovery
- 6.1 When to Do Media Recovery
- 6.2 Thread-Merged Redo Application
- 6.3 Restoring Backups
- 6.4 Media Recovery Commands
- 6.4.1 RECOVER DATABASE
- 6.4.2 RECOVER TABLESPACE
- 6.4.3 RECOVER DATAFILE
- 6.5 Starting Media Recovery
- 6.6 Applying Redo, Media Recovery Checkpoints
- 6.7 Media Recovery and Fuzzy Bits
- 6.7.1 Media-Recovery-Fuzzy
- 6.7.2 Online-Fuzzy
- 6.7.3 Hotbackup-Fuzzy
- 6.8 Thread Enables
- 6.9 Thread Disables
- 6.10 Ending Media Recovery (Case of Complete Media Recovery)
- 6.11 Automatic Recovery
- 6.12 Incomplete Recovery
- 6.12.1 Incomplete Recovery UNTIL Options
- 6.12.2 Incomplete Recovery and Consistency
- 6.12.3 Incomplete Recovery and Datafiles Known to the
- Controlfile
- 6.12.4 Resetlogs Open after Incomplete Recovery
- 6.12.5 Files Offline during Incomplete Recovery
- 6.13 Backup Controlfile Recovery
- 6.14 CREATE DATAFILE: Recover a Datafile Without a Backup
- 6.15 Point-in-Time Recovery Using Export/Import
- 7 Block Recovery
- 7.1 Block Recovery Initiation and Operation
- 7.2 Buffer Header RBA Fields
- 7.3 PMON vs. Foreground Invocation
- 8 Resetlogs
- 8.1 Fuzzy Files
- 8.2 Resetlogs SCN and Counter
- 8.3 Effect of Resetlogs on Threads
- 8.4 Effect of Resetlogs on Redo Logs
- 8.5 Effect of Resetlogs on Online Datafiles
- 8.6 Effect of Resetlogs on Offline Datafiles
- 8.7 Checking Dictionary vs. Controlfile on Resetlogs Open
- 9 Recovery-Related V$ Fixed-Views
- 9.1 V$LOG
- 9.2 V$LOGFILE
- 9.3 V$LOG_HISTORY
- 9.4 V$RECOVERY_LOG
- 9.5 V$RECOVER_FILE
- 9.6 V$BACKUP
- 10 Miscellaneous Recovery Features
- 10.1 Parallel Recovery (v7.1)
- 10.1.1 Parallel Recovery Architecture
- 10.1.2 Parallel Recovery System Initialization Parameters
- 10.1.3 Media Recovery Command Syntax Changes
- 10.2 Redo Log Checksums (v7.2)
- 10.3 Clear Logfile (v7.2)
- 1 Introduction
- The Oracle RDBMS provides database recovery facilities capable
- of preserving database integrity in the face of two major failure
- modes:
- 1. Instance failure: loss of the contents of a buffer cache, or data
- residing in memory.
- 2. Media failure: loss of database file storage on disk.
- Each of these two major failure modes raises its own set of
- challenges for database integrity. For each, there is a set of
- requirements that a recovery utility addressing that failure mode
- must satisfy.
- Although recovery processing for the two failure modes has much
- in common, the requirements differ enough to motivate the
- implementation of two different recovery facilities:
- 1. Instance recovery: recovers data lost from the buffer cache
- due to instance failure.
- 2. Media recovery: recovers data lost from disk storage.
- 1.1 Instance Recovery and Media Recovery: Common Mechanisms
- Both instance recovery and media recovery depend for their
- operation on the redo log. The redo log is organized into redo
- threads, referred to hereafter simply as threads. The redo log of a
- single-instance (non-Parallel Server option) database consists of a
- single thread. A Parallel Server redo log has a thread per instance.
- A redo log thread is a set of operating system files in which an
- instance records all changes it makes - committed and
- uncommitted - to memory buffers containing datafile blocks.
- Since this includes changes made to rollback segment blocks, it
- follows that rollback data is also (indirectly) recorded in the redo
- log.
- The first phase of both instance and media recovery processing is
- roll-forward. Roll-forward is the task of the RDBMS recovery
- layer. During roll-forward, changes recorded in the redo log are re-
- applied (as needed) to the datafiles. Because changes to rollback
- segment blocks are recorded in the redo log, roll-forward also
- regenerates the corresponding rollback data. When the recovery
- layer finishes its task, all changes recorded in the redo log have
- been restored by roll-forward. At this point, the datafile blocks
- contain not only all committed changes, but also any uncommitted
- changes recorded in the redo log.
- The second phase of both instance and media recovery processing
- is roll-back. Roll-back is the task of the RDBMS transaction layer.
- During roll-back, undo information from rollback segments (as
- well as from save-undo/deferred rollback segments, if appropriate)
- is used to undo uncommitted changes that were applied during the
- roll-forward phase.
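- The two phases described above can be sketched as follows. This is an
- illustrative Python sketch, not Oracle code; the structures and names
- (roll_forward, roll_back, the block/undo representations) are
- hypothetical simplifications.

```python
# A minimal, self-contained sketch (not Oracle source) of the two recovery
# phases: roll-forward by the recovery layer, then roll-back by the
# transaction layer. All names and structures here are hypothetical.

def roll_forward(blocks, redo_log):
    """Phase 1 (recovery layer): reapply all logged changes, committed or not."""
    for block_id, value, txn in redo_log:   # each record: (block_id, new_value, txn)
        blocks[block_id] = value            # includes rollback-segment blocks,
                                            # so undo data is regenerated too

def roll_back(blocks, undo, committed):
    """Phase 2 (transaction layer): undo changes of uncommitted transactions."""
    for txn, (block_id, old_value) in undo.items():
        if txn not in committed:
            blocks[block_id] = old_value

blocks = {}                                  # post-crash datafile blocks
redo = [("b1", "v1", "T1"), ("b2", "v2", "T2")]
undo = {"T1": ("b1", None), "T2": ("b2", None)}
roll_forward(blocks, redo)                   # blocks now hold T1 and T2 changes
roll_back(blocks, undo, committed={"T1"})    # T2 was uncommitted: undone
```

- After both phases, the blocks hold all committed changes and no
- uncommitted ones, which is the end state the text describes.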
- 1.2 Instance Failure and Recovery, Crash Failure and Recovery
- Instance failure, a failure resulting in the loss of the instance's
- buffer cache, occurs when an instance is aborted, whether
- unexpectedly or deliberately. Unexpected instance aborts result,
- for example, from an operating system crash, a power failure, or a
- background process failure. Deliberate instance aborts result from
- the commands SHUTDOWN ABORT and STARTUP FORCE.
- Crash failure is the failure of all instances accessing a database. In
- the case of a single-instance (non-Parallel Server option) database,
- the terms crash failure and instance failure are used
- interchangeably. Crash recovery (equivalent to instance recovery in
- this case) is the process of recovering all online datafiles to a
- consistent state following a crash. This is done automatically in
- response to the ALTER DATABASE OPEN command.
- In the case of the Parallel Server option, the term crash failure is
- used to refer to the simultaneous failures of all open instances.
- Parallel Server crash recovery is the process of recovering all
- online datafiles to a consistent state after all instances accessing the
- database have failed. This is done automatically in response to the
- ALTER DATABASE OPEN command. Parallel Server instance
- failure refers to the failure of an instance while a surviving instance
- continues in operation. Parallel Server instance recovery is the
- automatic recovery by a surviving instance of a failed instance.
- Instance failure impairs database integrity because it results in loss
- of the instance's dirty buffer cache. A "dirty" buffer is one whose
- memory version differs from its disk version. An instance that
- aborts has no opportunity for writing out "dirty" buffers so as to
- prevent database integrity breakage on disk following a crash. Loss
- of the dirty buffer cache is a problem due to the fact that the cache
- manager uses algorithms optimized for OLTP performance rather
- than for crash-tolerance. Examples of performance-optimizing
- cache management algorithms that make the task of instance
- recovery more difficult are as follows:
- * LRU (least recently used) based buffer replacement
- * no-datablock-force-at-commit (see 3.3).
- As a consequence of the performance-oriented cache management
- algorithms, instance failure can cause database integrity breakage
- as follows:
- A. At crash time, the datafiles on disk might contain some but not
- all of a set of datablock changes that constitute a single atomic
- change to the database with respect to structural integrity
- (see 2.5).
- B. At crash time, the datafiles on disk might contain some
- datablocks modified by uncommitted transactions.
- C. At crash time, the datafiles on disk might contain some
- datablocks missing changes from committed transactions.
- During instance recovery, the RDBMS recovery layer repairs
- database integrity breakages A and C. It also enables subsequent
- repair - by the RDBMS transaction layer - of database integrity
- breakage B.
- In addition to the requirement that it repair any integrity breakages
- resulting from the crash, instance recovery must meet the following
- requirements:
- 1. Instance recovery must accomplish the repair using the current
- online datafiles (as left on disk after the crash).
- 2. Instance recovery must use only the online redo logs. It must
- not require use of the archived logs. Although instance
- recovery could work successfully from archived logs (except for a
- database running in NOARCHIVELOG mode), it could not
- work autonomously (requirement 4) if an operator were
- required to restore archived logs.
- 3. The invocation of instance recovery must be automatic,
- implicit at the next database startup.
- 4. Detection of the need for repair and the repair itself must
- proceed autonomously, without operator intervention.
- 5. The duration of the roll-forward phase of instance recovery is
- governed by both RDBMS internal mechanisms (checkpoint)
- and user-configurable parameters (e.g. number and sizes of
- logfiles, checkpoint-frequency tuning parameters, parallel
- recovery parameters).
- As seen above, Oracle's buffer cache component is optimized for
- OLTP performance rather than for crash-tolerance. This document
- describes some of the mechanisms used by the cache and recovery
- components to solve the problems posed by use of performance-
- optimizing cache algorithms such as LRU buffer replacement and
- no-datablock-force-at-commit. These mechanisms enable instance
- recovery to meet its requirements while allowing optimal OLTP
- performance. These mechanisms include:
- * Log-Force-at-Commit: see 3.3.
- Facilitates repair of breakage type C by guaranteeing that, at
- transaction commit time, all of the transaction's redo records,
- including its "commit record," are stored on disk in the on-line
- redo log.
- * Checkpointing: see 3.4, 3.6.
- Bounds the amount of transaction redo that instance recovery
- must potentially apply.
- Works in conjunction with online-log switch management to
- ensure that instance recovery can be accomplished using only
- online logs and current online datafiles.
- * Online-Log Switch Management: see 3.7.
- Works in conjunction with checkpointing to ensure that
- instance recovery can be accomplished using only online logs
- and current online datafiles. It guarantees that the current
- checkpoint is beyond an online logfile before that logfile is
- reused.
- * Write-Ahead-Log: see 3.2.
- Facilitates repair of breakage types A and B by guaranteeing
- that: (i) at crash time there are no changes in the datafiles that
- are not in the redo log; (ii) no datablock change was written to
- disk without first writing to the log sufficient information to
- enable undo of the change should a crash intervene before
- commit.
- * Atomic Redo Record Generation: see 3.1.
- Facilitates repair of breakage types A and B.
- * Thread-Open Flag: see 5.1.
- Enables detection at startup time of the need for crash
- recovery.
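- The interplay of write-ahead-log and log-force-at-commit can be
- sketched as follows. This is a hypothetical simplification, not the
- real DBWR/LGWR protocol; the Log class, RBA-as-record-count, and all
- names are invented for illustration.

```python
# Hedged sketch (hypothetical names, not Oracle code) of the write-ahead-log
# rule: a dirty buffer may reach disk only after the redo describing its
# changes is on disk, and commit forces only the log, not the datablocks.

class Log:
    def __init__(self):
        self.records, self.flushed_to = [], 0
    def append(self, rec):
        self.records.append(rec)
        return len(self.records)            # "RBA", simplified to a record count
    def flush_to(self, rba):                # LGWR: force log to disk up to rba
        self.flushed_to = max(self.flushed_to, rba)

def write_buffer(log, buffer):
    """DBWR path: enforce write-ahead before a datablock write."""
    if buffer["last_rba"] > log.flushed_to:
        log.flush_to(buffer["last_rba"])    # write-ahead: flush redo first
    buffer["on_disk"] = True

def commit(log, txn_rba):
    """Log-force-at-commit: only the redo is forced, not the datablocks."""
    log.flush_to(txn_rba)

log = Log()
buf = {"last_rba": log.append("change b1"), "on_disk": False}
commit(log, buf["last_rba"])                # commit record on disk; block may stay dirty
write_buffer(log, buf)                      # later, DBWR writes the block
```

- The commit call illustrates no-datablock-force-at-commit: the block
- can remain dirty in the cache long after the transaction commits.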
- 1.3 Media Failure and Recovery
- Instance failure affects logical database integrity. Because instance
- failure leaves a recoverable version of the online datafiles on the
- post-crash disk, instance recovery can use the online datafiles as a
- starting point.
- Media failure, on the other hand, affects physical storage media
- integrity or accessibility. Because the original datafile copies are
- damaged, media recovery uses restored backup copies of the
- datafiles as a starting point. Media recovery then uses the redo log
- to roll-forward these files, either to a consistent present state or to a
- consistent past state. Media recovery is run by issuing one of the
- following commands: RECOVER DATABASE, RECOVER
- TABLESPACE, RECOVER DATAFILE.
- Depending on the failure scenario, a media failure has the potential
- for causing database integrity breakages similar to those caused by
- an instance failure. For example, an integrity breakage of type A,
- B, or C could result if I/O accessibility to a datablock were lost
- between the time the block was read into the buffer cache and the
- time DBWR attempted to write out an updated version of the
- block. More typical, however, is the case of a media failure that
- results in the permanent loss of the current version of a datafile, and
- hence of all updates to that datafile that occurred since the last time
- the file was backed up.
- Before media recovery is invoked, backup copies of the damaged
- datafiles are restored. Media recovery then applies relevant
- portions of the redo log to roll-forward the datafile backups,
- making them current. Current implies a pre-failure state consistent
- with the rest of the database.
- Media recovery and instance recovery have in common the
- requirement to repair database integrity breakages A-C. However,
- media recovery and instance recovery differ with respect to
- requirements 1-5. The requirements for media recovery are as
- follows:
- 1. Media recovery must accomplish the repair using restored
- backups of damaged datafiles.
- 2. Media recovery can use archived logs as well as the online
- logs.
- 3. Invocation of media recovery is explicit, by operator
- command.
- 4. Detection of media failure (i.e. the need to restore a backup) is
- not automatic. Once a backup has been restored, however,
- detection of the need to recover it via media recovery is
- automatic.
- 5. The duration of the roll-forward phase of media recovery is
- governed solely by user policy
- (e.g. frequency of backups, parallel recovery parameters)
- rather than by RDBMS internal mechanisms.
- 2 Fundamental Data Structures
- 2.1 Controlfile
- The controlfile contains records that describe and keep state
- information about all the other files of the database.
- The controlfile contains the following categories of records:
- * Database Info Record (1)
- * Datafile Records (1 per datafile)
- * Thread Records (1 per thread)
- * Logfile Records (1 per logfile)
- * Filename Records (1 per datafile or logfile group member)
- * Log-History Records (1 per completed logfile)
- Fields of the controlfile records referenced in the remainder of this
- document are listed below, together with the number(s) of the
- section(s) describing their use:
- 2.1.1 Database Info Record (Controlfile)
- * resetlogs timestamp: 8.2
- * resetlogs SCN: 8.2
- * enabled thread bitvec: 8.3
- * force archiving SCN: 3.8
- * database checkpoint thread (thread record index): 2.13, 3.10
- 2.1.2 Datafile Record (Controlfile)
- * checkpoint SCN: 2.14, 3.4
- * checkpoint counter: 2.16, 5.3, 6.2
- * stop SCN: 2.15, 6.5, 6.10, 6.13
- * offline range (offline-start SCN, offline-end checkpoint): 2.18
- * online flag
- * read-enabled, write-enabled flags (1-1: read/write, 1-0:
- read-only)
- * filename record index
- 2.1.3 Thread Record (Controlfile)
- * thread checkpoint structure: 2.12, 3.4, 8.3
- * thread-open flag: 3.9, 3.11, 8.3
- * current log (logfile record index)
- * head and tail (logfile record indices) of list of logfiles in
- thread: 2.8
- 2.1.4 Logfile Record (Controlfile)
- * log sequence number: 2.7
- * thread number: 8.4
- * next and previous (logfile record indices) of list of logfiles in
- thread: 2.8
- * count of files in group: 2.8
- * low SCN: 2.7
- * next SCN: 2.7
- * head and tail (filename record indices) of list of filenames in
- group: 2.8
- * "being cleared" flag: 10.3
- * "archiving not needed" flag: 10.3
- 2.1.5 Filename Record (Controlfile)
- * filename
- * filetype
- * next and previous (filename record indices) of list of filenames
- in group: 2.8
- 2.1.6 Log-History Record (Controlfile)
- * thread number: 2.11
- * log sequence number: 2.11
- * low SCN: 2.11
- * low SCN timestamp: 2.11
- * next SCN: 2.11
- 2.2 Datafile Header
- Fields of the datafile header referenced in the remainder of this
- document are listed below, together with the number(s) of the
- section(s) describing their use:
- * datafile checkpoint structure: 2.14
- * backup checkpoint structure: 4.1
- * checkpoint counter: 2.16, 3.4, 5.3, 6.2
- * resetlogs timestamp: 8.2
- * resetlogs SCN: 8.2
- * creation SCN: 8.1
- * online-fuzzy bit: 3.5, 6.7.1, 8.1
- * hotbackup-fuzzy bit: 4.1, 4.4, 6.7.1, 8.1
- * media-recovery-fuzzy bit: 6.7.1, 8.1
- 2.3 Logfile Header
- Fields of the logfile header referenced in the remainder of this
- document are listed below, together with the number(s) of the
- section(s) describing their use:
- * thread number: 2.7
- * sequence number: 2.7
- * low SCN: 2.7
- * next SCN: 2.7
- * end-of-thread flag: 6.10
- * resetlogs timestamp: 8.2
- * resetlogs SCN: 8.2
- 2.4 Change Vector
- A change vector describes a single change to a single datablock. It
- has a header that gives the Data Block Address (DBA) of the block,
- the incarnation number, the sequence number, and the operation.
- After the header is information that depends on the operation. The
- incarnation number and sequence number are copied from the
- block header when the change vector is constructed. When a block
- is made "new," the incarnation number is set to a value that is
- greater than its previous incarnation number and the sequence
- number is set to one. The sequence number on the block is
- incremented after every change is applied.
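- The incarnation/sequence bookkeeping above can be sketched as a
- version check. This is a deliberate simplification (hypothetical
- structures and names); the real applicability test during recovery is
- more involved than a plain equality comparison.

```python
# Illustrative sketch (not Oracle code): deciding whether a change vector
# applies to a block version, using the incarnation and sequence numbers
# copied into the vector header when it was constructed.

def needs_change(block, cv):
    """The vector was built against (inc, seq); in this simplified model it
    applies only if the block is at exactly that version, i.e. the change
    is not yet reflected in the block."""
    return (block["inc"], block["seq"]) == (cv["inc"], cv["seq"])

def apply_change(block, cv):
    if needs_change(block, cv):
        block["data"] = cv["new_data"]
        block["seq"] += 1                   # sequence advances after every change

block = {"inc": 3, "seq": 1, "data": "old"}
cv = {"inc": 3, "seq": 1, "new_data": "new"}
apply_change(block, cv)                     # applied: block moves to seq 2
apply_change(block, cv)                     # replayed during recovery: no-op
```

- The second call models redo replay: a change already present on the
- block is recognized by its version numbers and skipped.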
- 2.5 Redo Record
- A redo record is a group of change vectors describing a single
- atomic change to the database. For example, a transaction's first
- redo record might group a change vector for the transaction table
- (rollback segment header), a change vector for the undo block
- (rollback segment), and a change vector for the datablock. A
- transaction can generate multiple redo records. The grouping of
- change vectors into a redo record allows multiple database blocks
- to be changed so that either all changes occur or no changes occur,
- despite arbitrary intervening failures. This atomicity guarantee is
- one of the fundamental jobs of the cache layer. Recovery preserves
- redo record atomicity across failures.
- 2.6 System Change Number (SCN)
- An SCN defines a committed version of the database. A query
- reports the contents of the database as it looked at some specific
- SCN. An SCN is allocated and saved in the header of a redo record
- that commits a transaction. An SCN may also be saved in a record
- when it is necessary to mark the redo as being allocated after a
- specific SCN. SCN's are also allocated and stored in other data
- structures such as the controlfile or datafile headers. An SCN is at
- least 48 bits long, so SCN's can be allocated at a rate of 16,384
- per second for over 534 years without running out of them: at that
- rate we would run out of SCN's in June, 2522 AD (timestamps use
- 31-day months).
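- The arithmetic behind that claim works out as follows, using the
- document's 31-day months (12 * 31 days per "year" for timestamps):

```python
# Worked arithmetic for the SCN-exhaustion claim above.

scn_space = 2 ** 48                     # at least 48-bit SCNs
rate = 16_384                           # SCNs allocated per second
seconds_per_year = 12 * 31 * 24 * 3600  # 32,140,800 with 31-day months

years = scn_space / rate / seconds_per_year
print(round(years, 1))                  # a bit over 534 "years"
```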
- 2.7 Redo Logs
- All changes to database blocks are made by constructing a redo
- record for the change, saving this record in a redo log, then
- applying the change vectors to the datablocks. Recovery is the
- process of applying redo to old versions of datablocks to make
- them current. This is necessary when the current version has been
- lost.
- When a redo log becomes full it is closed and a log switch occurs.
- Each log is identified by its thread number (see below), sequence
- number (within thread), and the range of SCN's spanned by its redo
- records. This information is stored in the thread number, sequence
- number, low SCN, and next SCN fields of the logfile header.
- The redo records in a log are ordered by SCN. Moreover, redo
- records containing change vectors for a given block occur in
- increasing SCN order across threads (case of Parallel Server). Only
- some records have SCN's in their header, but every record is
- applied after the allocation of the SCN appearing with or before it
- in the log. The header of the log contains the low SCN and the next
- SCN. The low SCN is the SCN associated with the first redo record
- (unless there is an SCN in its header). The next SCN is the low
- SCN of the log with the next higher sequence number for the same
- thread. The current log of an enabled thread has an infinite next
- SCN, since there is no log with a higher sequence number.
- 2.8 Thread of Redo
- The redo generated by an instance - by each instance in the
- Parallel Server case - is called a thread of redo. A thread is
- comprised of an online portion and (in ARCHIVELOG mode) an
- archived portion. The online portion of a thread is comprised of
- two or more online logfile groups. Each group is comprised of one
- or more replicated members. The set of members in a group is
- referred to variously as a logfile group, group, redo log, online log,
- or simply log. A redo log contains only redo generated by one
- thread. Log sequence numbers are independently allocated for each
- thread. Each thread switches logs independently.
- For each logfile, there is a controlfile record that describes it. The
- index of a log's controlfile record is referred to as its log number.
- Note that log numbers are equivalent to log group numbers, and are
- globally unique (across all threads). The list of a thread's logfile
- records is anchored in the thread record (i.e. via head and tail
- logfile record indices), and linked through the logfile records, each
- of which stores the thread number. The logfile record also has fields
- identifying the number of group members, as well as the head and
- tail (i.e. filename record indices) of the list (linked through
- filename records) of filenames in the group.
- 2.9 Redo Byte Address (RBA)
- An RBA points to a specific location in a particular redo thread. It
- is ten bytes long and has three components: log sequence number,
- block number within log, and byte number within block.
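- One plausible rendering of an RBA is a tuple ordered component by
- component. The 4+4+2 byte split below is an assumption chosen to add
- up to the stated ten bytes, not a documented layout.

```python
# Hypothetical sketch of an RBA: ten bytes, three components, ordered
# lexicographically within one redo thread. The per-field byte widths
# (4 + 4 + 2) are an assumption for illustration.

from typing import NamedTuple

class RBA(NamedTuple):
    seq: int       # log sequence number (assumed 4 bytes)
    block: int     # block number within the log (assumed 4 bytes)
    byte: int      # byte offset within the block (assumed 2 bytes)

# NamedTuple comparison gives the natural ordering of positions in a thread:
a = RBA(seq=7, block=12, byte=16)
b = RBA(seq=8, block=1, byte=0)
assert a < b                       # a later log sequence is always "after"
```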
- 2.10 Checkpoint Structure
- The checkpoint structure is a data structure that defines a point in
- all the redo ever generated for a database. Checkpoint structures
- are stored in datafile headers and in the per-thread records of the
- controlfile. They are used by recovery to know where to start
- reading the log thread(s) for redo application.
- The key fields of the checkpoint structure are the checkpoint SCN
- and the enabled thread bitvec.
- The checkpoint SCN effectively demarcates a specific location in
- each enabled thread (for a definition of enabled see 3.11). For each
- thread, this location is where redo was being generated at some
- point in time within the resolution of one commit. The redo record
- headers in the log can be scanned to find the first redo record that
- was allocated at the checkpoint SCN or higher.
- The enabled thread bitvec is a mask defining which threads were
- enabled at the time the checkpoint SCN was allocated. Note that a
- bit is set for each thread that was enabled, regardless of whether it
- was open or closed. Every thread that was enabled has a redo log
- that contains the checkpoint SCN. A log containing this SCN is
- guaranteed to exist (either online or archived).
- The checkpoint structure also stores the time that the checkpoint
- SCN was allocated. This timestamp is only used to print a message
- to aid a person looking for a log.
- In addition, the checkpoint structure stores the number of the
- thread that allocated the checkpoint SCN and the current RBA in
- that thread when the checkpoint SCN was allocated. Having an
- explicitly-stored thread RBA (as opposed to only having the
- checkpoint SCN as an implicit thread location "pointer") makes the
- log sequence number (part of the RBA) and archived log name
- readily available for the single-instance (i.e. single-thread,
- non-Parallel Server) case.
- A checkpoint structure for a port that supports up to 1023 threads
- of redo is 150 bytes long. A VMS checkpoint is 30 bytes and
- supports up to 63 threads of redo.
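- The fields enumerated above can be collected into a hypothetical
- structure like the following. This is a sketch of the logical content
- only, not the on-disk layout (whose size varies by port, as noted).

```python
# Hypothetical rendering (not the real layout) of the checkpoint structure
# fields described in this section.

from dataclasses import dataclass

@dataclass
class Checkpoint:
    scn: int               # checkpoint SCN
    enabled_threads: int   # bitvec: bit t set if thread t was enabled
    timestamp: int         # when the SCN was allocated (used only in messages)
    thread: int            # number of the thread that allocated the SCN
    rba: tuple             # current RBA in that thread at allocation time

    def thread_enabled(self, t: int) -> bool:
        return bool(self.enabled_threads >> t & 1)

ckpt = Checkpoint(scn=1000, enabled_threads=0b101, timestamp=0,
                  thread=0, rba=(42, 1, 0))
assert ckpt.thread_enabled(0) and not ckpt.thread_enabled(1)
```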
- 2.11 Log History
- The controlfile can be configured (using the MAXLOGHISTORY
- clause of the CREATE DATABASE or CREATE CONTROLFILE
- command) to contain a history record for every logfile that is
- completed. Log history records are small (24 bytes on VMS). They
- are overwritten in a circular fashion so that the oldest information
- is lost.
- For each logfile, the log-history controlfile record contains the
- thread number, log sequence number, low SCN, low SCN
- timestamp, and next SCN (i.e. low SCN of the next log in
- sequence). The purpose of the log history is to reconstruct archived
- logfile names from an SCN and thread number. Since a log
- sequence number is contained in the checkpoint structure (part of
- the RBA), single thread (i.e. non-Parallel Server) databases do not
- need log history to construct archived log names.
- The fields of the log history records are viewable via the
- V$LOG_HISTORY "fixed-view" (see Section 9 for a description
- of the recovery-related "fixed-views"). Additionally,
- V$RECOVERY_LOG, which displays information about archived
- logs needed to complete media recovery, is derived from
- information in the log history records. Although log history is not
- strictly needed for easy administration of single-instance (non-
- Parallel Server) databases, enabling use of V$LOG_HISTORY and
- V$RECOVERY_LOG might be a reason to configure it.
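- The log history's core lookup, reconstructing which log covers a
- given SCN in a given thread, can be sketched as below. The record
- layout is hypothetical; building the actual archived filename from
- the returned sequence number depends on the archive name format.

```python
# Hedged sketch (hypothetical records) of the log history's purpose:
# mapping a thread number and SCN to the log sequence number, from which
# an archived log name can be constructed.

def find_log_seq(history, thread, scn):
    for t, seq, low, nxt in history:    # rec: (thread, seq, low SCN, next SCN)
        if t == thread and low <= scn < nxt:
            return seq
    return None    # record overwritten (circular reuse) or never present

history = [(1, 17, 900, 1000), (1, 18, 1000, 1200), (2, 5, 950, 1100)]
assert find_log_seq(history, 1, 1050) == 18
assert find_log_seq(history, 2, 1050) == 5
assert find_log_seq(history, 1, 100) is None
```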
- 2.12 Thread Checkpoint Structure
- Each enabled thread's controlfile record contains a checkpoint
- structure called the thread checkpoint. The SCN field in this
- structure is known as the thread checkpoint SCN. The thread
- number and RBA fields in this structure refer to the associated
- thread.
- The thread checkpoint structure is updated each time an instance
- checkpoints its thread (see 3.4). During such thread checkpoint
- events, the instance associated with the thread writes to disk in the
- online datafiles all dirty buffers modified by redo generated before
- the thread checkpoint SCN.
- A thread checkpoint event guarantees that all pre-thread-
- checkpoint-SCN redo generated in that thread for all online
- datafiles has been written to disk. (Note that if the thread is closed,
- then there is no redo beyond the thread checkpoint SCN; i.e. the
- RBA points just past the last redo record in the current log.)
- It is the job of instance recovery to ensure that all of the thread's
- redo for all online datafiles is applied. Because of the guarantee
- that all of the thread's redo prior to the thread checkpoint SCN has
- already been applied, instance recovery can make the guarantee
- that, by starting redo application at the thread checkpoint SCN, and
- continuing through end-of-thread, all of the thread's redo will have
- been applied.
- 2.13 Database Checkpoint Structure
- The database checkpoint structure is the thread checkpoint of the
- thread that has the lowest checkpoint SCN of all the open threads.
- The number of the database checkpoint thread - the number of
- the thread whose thread checkpoint is the current database
- checkpoint - is recorded in the database info record of the
- controlfile. If there are no open threads, then the database
- checkpoint is the thread checkpoint that contains the highest
- checkpoint SCN of all the enabled threads.
- Since each instance guarantees that all redo generated before its
- own thread checkpoint SCN has been written, and since the
- database checkpoint SCN is the lowest of the thread checkpoint
- SCNs, it follows that all pre-database-checkpoint-SCN redo in all
- instances has been written to all online datafiles.
- Thus, all pre-database-checkpoint-SCN redo generated in all
- threads for all online datafiles is guaranteed to be in the files on
- disk already. This is described by saying that the online datafiles
- are checkpointed at the database checkpoint. This is the rationale
- for using the database checkpoint to update the online datafile
- checkpoints (see below) when an instance checkpoints its thread
- (see 3.4).
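- The selection rule for the database checkpoint, lowest thread
- checkpoint SCN among open threads, else highest among enabled
- threads, can be sketched as follows (hypothetical structures):

```python
# Sketch (not Oracle code) of choosing the database checkpoint among the
# per-thread checkpoints, per the rule described above.

def database_checkpoint(threads):
    """threads: dicts with 'ckpt_scn', 'open', 'enabled' fields."""
    open_threads = [t for t in threads if t["open"]]
    if open_threads:
        # Lowest checkpoint SCN of all open threads.
        return min(open_threads, key=lambda t: t["ckpt_scn"])
    # No open threads: highest checkpoint SCN of all enabled threads.
    enabled = [t for t in threads if t["enabled"]]
    return max(enabled, key=lambda t: t["ckpt_scn"])

threads = [
    {"name": 1, "ckpt_scn": 500, "open": True,  "enabled": True},
    {"name": 2, "ckpt_scn": 450, "open": True,  "enabled": True},
    {"name": 3, "ckpt_scn": 700, "open": False, "enabled": True},
]
assert database_checkpoint(threads)["name"] == 2
```

- Taking the minimum over open threads is what makes the guarantee
- hold: all redo below that SCN has been written by every instance.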
- 2.14 Datafile Checkpoint Structure
- The header of each datafile contains a checkpoint structure known
- as the datafile checkpoint. The SCN field in this structure is known
- as the datafile checkpoint SCN.
- All pre-checkpoint-SCN redo generated in all threads for a given
- datafile is guaranteed to be in the file on disk already. An online
- datafile has its checkpoint SCN replicated in its controlfile record.
- Note: Oracle's recovery layer code is designed to "tolerate" a
- discrepancy in checkpoint SCN between the file header and the
- controlfile record. These values could get out of sync should an
- instance failure occur between the time the file header was updated
- and the time the controlfile "transaction" committed. (Note: A
- controlfile "transaction" is an RDBMS internal mechanism,
- independent of the Oracle transaction layer, that allows an
- arbitrarily large update to the controlfile to be "committed"
- atomically.)
- The execution of a datafile checkpoint (see 3.6) for a given datafile
- updates the checkpoint structure in the file header, and guarantees
- that all pre-checkpoint-SCN redo generated in all threads for that
- datafile is on disk already.
- A thread checkpoint event (see 3.4) guarantees that all pre-
- database-checkpoint-SCN redo generated in all threads for all
- online datafiles has been written to disk. The execution of a thread
- checkpoint may advance the database checkpoint (e.g. in the
- single-instance case; or if the thread having the oldest checkpoint
- changed from being the current thread to another thread). If the
- database checkpoint does advance, then the new database
- checkpoint is used to update the datafile checkpoints of all the
- online datafiles (except those in hot backup: see Section 4).
- It is the job of media recovery (see Section 6) to ensure that all redo
- for a recovery-datafile (i.e. a datafile being media-recovered)
- generated in any thread through the recovery end-point is applied.
- Because of the guarantee that all recovery-datafile-redo generated
- in any enabled thread prior to that datafile's checkpoint SCN has
- already been applied, media recovery can make the guarantee that,
- by starting redo application in each enabled thread with the datafile
- checkpoint SCN and continuing through the recovery end-point
- (e.g. end-of-thread on all threads in the case of complete media
- recovery), all redo for the recovery-datafile from all threads will
- have been applied.
- Since the datafile checkpoint is stored in the header of the datafile
- itself, it is also present in backup copies of the datafile. It is the job
- of hot backup (see Section 4) to ensure that - despite the
- occurrence of ongoing updates to the datafile during the backup
- copy operation - the version of the datafile's checkpoint captured
- in the backup copy satisfies the checkpoint-SCN guarantee with
- respect to the versions of the datafile's datablocks captured in the
- backup copy.
- 2.15 Stop SCN
- Each datafile's controlfile record has a field called the stop SCN. If
- the file is offline or read-only, the stop SCN is the SCN beyond
- which no further redo exists for that datafile. If the file is online and
- any instance has the database open, the stop SCN is set to
- "infinity." The stop SCN is used during media recovery to
- determine when redo application for a particular datafile can stop.
- This ensures that media recovery will terminate when recovering
- an offline file while the database is open.
- The stop SCN is set whenever a datafile is taken offline or set
- read-only. This is true whether the offline was "immediate" (due to
- an I/O error, or due to taking the file's tablespace offline "immediate"),
- "temporary" (due to taking the file's tablespace offline
- "temporary"), or "normal" (due to taking the file's tablespace
- offline "normal"). However, in the case of a datafile taken offline
- "immediate," there is no file checkpoint (see 3.6), and dirty buffers
- are discarded. Hence, media recovery may need to apply redo from
- before the stop SCN in order to bring the datafile online. However,
- media recovery does not need to look for redo after the stop SCN,
- since it does not exist. If the stop SCN is equal to the datafile
- checkpoint SCN, then the file does not need recovery.
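- The stop-SCN decisions above can be sketched as follows. This is an
- illustrative Python sketch; the names are ours, not actual kernel
- structures.

```python
# Illustrative sketch of the stop-SCN logic. INFINITY stands for the
# special value the stop SCN holds while the file is online and the
# database is open.
INFINITY = float("inf")

def needs_media_recovery(checkpoint_scn, stop_scn):
    # If the stop SCN equals the datafile checkpoint SCN, no redo beyond
    # the checkpoint exists, so the file needs no recovery.
    return stop_scn != checkpoint_scn

def redo_application_may_stop(next_redo_scn, stop_scn):
    # Media recovery for this file can stop once redo application reaches
    # the stop SCN: no redo for the file exists at or beyond it.
    return next_redo_scn >= stop_scn
```

- Note that, as described above, a file taken offline "immediate" may
- still need redo from before its stop SCN; the sketch covers only the
- stop test and the needs-recovery test.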
- 2.16 Checkpoint Counter
- There is a checkpoint counter kept in both the datafile header and
- in the datafile's controlfile record. Its purpose is to allow detection
- of the fact that a datafile or controlfile is a restored backup.
- The checkpoint counter is incremented every time checkpoints of
- online files are being advanced (e.g. by thread checkpoint). The
- datafile's checkpoint counter is thus incremented even when the
- datafile's checkpoint itself is not advanced - whether because the
- file is in hot backup (see Section 4), or because its checkpoint SCN
- is already beyond that of the intended checkpoint (e.g. the file is
- new or has undergone a recent datafile checkpoint).
- The old value of the checkpoint counter - matching the
- checkpoint counter in the datafile's controlfile record - is also
- remembered in the file header. It is usually one less than the current
- counter in the header, but may differ from the current counter by
- more than one if the previous file header update failed after the
- header was written but before the controlfile "transaction"
- committed.
- A mismatch in checkpoint counters between the datafile header and
- the datafile's controlfile record is used to detect when a backup
- datafile (or a backup controlfile) has been restored.
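- A minimal sketch of the detection rule (hypothetical field names):

```python
def is_restored_backup(header_ckpt_counter, controlfile_ckpt_counter):
    # The two counters are advanced together at every checkpoint of
    # online files; a mismatch means the datafile header (or the
    # controlfile) is a restored backup copy.
    return header_ckpt_counter != controlfile_ckpt_counter
```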
- 2.17 Tablespace-Clean-Stop SCN
- TS$, a data dictionary table that describes tablespaces, has a
- column called the tablespace-clean-stop-SCN. It identifies an SCN
- at which a tablespace was taken offline or set read-only "cleanly":
- i.e. after checkpointing its datafiles (see 3.6). The SCN at which the
- datafiles are checkpointed is recorded in TS$ as the
- tablespace-clean-stop SCN. It allows such a "clean-stopped"
- tablespace to survive (i.e. not need to be dropped after) a
- RESETLOGS open (see 8.6). During media recovery, prior to
- resetlogs, the "clean-stopped" tablespace would be set offline.
- After resetlogs, the tablespace - which needs no recovery - is
- permitted to be brought online and/or set read-write. (An
- immediate backup of the tablespace is recommended).
- The tablespace-clean-stop SCN is set to zero (after being set
- momentarily to "infinity" during datafile state transition) when
- bringing an offline-clean tablespace online, or setting a read-only
- tablespace read-write. The tablespace-clean-stop SCN is also
- zeroed when taking a tablespace offline "immediate" or
- "temporary."
- A tablespace that has a non-zero tablespace-clean-stop SCN in TS$
- is clean at that SCN: the tablespace currently contains all redo up
- through that SCN, and no redo for the tablespace beyond that SCN
- exists. If the tablespace's datafiles are still in the state they had
- when the tablespace was taken offline "normal" or set read-only -
- i.e. they are not restored backups, are not fuzzy, and are
- checkpointed at the clean-stop SCN - then the tablespace can be
- brought online without recovery. Note that the semantics of the
- tablespace-clean-stop SCN differ from those of a constituent
- datafile's stop SCN in the datafile's controlfile record. The
- controlfile stop SCN designates an SCN beyond which no redo for
- the datafile exists. This does not imply that the datafile currently
- contains all redo up through that SCN.
- The tablespace-clean-stop SCN is stored in TS$ rather than in the
- controlfile so that it is covered by redo and will finish in the correct
- state - i.e. reflecting the correct online/offline state of the
- tablespace - following an incomplete recovery (see 6.12). Its
- value will not be lost if a backup controlfile is restored, or if a new
- controlfile is created. Furthermore, the presence of the tablespace-
- clean-stop SCN in TS$ allows an offline normal (or read-only)
- tablespace to survive (not need to be dropped after) a
- RESETLOGS open, since it is known that no redo application is
- needed to bring it online (see 8.6 for more detail). Thus, for
- example, an offline normal (or read-only) tablespace that was
- offline during an incomplete recovery can be brought online (or set
- read-write) subsequent to a RESETLOGS open. Without the
- tablespace-clean-stop SCN, there would be no way of knowing that
- the tablespace does not need recovery using redo that was
- discarded by the resetlogs. The only alternative would have been to
- force the tablespace to be dropped.
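- The conditions under which a clean-stopped tablespace can come
- online without recovery (as described above) can be sketched as
- follows; this is a hypothetical illustration, not the actual
- dictionary check.

```python
def can_online_without_recovery(clean_stop_scn, datafiles):
    # datafiles: one (checkpoint_scn, is_fuzzy, is_restored_backup)
    # tuple per constituent datafile.
    if clean_stop_scn == 0:  # zero means "not clean-stopped"
        return False
    # The files must not be restored backups, must not be fuzzy, and
    # must be checkpointed at exactly the clean-stop SCN.
    return all(ckpt_scn == clean_stop_scn and not fuzzy and not restored
               for ckpt_scn, fuzzy, restored in datafiles)
```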
- 2.18 Datafile Offline Range
- The offline-start SCN and offline-end checkpoint fields of the
- controlfile datafile record describe the offline range. If valid, they
- delimit a log range guaranteed not to contain any redo for the
- datafile. Thus, media recovery can skip this log range when
- recovering the datafile, obviating the need to access old archived
- log data (which may be unavailable or unusable due to resetlogs: see
- Section 7). This optimization aids in recovering a datafile that is
- presently online (or read-write), but that was offline-clean (or read-
- only) for a long time, and whose last backup dates from that time.
- For example, this would be the case if, after a RESETLOGS open,
- an offline normal (or read-only) tablespace had been brought online
- (or set read-write), but not yet backed up.
- When a datafile transitions from offline-clean to online (or from
- read-only to read-write), the offline range is set as follows: The
- offline-start SCN is set from the tablespace-clean-stop SCN saved
- when setting the file offline (or read-only). The offline-end
- checkpoint is set from the file checkpoint taken when setting the
- file online (or read-write).
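- Assuming each log's SCN range is known (e.g. its low SCN and next
- SCN), the skip test might be sketched as follows; the names are
- illustrative only.

```python
def log_may_be_skipped(log_low_scn, log_next_scn,
                       offline_start_scn, offline_end_scn):
    # A log whose entire SCN range lies inside the offline range can
    # hold no redo for the datafile, so media recovery may skip it.
    return (offline_start_scn <= log_low_scn
            and log_next_scn <= offline_end_scn)
```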
- 3 Redo Generation
- Redo is generated to describe all changes made to database blocks.
- This section describes the various operations that occur while the
- database is open and generating redo.
- 3.1 Atomic Changes
- The most fundamental operation is to atomically change a set of
- datablocks. A foreground process intending to change one or more
- datablocks first acquires exclusive access to cache buffers
- containing those blocks. It then constructs the change vectors
- describing the changes. Space is allocated in the redo log buffer to
- hold the redo record. The redo log buffer - the buffer from which
- LGWR writes the redo log - is located in the SGA (System
- Global Area). It may be necessary to ask LGWR to write the buffer
- to the redo log in order to make space. If the log is full, LGWR
- may need to do a log switch in order to make the space available.
- Note that allocating space in the redo buffer also allocates space in
- the logfile. Thus, even though the redo buffer has been written, it
- may not be possible to allocate redo log space. After the space is
- allocated, the foreground process builds the redo record in the redo
- buffer. Only after the redo record has been built in the redo buffer
- may the datablock buffers be changed. Writing the redo to disk is
- the real change to the database. Recovery ensures that all changes
- that make it into the redo log make it into the datablocks (except in
- the case of incomplete recovery).
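- The ordering described above - the redo record is built in the log
- buffer before the datablock buffers are changed - can be sketched as
- a toy simulation (all names are illustrative; the real mechanism
- involves LGWR, latching, and possible log switches):

```python
class RedoLogBuffer:
    """Toy stand-in for the redo log buffer in the SGA."""
    def __init__(self):
        self.records = []

    def allocate_and_build(self, change_vectors):
        # In the real kernel this may have to wait for LGWR to free
        # space, or even force a log switch; here it simply appends.
        self.records.append(list(change_vectors))

def atomic_change(log_buffer, cache, change_vectors):
    # Order matters: the redo record is built in the log buffer FIRST;
    # only then are the cached datablocks actually modified.
    log_buffer.allocate_and_build(change_vectors)
    for block_id, field, value in change_vectors:
        cache[block_id][field] = value
```

- For example, one redo record covering changes to two blocks keeps
- the pair of changes atomic with respect to recovery.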
- 3.2 Write-Ahead Log
- Write-ahead log is a cache-enforced protocol governing the order
- in which dirty datablock buffers are written vs. when the redo log
- buffer is written. According to write-ahead log protocol, before
- DBWR can write out a cache buffer containing a modified
- datablock, LGWR must write out the redo log buffer containing
- redo records describing changes to that datablock.
- Note that write-ahead log is independent of log-force-at-commit
- (see 3.3).
- Note also that write-ahead log protocol only applies to datafile
- writes that originate from the buffer cache. In particular, write-
- ahead log does not apply to so-called direct path writes (e.g.
- originating from direct path load, table create via subquery, or
- index create). Direct path writes (targeted above the segment high-
- water mark) originate not as writes out of the buffer cache, but as
- bulk-writes out of the foreground process' data space. Indeed,
- correct handling of direct path writes by media recovery dictates a
- write-behind-log protocol. (The basic reason is that, because the
- bulk-writes do not go through the buffer cache, there is no
- mechanism to guarantee their completion at checkpoint).
- One guarantee made by write-ahead log protocol is that there are
- no changes in the datafiles that are not in the redo log, regardless of
- intervening failure. This is what enables recovery to preserve the
- guarantee of redo record atomicity despite intervening failure.
- Another guarantee made by write-ahead log protocol is that no
- datablock change can be written to disk without first writing to the
- redo log sufficient information to enable the change to be undone
- should the transaction fail to commit. That undo-enabling
- information is written to the redo log in the form of "redo" for the
- rollback segment.
- Write-ahead log protocol plays a key role in enabling the
- transaction layer to preserve the guarantee of transaction atomicity
- despite intervening failure.
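- The write-ahead-log test can be sketched as follows, assuming redo
- positions are comparable RBAs (illustrative names):

```python
def dbwr_may_write(buffer_last_change_rba, lgwr_flushed_rba):
    # Write-ahead log: DBWR may write a dirty block to its datafile
    # only after LGWR has flushed the redo describing the block's
    # latest change.
    return lgwr_flushed_rba >= buffer_last_change_rba
```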
- 3.3 Transaction Commit
- Transaction commit allocates an SCN and builds a commit redo
- record containing that SCN. The commit is complete when all of
- the transaction's redo (including commit redo record) is on disk in
- the log. Thus, commit forces the redo log to disk - at least up to
- and including the transaction's commit record. This is termed log-
- force-at-commit.
- Recovery is designed such that it is sufficient to write only the redo
- log at commit time - rather than all datablocks changed by the
- transaction - in order to guarantee transaction durability despite
- intervening failure. This is termed no-datablock-force-at-commit.
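- A sketch of the log-force-at-commit rule (illustrative names):

```python
def commit_is_durable(commit_record_rba, lgwr_flushed_rba):
    # Log-force-at-commit: the transaction is committed once the log is
    # on disk at least through its commit record; no datablock writes
    # are required (no-datablock-force-at-commit).
    return lgwr_flushed_rba >= commit_record_rba
```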
- 3.4 Thread Checkpoint
- A thread checkpoint event, executed by the instance associated
- with the redo thread being checkpointed, forces to disk all dirty
- buffers in that instance that contain changes to any online datafile
- before a designated SCN - the thread checkpoint SCN. Once all
- redo in the thread prior to the checkpoint SCN has been written to
- disk, the thread checkpoint structure in the thread's controlfile
- record is updated in a controlfile transaction.
- When a thread checkpoint begins, an SCN is captured and a
- checkpoint structure is initialized. Then all the dirty buffers in the
- instance's cache are marked for checkpointing. DBWR proceeds to
- write out the marked buffers in a staged manner. Once all the
- marked buffers have been written, the SCN in the checkpoint
- structure is set to the captured SCN, and the thread checkpoint
- structure in the thread's controlfile record is updated in a controlfile
- transaction.
- A thread checkpoint might or might not advance the database
- checkpoint. If only one thread is open, the new thread checkpoint
- becomes the new database checkpoint. If multiple threads are open,
- the database checkpoint advances only if the local thread's old
- checkpoint was the database checkpoint (i.e. the oldest open thread
- checkpoint). Since the new checkpoint SCN was allocated recently,
- it is most likely greater than the thread checkpoint SCN of some
- other open thread; in that case the database checkpoint advances
- only as far as the new lowest-SCN open thread checkpoint. If the
- old checkpoint SCN for the local thread was higher than the current
- checkpoint SCN of some other open thread, then the database
- checkpoint does not change.
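- The advance rule can be sketched as a function over the open
- threads' checkpoint SCNs (a hypothetical illustration):

```python
def database_checkpoint_after(thread_ckpt_scns, local_thread, new_local_scn):
    # The database checkpoint is the lowest checkpoint SCN among all
    # open threads. Advancing the local thread's checkpoint therefore
    # moves the database checkpoint only if the local thread held the
    # oldest checkpoint, and then only as far as the next-lowest SCN.
    scns = dict(thread_ckpt_scns)
    scns[local_thread] = new_local_scn
    return min(scns.values())
```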
- If the database checkpoint is advanced, then the checkpoint counter
- is advanced in every online datafile header. Furthermore, for each
- online datafile that is not in hot backup (see Section 4), and not
- already checkpointed at a higher SCN (e.g. as would be the case for
- a recently added or recovered file), the datafile header checkpoint is
- advanced to the new database checkpoint, and the file header is
- written to disk. Also, the checkpoint SCN in the datafile's
- controlfile record is advanced to the new database checkpoint SCN.
- 3.5 Online-Fuzzy Bit
- Note that more changes - beyond those already in the marked
- buffers - may be generated after the start of checkpoint. Such
- changes would be generated at SCNs higher than the SCN that will
- be recorded in the file header. They could either be changes to
- marked buffers that were added since checkpoint start, or else
- changes to unmarked buffers. Buffers containing these changes
- could be written out for a variety of reasons. Thus, the online files are
- online-fuzzy; that is, they generally contain changes in the future of
- (i.e. generated at higher SCNs than) their header checkpoint SCN.
- A datafile is virtually always online-fuzzy while it is online and the
- database is open.
- Online-fuzzy state is indicated by setting the so-called online-fuzzy
- bit in the datafile header. The online-fuzzy bits of all online
- datafiles are set at database open time. Also, when a datafile is
- brought online while the database is open, its online-fuzzy bit is
- set.
- The online-fuzzy bits are cleared after the last instance does a
- shutdown "normal" or "immediate." Other occasions for clearing
- the online-fuzzy bits are: (i) the finish of crash recovery; (ii) when
- media recovery "checkpoints" (flushes its buffers) after
- encountering an end-crash-recovery redo record (see 5.5); (iii)
- when taking a datafile offline "temporary" or "normal" (i.e. an
- offline operation that is preceded by a file checkpoint); (iv) when
- BEGIN BACKUP is issued (see 4.1).
- As will be seen in 8.1, open with resetlogs will fail if any online
- datafile has the online-fuzzy bit (or any fuzzy bit) set.
- 3.6 Datafile Checkpoint
- A datafile checkpoint event, executed by all open instances (for all
- open threads), forces to disk all dirty buffers in any instance that
- contain changes to a particular datafile (or set of datafiles) before a
- designated SCN - the datafile checkpoint SCN. Once all datafile-
- related redo from all open threads prior to the checkpoint SCN has
- been written to disk, the datafile checkpoint structure in the file
- header is updated and written to disk.
- Datafile checkpoints occur as part of operations such as beginning
- hot backup (see Section 4) and offlining datafiles as part of taking a
- tablespace offline normal (see 2.17).
- 3.7 Log Switch
- When an instance needs to generate more redo but cannot allocate
- enough blocks in the current log, it does a log switch. The first step
- in a log switch is to find an online log that is a candidate for reuse.
- The first requirement for the candidate log is that it must not be
- active: i.e. it must not be needed for crash/instance recovery. In
- other words, it must be overwritable without losing redo data
- needed for instance recovery. The principle enforced is that a
- logfile cannot be reused until the current thread checkpoint is
- beyond that logfile. Since instance recovery starts at the current
- thread checkpoint SCN/RBA (and expects to find that RBA in an
- online redo log), the ability to do instance recovery using only
- online logs translates into the requirement that the current thread
- checkpoint SCN be beyond the highest SCN associated with redo
- in the candidate log. If this is not the case, then the thread
- checkpoint currently in progress - e.g. the one started when the
- candidate log was originally switched into (see below) - is
- hurried to completion.
- The other requirement for the candidate log is that it does not need
- archiving. Of course, this requirement only applies to a database
- running in ARCHIVELOG mode. If archiving is required, the
- archiver is posted.
- As soon as the log switch completes, a new thread checkpoint is
- started in the new log. Hopefully, the checkpoint will complete
- before the next log switch is needed.
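- The candidate-reuse test can be sketched as follows (illustrative
- names):

```python
def can_reuse_log(log_highest_scn, thread_ckpt_scn,
                  archived, archivelog_mode):
    # A candidate log is reusable only when (a) the current thread
    # checkpoint is beyond all redo in it, so instance recovery will
    # never need it, and (b) in ARCHIVELOG mode, it has been archived.
    if archivelog_mode and not archived:
        return False
    return thread_ckpt_scn > log_highest_scn
```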
- 3.8 Archiving Log Switches
- Each thread switches logs independently. Thus, when running
- Parallel Server, an SCN is almost never at the beginning of a log in
- all threads. However, it is desirable to have roughly the same range
- of SCNs in the archived logs of all enabled threads. This ensures
- that the last log archived in each thread is reasonably current. If an
- unarchived log for an enabled thread contained a very old SCN (as
- would occur in the case of a relatively idle instance), it would not
- be possible to use archived logs from a primary site to do recovery
- to a higher SCN at a standby site. This would be true even if the log
- with the low SCN contained no redo.
- This problem is solved by forcing log switches in other threads
- when their current log is significantly behind the log just archived.
- For the case of an open thread, a lock is used to "kick" the laggard
- instance into switching logs and archiving when it can. For the case
- of a closed thread, the archiving process in the active instance does
- the closed thread's log switch and archiving for it. Note that this
- can result in a thread that is enabled but never used having a bunch
- of archived logs with only a file header. A force archiving SCN is
- maintained in the database info controlfile record to implement this
- feature. The system strives to archive any log that contains that
- SCN or less. In general, the log with the lowest SCN is archived
- first.
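- A sketch of the force-archive test (illustrative names):

```python
def should_force_archive(log_lowest_scn, force_archiving_scn):
    # The system strives to archive any log containing the force
    # archiving SCN or an earlier one, keeping the archived logs of
    # idle threads reasonably current.
    return log_lowest_scn <= force_archiving_scn
```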
- The command ALTER SYSTEM ARCHIVE LOG CURRENT can
- be used to manually archive the current logs of all enabled threads.
- It forces all threads, open and closed, to switch to a new log. It
- archives what is necessary to ensure all the old logs are archived. It
- does not return until all redo generated before the command was
- entered is archived. This command is useful for ensuring all redo
- logs necessary for the recovery of a hot backup are archived. It is
- also useful for ensuring the potential currency of a standby site in a
- configuration in which archived logs from a primary site are
- shipped to a standby site for application by recovery in case of
- disaster (i.e. "standby database").
- 3.9 Thread Open
- When an instance opens the database, it needs to open a thread for
- redo generation. The thread is chosen at mount time. A system
- initialization parameter can be used to specify the thread to mount
- by number. Otherwise, any available publicly-enabled thread can
- be chosen by the instance at mount time. A thread-mounted lock is
- used to prevent two instances from mounting the same thread.
- When an instance opens a thread, it sets the thread-open flag in the
- thread's controlfile record. While the instance is alive, it holds a set
- of thread-opened locks (one held by each of LGWR, DBWR,
- LCK0, LCK1, ...). (These are released at instance death, enabling
- one instance to detect the death of another in the Parallel Server
- environment: see 5.1). Also at thread open time, a new checkpoint
- is captured and used for the thread checkpoint. If this is the first
- database open, this becomes the new database checkpoint, ensuring
- all online files have their header checkpoints advanced at open
- time. Note that a log switch may be forced at thread open time.
- 3.10 Thread Close
- When an instance closes the database, or when a thread is
- recovered by instance/crash recovery, the thread is closed. The first
- step in closing a thread is to ensure that no more redo is generated
- in it. The next step is to ensure that all changes described by
- existing redo records are in the online datafiles on disk. In the case
- of normal database close, this is accomplished by doing a thread
- checkpoint. The SCN from this final thread checkpoint is said to be
- the "SCN at which the thread was closed." Finally, the thread's
- controlfile record is updated to clear the thread-open flag.
- In the case of thread close by instance recovery, the presence in the
- online datafiles of all changes described by thread redo records is
- ensured by starting redo application at the most recent thread
- checkpoint and continuing through end-of-thread. Once all changes
- described by thread redo records are in the online datafiles, the
- thread checkpoint is advanced to the end-of-thread. Just as in the
- case of a normal thread checkpoint, this checkpoint may advance
- the database checkpoint. If this is the last thread close, the database
- checkpoint thread field in the database info controlfile record -
- which normally points to an open thread - will be left pointing at
- this thread, even though it is closed.
- 3.11 Thread Enable
- In order for a thread to be opened, it must be enabled. This ensures
- that its redo will be found during media recovery. A thread may be
- enabled in either public or private mode. A private thread can only
- be mounted by an instance that specifies it in the THREAD system
- initialization parameter. This is analogous to rollback segments. A
- thread must have at least two online redo log groups while it is
- enabled. An enabled thread always has one online log that is its
- current log. The next SCN of the current log is infinite, so that any
- new SCN allocated will be within the current log. A special thread-
- enable redo record is written in the thread of an instance enabling a
- new thread (i.e. via ALTER DATABASE ENABLE THREAD).
- The thread-enable redo record is used by media recovery to start
- applying redo from the new thread. Note that this means it takes an
- open thread to enable another thread. This chicken and egg
- problem is resolved by having thread one automatically enabled
- publicly at database creation. This also means that databases that
- do not run in Parallel Server mode do not need to enable a thread.
- 3.12 Thread Disable
- If a thread is not going to be used for a long while, it is best to
- disable it. This means that media recovery will not expect any redo
- to be found in the thread. Once a thread is disabled, its logs may be
- dropped. A thread must be closed before it can be disabled. This
- ensures all its changes have been written to the datafiles. A new
- SCN is allocated to save as the next SCN for the current log. The
- log header is marked with this SCN and flags saying it is the end of
- a disabled thread. It is important that a new current SCN is
- allocated. This ensures the SCN in any checkpoint with this thread
- enabled will appear in one of the logs from the thread. Note that
- this means a thread must be open in order to disable another thread.
- Thus, it is not possible to disable all threads.
- 4 Hot Backup
- A hot backup is a copy of a datafile that is taken while the file is in
- active use. Datafile writes (by DBWR) go on as usual during the
- time the backup is being copied. Thus, the backup gets a "fuzzy"
- copy of the datafile:
- * Some blocks may be ahead in time versus other blocks of the
- copy.
- * Some blocks of the copy may be ahead of the checkpoint SCN
- in the file header of the copy.
- * Some blocks may contain updates that constitute breakage of
- the redo record atomicity guarantee with respect to other
- blocks in this or other datafiles.
- * Some block copies may be "fractured" (due to front and back
- halves being copied at different times, with an intervening
- update to the block on disk).
- The "hotbackup-fuzzy" copy is unusable without "focusing" (via
- the redo log) that occurs when the backup is restored and
- undergoes media recovery. Media recovery applies redo (from all
- threads) from the begin-backup checkpoint SCN (see Step 2. in
- Section 4.1) through the end-point of the recovery operation (either
- complete or incomplete). The result is a transaction-consistent
- "focused" version of the datafile.
- There are three steps to taking a hot backup:
- * Execute the ALTER TABLESPACE ... BEGIN BACKUP
- command.
- * Use an operating system copy utility to copy the constituent
- datafiles of the tablespace(s).
- * Execute the ALTER TABLESPACE ... END BACKUP
- command.
- 4.1 BEGIN BACKUP
- The BEGIN BACKUP command takes the following actions (not
- necessarily in the listed order) for each datafile of the tablespace:
- 1. It sets a flag in the datafile header - the hotbackup-fuzzy bit
- - to indicate that the file is in hot backup. The header with
- this flag set (copied by the copy utility) enables the copy to be
- recognized as a hot backup. A further purpose of this flag in
- the online file header is to cause the checkpoint in the file
- header to be "frozen" at the begin-backup checkpoint value
- that will be set in Step 4. This is the value that it must have in
- the backup copy in order to ensure that, when the backup is
- recovered, media recovery will start redo application at a
- sufficiently early checkpoint SCN so as to cover all changes to the
- file in all threads since the execution of BEGIN BACKUP (see
- 6.5). Since we cannot guarantee that the file header will be the
- first block to be written out by the copy utility, it is important
- that the file header checkpoint structure remain "frozen" until
- END BACKUP time. This flag keeps the datafile checkpoint
- structure "frozen" during hot backup, preventing it (and the
- checkpoint SCN in the datafile's controlfile record) from being
- updated during thread checkpoint events that advance the
- database checkpoint. New in v7.2: While the file is in hot
- backup, a new "backup" checkpoint structure in the datafile
- header receives the updates that the "frozen" checkpoint
- would have received.
- 2. It executes a datafile checkpoint, capturing the resultant
- "begin-backup" checkpoint information, including the begin-
- backup checkpoint SCN. When the file is checkpointed, all
- instances are requested to write out all dirty buffers they have
- for the file. If the need for instance recovery is detected at this
- time, the file checkpoint operation waits until it is completed
- before proceeding. Checkpointing the file at begin-backup
- time ensures that only file blocks changed after begin-backup
- time might have been written to disk during the course of the
- file copy. This guarantee is crucial to enabling block before-
- image logging to cope with the fractured block problem, as
- described in Step 3.
- 3. [Platform-dependent option]: It starts block before-image
- logging for the file. During block before-image logging, all
- instances log a full block before-image to the redo log prior to
- the first change to each block of the file (since the backup
- started, or since the block was read anew into the buffer
- cache). This is to forestall a recovery problem that would arise
- if the backup were to contain a fractured block copy
- (mismatched halves). This could happen if (the database block
- size is greater than the operating system block size, and) the
- front and back halves of the block were copied to the backup at
- different times - with an intervening update to the block on
- disk. In this eventuality, recovery can reconstruct the block
- using the logged block before-image.
- 4. It sets the checkpoint in the file header equal to the begin-
- backup checkpoint captured in Step 2. This file header
- checkpoint will be "frozen" until END BACKUP is executed.
- 5. It clears the file's online-fuzzy bit. The online-fuzzy bit
- remains clear during the course of the file copy operation, thus
- ensuring a cleared online-fuzzy bit in the file copy. Note that
- the online-fuzzy bit is set again by the execution of END
- BACKUP.
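- The frozen-checkpoint behavior of Steps 1, 4, and 5 - including the
- v7.2 shadow "backup" checkpoint - can be sketched as a toy
- simulation; the names are ours, not the real header layout.

```python
class DatafileHeader:
    """Toy datafile header holding only the fields discussed above."""
    def __init__(self, checkpoint_scn):
        self.checkpoint_scn = checkpoint_scn  # "frozen" during hot backup
        self.backup_checkpoint_scn = None     # v7.2 shadow checkpoint
        self.hotbackup_fuzzy = False
        self.online_fuzzy = True

def begin_backup(hdr, begin_backup_scn):
    hdr.hotbackup_fuzzy = True             # Step 1: mark file in backup
    hdr.checkpoint_scn = begin_backup_scn  # Step 4: value to be frozen
    hdr.online_fuzzy = False               # Step 5: clear online-fuzzy

def thread_checkpoint_advance(hdr, new_db_ckpt_scn):
    # While the file is in hot backup, the frozen checkpoint is left
    # alone and the shadow "backup" checkpoint receives the update.
    if hdr.hotbackup_fuzzy:
        hdr.backup_checkpoint_scn = new_db_ckpt_scn
    else:
        hdr.checkpoint_scn = max(hdr.checkpoint_scn, new_db_ckpt_scn)
```

- A thread checkpoint that advances the database checkpoint while the
- file is in backup thus leaves the frozen value intact for the copy.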
- 4.2 File Copy
- The file copy is done by utilities that are not part of Oracle. The
- presumption is that the platform vendor will have backup facilities
- that are superior to any portable facility that we could develop. It is
- the responsibility of the administrator to ensure that copies are only
- taken between the BEGIN BACKUP and END BACKUP
- commands, or when the file is not in use.
- 4.3 END BACKUP
- The END BACKUP command takes the following actions for each
- datafile of the tablespace:
- 1. It restores (i.e. sets) the file's online-fuzzy bit.
- 2. It creates an end-backup redo record (end-backup "marker")
- for the datafile. This record, interpreted only by media
- recovery, contains the begin-backup checkpoint SCN (i.e. the SCN
- matching that in the "frozen" checkpoint in the backup's
- header). This record serves to mark the end of the redo
- generated during the backup. The end-backup "marker" is used by
- media recovery to determine when all redo generated between
- BEGIN BACKUP and END BACKUP has been applied to the
- datafile. Upon encountering the end-backup "marker", media
- recovery can (at the next media recovery checkpoint: see
- 6.7.1) clear the hotbackup-fuzzy bit. This is only important in
- preventing an incomplete recovery that might erroneously
- attempt to end before all redo generated between BEGIN
- BACKUP and END BACKUP has been applied. Ending
- incomplete recovery at such a point may result in an
- inconsistent file, since the backup copy may already have
- contained changes beyond this endpoint. As will be seen in 8.1, open
- with resetlogs following incomplete media recovery will fail if
- any online datafile has the hotbackup-fuzzy bit (or any other
- fuzzy bit) set.
- 3. It clears the file's hotbackup-fuzzy bit.
- 4. It stops block before-image logging for the file.
- 5. It advances the file checkpoint to the current database
- checkpoint. This compensates for any file header update(s)
- missed during thread checkpoints that may have advanced the
- database checkpoint while the file was in hot backup state, with
- its checkpoint "frozen".
- 4.4 "Crashed" Hot Backup
- A normal shutdown of the instance that started a backup, or the last
- remaining instance, is not allowed while any files are in hot
- backup. Nor may a file in backup be taken offline normal or
- temporary. This is to ensure an end-backup "marker" is generated
- whenever possible, and to make administrators aware that they
- forgot to issue the END BACKUP command, and that the backup
- copy is unusable.
- When an instance failure or shutdown abort leaves a hot backup
- operation incomplete (i.e. lacking termination via END BACKUP),
- any file that was in backup before the failure has its hotbackup-
- fuzzy bit set and its checkpoint "frozen" at the begin-backup
- checkpoint. Even though the online file's datablocks are actually
- current to the database checkpoint, the file's header makes it look
- like a restored backup that needs media recovery and is current
- only to the begin-backup checkpoint. Crash recovery will fail -
- claiming media recovery is required - if it encounters an online
- file in "crashed" hot backup state. The file does not actually need
- media recovery, however; it needs only an adjustment to its file
- header to take it out of "crashed" hot backup state.
- Media recovery could be used to recover and allow normal open of
- a database that has files left in "crashed" hot backup state. For v7.2
- however, a preferable option - because it requires no archived
- logs - is to use the (new in v7.2) command ALTER DATABASE
- DATAFILE... END BACKUP on the files left in "crashed" hot
- backup state (identifiable using the V$BACKUP fixed-view: see
- 9.6). Following execution of this command, crash recovery will
- suffice to open the database. Note that the ALTER TABLESPACE
- ... END BACKUP format of the command cannot be used when the
- database is not open. This is because the database must be open in
- order to translate (via the data dictionary) tablespace names into
- their constituent datafile names.
- 5 Instance Recovery
- Instance recovery is used to recover from both crash failures and
- Parallel Server instance failures. Instance recovery refers either to
- crash recovery or to Parallel Server instance recovery (where a
- surviving instance recovers when one or more other instances fail).
- The goal of instance recovery is to restore the datablock changes
- that were in the cache of the dead instance and to close the thread
- that was left open. Instance recovery uses only online redo logfiles
- and current online datafiles (not restored backups). It recovers one
- thread at a time, starting at the most recent thread checkpoint and
- continuing until end-of-thread.
- 5.1 Detection of the Need for Instance Recovery
- The kernel performs instance recovery automatically upon
- detecting that an instance died leaving its thread-open flag set in
- the controlfile. Instance recovery is performed automatically on
- two occasions:
- 1. at the first database open after a crash (crash recovery);
- 2. when some but not all instances of a Parallel Server fail.
- In the case of Parallel Server, a surviving instance detects the need
- to perform instance recovery for one or more failed instances by
- the following means:
- 1. A foreground process in a surviving instance detects an
- "invalid block lock" condition when it attempts to bring a
- datablock into the buffer cache. This is an indication that
- another instance died while a block covered by that lock was
- in a potentially "dirty" state in its buffer cache.
- 2. The foreground process sends a notification to its instance's
- SMON process, which begins a search for dead instances.
- 3. The death of another instance is detected if the current
- instance is able to acquire that instance's thread-opened locks
- (see 3.9).
- SMON in the surviving instance obtains a stable list of dead
- instances, together with a list of "invalid" block locks. Note: After
- instance recovery is complete, locks in this list will undergo "lock
- cleanup" (i.e. they will have their "invalid" condition cleared,
- making the underlying blocks accessible again).
- 5.2 Thread-at-a-Time Redo Application
- Instance recovery operates by processing one thread at a time,
- thereby recovering one instance at a time. It applies all redo (from
- the thread checkpoint through the end-of-thread) from each thread
- before starting on the next thread. This algorithm depends on the
- fact that only one instance at a time can have a given block
- modified in its cache. Between changes to the block by different
- instances, the block is written to disk. Thus, a given block (as read
- from disk during instance recovery) can need redo applied from at
- most one thread - the thread containing the most recent
- modification.
- Instance recovery can always be accomplished using the online
- redo logs for the thread being recovered. Crash recovery operates
- on the thread with the lowest checkpoint SCN first. It proceeds to
- recover the threads in the order of increasing thread checkpoint
- SCNs. This ensures that the database checkpoint is advanced by
- each thread recovered.
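- The ordering described above can be sketched in a few lines. This is
- an illustrative model, not Oracle code: threads are recovered in
- increasing thread-checkpoint-SCN order, so the database checkpoint
- (the minimum of the thread checkpoints) advances as each thread is
- recovered.

```python
def crash_recovery_order(thread_checkpoints):
    """thread_checkpoints: {thread_number: checkpoint_scn}.
    Returns thread numbers in the order crash recovery processes them:
    lowest checkpoint SCN first."""
    return sorted(thread_checkpoints, key=lambda t: thread_checkpoints[t])
```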
- 5.3 Current Online Datafiles Only
- The checkpoint counters are used to ensure that the datafiles are the
- current online files rather than restored backups. If a backup copy
- of a datafile is restored, then media recovery is required.
- Media recovery is required for a restored backup even if recovery
- can be accomplished using the online logs. The reason is that crash
- recovery applies all post-thread-checkpoint redo from each thread
- before starting on the next thread. Crash recovery can use this
- thread-at-a-time redo application algorithm because a given
- datablock can need redo application from at most one thread.
- However, starting recovery from a restored backup enables no such
- assumption about the number of threads that have relevant redo.
- Thus, the thread-at-a-time algorithm would not work. Recovering a
- backup requires thread-merged redo application: i.e. application of
- all post-file-checkpoint redo, simultaneously merging redo from all
- threads in SCN order. This thread-merged redo application
- algorithm is the one used by media recovery (see Section 6).
- Crash recovery would not suffice - even with thread-merged redo
- application - to recover a backup datafile, even if it were
- checkpointed at the current database checkpoint. The reason is that
- in all but the database checkpoint thread, crash recovery would
- miss applying redo between the database checkpoint and the
- (higher) thread checkpoint. By contrast, media recovery would
- start redo application at the file checkpoint in all threads.
- Furthermore, crash recovery might fail even if it started redo
- application at the file checkpoint in all threads. The reason is that
- crash recovery assumes that it will need only online logfiles. All
- but the database checkpoint thread might have already archived
- and re-used a needed log.
- If the STARTUP RECOVER command is used (in place of simple
- STARTUP), and crash recovery fails due to datafiles needing
- media recovery (e.g. they are restored backups), then media
- recovery via RECOVER DATABASE (see 6.4.1) is automatically
- executed prior to database open.
- 5.4 Checkpoints
- Instance recovery does not attempt to apply redo that is before the
- checkpoint SCN of a datafile. (The datafile header checkpoint
- SCNs are not used to decide where to start recovery, however.)
- The redo from the thread checkpoint through the end-of-thread
- must be read to find the end-of-thread and the highest SCN
- allocated by the thread. These are then used to close the thread and
- advance the thread checkpoint. The end of an instance recovery
- almost always advances the datafile checkpoints, and always
- advances the checkpoint counters.
- 5.5 Crash Recovery Completion
- At the termination of crash recovery, the "fuzzy bits" - online-
- fuzzy, hotbackup-fuzzy, media-recovery-fuzzy - of all online
- datafiles are cleared. A special redo record, the end-crash-recovery
- "marker," is generated. This record is interpreted by media
- recovery to know when it is permissible to clear the online-fuzzy
- and hotbackup-fuzzy bits of the datafiles undergoing recovery (see
- 6.6).
- 6 Media Recovery
- Media recovery is used to recover from a lost or damaged datafile,
- or from a lost current controlfile. It is used to transform a restored
- datafile backup into a "current" datafile. It is also used to restore
- changes that were lost when a datafile went offline without a
- checkpoint. Media recovery can apply archived logs as well as
- online logs. Unlike instance or crash recovery, media recovery is
- invoked only via explicit command.
- 6.1 When to Do Media Recovery
- As was seen in 5.3, a restored datafile backup always needs media
- recovery, even if its recovery can be accomplished using only
- online logs. The same is true of a datafile that went offline without
- a checkpoint. The database cannot be opened if any of the online
- datafiles needs media recovery. A datafile that needs media
- recovery cannot be brought online until media recovery has been
- executed. If the database is open by any instance, media
- recovery can operate only on offline files. Media recovery may be
- explicitly invoked to recover a database prior to open even when
- crash recovery would have sufficed. If so, crash recovery - though
- it may find nothing to do - will still be invoked automatically at
- database open. Note that media recovery may be run - and, in
- cases such as restored backups or datafiles that went offline
- immediate, must be run - even if recovery can be accomplished
- using only the online logs. Media recovery may find nothing to do
- - and signal the "no recovery required" error - if invoked for
- files that do not need recovery.
- If the current controlfile is lost and a backup controlfile is restored
- in its place, media recovery must be done. This is the case even if
- all of the datafiles are current.
- 6.2 Thread-Merged Redo Application
- Media recovery uses a thread-merged redo application algorithm:
- i.e. it applies redo from all threads simultaneously, merging redo
- records in increasing SCN order. The process of media-recovering
- a backup datafile differs from the process of crash-recovering a
- current online datafile in the following fundamental way: Crash
- recovery applies redo from one thread at a time because any block
- of a current online file can need redo from at most one thread (one
- instance at a time can dirty a block in cache). With a restored
- backup, however, no assumption can be made about the number of
- threads that have redo relevant to a particular block. In general,
- recovering a backup requires simultaneous application of redo
- from all threads, with merging of redo records across threads in
- SCN order. Note that this algorithm depends on a redo-generation-
- time guarantee that changes for a given block occur in increasing
- SCN order across threads (case of Parallel Server).
- 6.3 Restoring Backups
- The administrator may copy a backup version of a datafile over the
- current datafile while the database is shut down or the file is offline.
- There is a strong assumption that backups are never copied to files
- that are currently accessible. Every file header read verifies that this
- has not been done by comparing the checkpoint counter in the file
- header with the checkpoint counter in the datafile's controlfile
- record.
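- The checkpoint-counter cross-check can be sketched as below. The
- function and its return strings are illustrative only; the comparison
- of the two counters is the mechanism described above (and the
- greater-than case anticipates the backup-controlfile test of 6.13).

```python
def check_datafile(header_ckpt_count, controlfile_ckpt_count):
    """Compare the checkpoint counter in the file header with the one
    in the datafile's controlfile record."""
    if header_ckpt_count < controlfile_ckpt_count:
        # Header is behind the controlfile: a restored backup.
        return "restored backup: media recovery required"
    if header_ckpt_count > controlfile_ckpt_count:
        # Header is ahead of the controlfile: the controlfile is old.
        return "controlfile is a backup (see 6.13)"
    return "current datafile"
```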
- 6.4 Media Recovery Commands
- There are three media recovery commands:
- * RECOVER DATABASE
- * RECOVER TABLESPACE
- * RECOVER DATAFILE
- The only essential difference in these commands is in how the set
- of files to recover is determined. They all use the same criteria for
- determining if the files can be recovered. There is a lock per
- datafile that is held exclusive by a process doing media recovery on
- a file, and is held shared by an instance that has the database open
- with the file online. Media recovery signals an error if it cannot get
- the lock for a file it is asked to recover. This prevents two recovery
- sessions from recovering the same file, and prevents media
- recovery of a file that is in use.
- 6.4.1 RECOVER DATABASE
- This command does media recovery on all online datafiles that
- need any redo applied. If all instances were cleanly shut down, and
- no backups were restored, this command will signal the "no
- recovery required" error. It will also fail if any instances have the
- database open, since they will have the datafile locks.
- 6.4.2 RECOVER TABLESPACE
- This command does media recovery on all datafiles in the
- tablespaces specified. In order to translate (i.e. via the data
- dictionary) the tablespace names into datafile names, the database
- must be open. This means that the tablespaces and their constituent
- datafiles must be offline in order to do the recovery. An error is
- signalled if none of the tablespace's constituent files needs recovery.
- 6.4.3 RECOVER DATAFILE
- This command specifies the datafiles to be recovered. The database
- may be open; or it may be closed, as long as the media recovery
- locks can be acquired. If the database is open in any instance, then
- datafile recovery can only recover offline files.
- 6.5 Starting Media Recovery
- Media recovery starts by finding the media-recovery-start SCN: i.e.
- the lowest SCN of the datafile header checkpoints of the files being
- recovered. Note: An exception occurs if a file's checkpoint is in its
- offline range (see 2.18). In that case, the file's offline-end
- checkpoint is used in place of its datafile header checkpoint in
- computing the media-recovery-start SCN.
- A buffer for reading redo is allocated for each thread in the enabled
- thread bitvec of the media-recovery-start checkpoint (i.e. the
- datafile checkpoint with the lowest SCN). The initial file header
- checkpoint SCN of every file is saved to ensure that no redo from a
- previous use of the file number is applied, as well as to eliminate
- needlessly attempting to apply redo to a file from before its
- checkpoint. The stop SCNs (from the datafiles' controlfile records)
- are also saved. If finite, the highest stop SCN can be used to allow
- recovery to terminate without needlessly searching for redo beyond
- that SCN to apply (see 6.10). At recovery completion, any datafile
- initially found to have a finite stop SCN will be left checkpointed at
- that stop SCN (rather than at the recovery end-point). This allows
- an offline-clean or read-only datafile to be left checkpointed at an
- SCN that matches the tablespace-clean-stop-SCN of its tablespace.
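- The computation of the media-recovery-start SCN, including the
- offline-range exception, can be sketched as follows. The dict-based
- file representation is an assumption made for illustration.

```python
def media_recovery_start_scn(files):
    """files: list of dicts with 'checkpoint_scn' and, optionally,
    'offline_range' = (offline_start_scn, offline_end_scn).
    Returns the lowest effective checkpoint SCN of the files being
    recovered."""
    def effective_scn(f):
        rng = f.get("offline_range")
        if rng and rng[0] <= f["checkpoint_scn"] <= rng[1]:
            # Checkpoint falls inside the file's offline range (2.18):
            # use the offline-end checkpoint instead.
            return rng[1]
        return f["checkpoint_scn"]
    return min(effective_scn(f) for f in files)
```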
- 6.6 Applying Redo, Media Recovery Checkpoints
- A log is opened for each thread of redo that was enabled at the time
- the media-recovery-start SCN was allocated (i.e. for each thread in
- the enabled thread bitvec of the media-recovery-start checkpoint).
- If the log is online, then it is automatically opened. If the log was
- archived, then the user is prompted to enter the name of the log
- (unless automatic recovery is being used). The redo is applied from
- all the threads in the order it was generated, switching threads as
- needed. The order of application of redo records without an SCN is
- not precise, but it is good enough for rollback to make the database
- consistent.
- Except in the case of cancel-based incomplete recovery (see
- 6.12.1) and backup controlfile recovery (see 6.13), the next online
- log in sequence is accessed automatically, if it is on disk. If not, the
- user is prompted for the next log.
- At log boundaries, media recovery executes a "checkpoint." As
- part of a media recovery checkpoint, the dirty recovery buffers are
- written to disk and the datafile header checkpoints of the files
- undergoing recovery are advanced, so that the redo does not need
- to be reapplied. Another type of media recovery "checkpoint"
- occurs when a datafile initially found to have a finite stop SCN
- reaches that stop SCN. At such a stop SCN boundary, all dirty
- recovery buffers are written to disk, and the datafiles that have been
- made current have their datafile header checkpoints advanced to
- their stop SCN values.
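- The log-boundary "checkpoint" above can be sketched as follows.
- Buffer and header shapes are hypothetical; the essential order is
- that dirty buffers reach disk before the header checkpoints advance,
- so already-applied redo need not be reapplied after an interruption.

```python
def media_recovery_checkpoint(dirty_buffers, file_headers, checkpoint_scn):
    """Write out dirty recovery buffers, then advance the header
    checkpoints of the files undergoing recovery to checkpoint_scn."""
    for buf in dirty_buffers:
        buf["on_disk"] = True          # write dirty buffer to disk
    dirty_buffers.clear()
    for hdr in file_headers:
        hdr["checkpoint_scn"] = max(hdr["checkpoint_scn"], checkpoint_scn)
```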
- 6.7 Media Recovery and Fuzzy Bits
- 6.7.1 Media-Recovery-Fuzzy
- The media-recovery-fuzzy bit is a flag in the datafile header that is
- used to indicate that - due to ongoing redo application by media
- recovery - the file may contain changes in the future of (at SCNs
- beyond) the current header checkpoint SCN. The media-recovery-
- fuzzy bit is set at the start of media recovery for each file
- undergoing recovery. Generally the media-recovery-fuzzy bits can
- be cleared when a media recovery checkpoint advances the
- checkpoints in the datafile headers. They are left clear when a
- media recovery session completes successfully or is cancelled. As
- will be seen in 8.1, open with resetlogs following incomplete
- media recovery will fail if any online datafile has the media-
- recovery-fuzzy bit (or any fuzzy bit) set.
- 6.7.2 Online-Fuzzy
- Upon encountering an end-crash-recovery "marker" (or a file-
- specific offline-immediate "marker": generated when a datafile
- goes offline without a checkpoint), media recovery can (at the next
- media recovery checkpoint) clear (if set) the online-fuzzy and
- hotbackup-fuzzy bits in the appropriate datafile header(s).
- 6.7.3 Hotbackup-Fuzzy
- Upon encountering an end-backup "marker" (or an end-crash-
- recovery "marker"), media recovery can (at the next media
- recovery checkpoint) clear the hotbackup-fuzzy bit. Open with
- resetlogs following incomplete media recovery will fail if any
- online datafile has the hotbackup-fuzzy bit (or any fuzzy bit) set.
- This prevents a successful RESETLOGS open following an
- incomplete recovery that terminated before all redo generated
- between BEGIN BACKUP and END BACKUP had been applied.
- Ending incomplete recovery at such a point would generally result
- in an inconsistent file, since the backup copy may already have
- contained changes between this endpoint and the END BACKUP.
- 6.8 Thread Enables
- A special thread-enable redo record is written in the thread of an
- instance enabling a new thread. If media recovery encounters a
- thread-enable redo record, it allocates a new redo buffer, opens the
- appropriate log in the new thread, and prepares to start applying
- redo from the new thread.
- 6.9 Thread Disables
- When a thread is disabled, its current log is marked as the end of a
- disabled thread. After media recovery finishes applying redo from
- such a log, it deallocates the thread's redo buffer and stops looking
- for redo from the thread.
- 6.10 Ending Media Recovery (Case of Complete Media Recovery)
- The current (i.e. last) log in every enabled thread has the end-of-
- thread flag set in its header. Complete (as opposed to incomplete:
- see 6.12) media recovery always continues redo application
- through the end-of-thread in all threads. The end-of-thread log can
- be identified without having the current controlfile, since the end-
- of-thread flag is in the log header rather than in the logfile's
- controlfile record.
- Note: Backing up and later restoring copies of current online logs
- is dangerous, and can lead to mis-identification of the current true
- end-of-thread. This is because the end-of-thread flag in the backup
- copy will in general be out-of-date with respect to the current end-
- of-thread log.
- If the datafiles being recovered have finite stop SCNs in their
- controlfile records (assuming a current controlfile), then media
- recovery can stop prior to the end-of-threads. Redo application for
- a datafile with a finite stop SCN can terminate at that SCN, since it
- is guaranteed that no redo for that datafile beyond that SCN was
- generated.
- As described in 2.15, the stop SCN is set when a datafile goes
- offline. Note that without the optimization that allows recovery of a
- file with a finite stop SCN to terminate at that SCN, it could not be
- guaranteed that recovery of an offline datafile while the database is
- open would terminate.
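- The stop-SCN optimization reduces to a simple per-file termination
- test, sketched below under assumed names: recovery of a file with a
- finite stop SCN may end once redo through that SCN has been applied;
- a file with an infinite stop SCN must be recovered to end-of-thread.

```python
INFINITE = float("inf")

def file_recovery_done(applied_through_scn, stop_scn):
    """True once redo application for this file may stop: its stop SCN
    is finite (set when the file went offline) and has been reached."""
    return stop_scn != INFINITE and applied_through_scn >= stop_scn
```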
- 6.11 Automatic Recovery
- Automatic recovery is invoked by using the AUTOMATIC option
- of the media recovery command. It saves the user the trouble of
- entering the names of archived logfiles, provided they are on disk.
- If the sequence number of the log can be determined, then a name
- can be constructed by concatenating the current values of the
- initialization parameters LOG_ARCHIVE_DEST and
- LOG_ARCHIVE_FORMAT. The current LOG_ARCHIVE_DEST
- is assumed, unless the user overrides it by specifying a different
- archiving destination for the recovery session. The media-
- recovery-start checkpoint (see 6.5) contains (in the RBA field) the
- initial log sequence number for one thread (i.e. the thread that
- generated the checkpoint). If multiple threads of redo are enabled,
- the log history section of the controlfile (if configured) can be used
- to map the media-recovery-start SCN to a log sequence number for
- each thread. Once the initial recovery log is found for a thread, all
- subsequent logs needed from the thread follow in order. If it is not
- possible to determine the initial log sequence number, the user will
- have to guess and try logs until the right one is accepted. The
- timestamp from the media-recovery-start checkpoint is reported to
- aid in this effort.
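- The name construction described above can be sketched as follows,
- assuming the %t (thread) and %s (sequence) substitutions of
- LOG_ARCHIVE_FORMAT; the paths shown in the usage are examples only.

```python
def archived_log_name(log_archive_dest, log_archive_format,
                      thread, sequence):
    """Build an archived log filename by concatenating
    LOG_ARCHIVE_DEST with LOG_ARCHIVE_FORMAT, substituting the
    thread number for %t and the log sequence number for %s."""
    return log_archive_dest + (log_archive_format
                               .replace("%t", str(thread))
                               .replace("%s", str(sequence)))
```

- For example, with LOG_ARCHIVE_DEST = "/arch/" and
- LOG_ARCHIVE_FORMAT = "log_%t_%s.arc", thread 1 sequence 42 maps to
- "/arch/log_1_42.arc".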
- 6.12 Incomplete Recovery
- A RECOVER DATABASE execution can be stopped and the
- database opened before all the redo has been applied. This type of
- recovery is termed incomplete recovery. The subsequent database
- open is termed a RESETLOGS open.
- Incomplete recovery effectively sets the entire database backwards
- in time to a transaction-consistent state at or near the recovery end-
- point. All subsequent updates to the database are lost and must be
- re-entered.
- Use of incomplete recovery is indicated in the following
- circumstances:
- * Media recovery is necessary (e.g. due to datafile damage or
- loss), but cannot be complete (i.e. all redo cannot be applied)
- because all copies of a needed online or archived redo log
- were lost.
- * All copies of an active (i.e. needed for instance recovery) log
- were damaged or lost while the database was open. Since
- crash recovery is precluded, this case reduces to the previous
- case.
- * It is necessary to reverse the effect of an erroneous user action
- (e.g. table drop or batch run); and it is acceptable to set the
- entire database - not just the affected schema objects -
- backwards to a point-in-time before the error.
- 6.12.1 Incomplete Recovery UNTIL Options
- There are three types of incomplete recovery. They differ in the
- means used to stop the recovery:
- * Cancel-Based (RECOVER DATABASE UNTIL CANCEL)
- * Change-Based (RECOVER DATABASE UNTIL CHANGE)
- * Time-Based (RECOVER DATABASE UNTIL TIME)
- The UNTIL CANCEL option terminates recovery when the user
- enters "cancel" rather than the name of a log. Online logs are not
- automatically applied in this mode in case cancellation at the next
- log is desired. If multiple threads of redo are being recovered, there
- may be logs in other threads that are partially applied when the
- recovery is cancelled.
- The UNTIL CHANGE option terminates redo application just
- before any redo associated with the specified SCN or higher. Thus
- the transaction that committed at that SCN will be rolled back. If
- you want to recover through a transaction that committed at a
- specific SCN, then add one to the specified SCN.
- The UNTIL TIME option works similarly to the UNTIL CHANGE
- option, except that a time rather than an SCN is specified.
- Recovery uses the timestamps in the redo block headers to convert
- the specified time into an SCN. Then recovery is stopped when that
- SCN is reached.
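- The UNTIL CHANGE stopping rule can be sketched as below (record
- shape assumed): application stops just before the first record whose
- SCN is at or beyond the specified change number, which is why a
- transaction committing exactly at that SCN is rolled back.

```python
def apply_until_change(redo_records, until_scn):
    """redo_records: (scn, change) pairs in increasing SCN order.
    Returns the records actually applied: everything strictly before
    until_scn."""
    applied = []
    for scn, change in redo_records:
        if scn >= until_scn:
            break          # stop before any redo at or past until_scn
        applied.append((scn, change))
    return applied
```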
- 6.12.2 Incomplete Recovery and Consistency
- In order to avoid database corruption when running incomplete
- recovery, all datafiles must be recovered to the exact same point.
- Furthermore, no datafile may have any changes in the future of this
- point. This requires that incomplete media recovery must start from
- datafiles restored from backups whose copies completed prior to
- the intended stop time. The system uses file header fuzzy bits (see
- 8.1) to ensure that the datafiles contain no changes in the future of
- the stop time.
- 6.12.3 Incomplete Recovery and Datafiles Known to the Controlfile
- If recovering to a time before a datafile was dropped, the dropped
- file must appear in the controlfile used for recovery. Otherwise it
- would not be recovered. One alternative for achieving this is to
- recover using a backup controlfile made before the datafile was
- dropped. Another alternative is to use the CREATE
- CONTROLFILE command to construct a controlfile that lists the
- dropped datafile.
- Recovering to a time before a file was added is not a problem. The
- extra datafile will be eliminated from the controlfile after the
- database is open. The unwanted file may be taken offline before the
- recovery to avoid accessing it.
- 6.12.4 Resetlogs Open after Incomplete Recovery
- The next database open after an incomplete recovery must specify
- the RESETLOGS option. Amongst other effects (see Section 7),
- resetlogs throws away the redo that was not applied during the
- incomplete recovery, and marks the database so that the skipped
- redo can never be accidentally applied by a subsequent recovery. If
- the incomplete recovery was a mistake (e.g. the lost log was
- found), the next open can specify the NORESETLOGS option.
- However, for the open with NORESETLOGS to succeed, it must
- be preceded by a successful execution of complete recovery (i.e.
- one in which all redo is applied).
- 6.12.5 Files Offline during Incomplete Recovery
- If a file is offline during incomplete recovery, it will not be
- recovered. This is acceptable if the file is part of a tablespace that was taken
- offline normal, and that is still offline normal at the recovery end-
- point. Otherwise, if the file is still offline when the resetlogs is
- done, the tablespace containing the file will have to be dropped.
- This is because it will need media recovery with logs from before
- the resetlogs. In general V$DATAFILE should be checked to
- ensure that files are online before running an incomplete recovery.
- Only files that will be dropped and files that are part of offline
- normal (or read-only) tablespaces should be offline (Section 8.6).
- 6.13 Backup Controlfile Recovery
- If recovery is done with a controlfile other than the current one,
- then backup controlfile recovery (RECOVER
- DATABASE...USING BACKUP CONTROLFILE) must be used.
- This applies both to the case of a restored controlfile backup, and to
- the case of a "backup" controlfile created via CREATE
- CONTROLFILE...RESETLOGS.
- Use of CREATE CONTROLFILE...RESETLOGS makes a
- controlfile that is a "backup." Only a backup controlfile recovery
- can be run after executing CREATE
- CONTROLFILE...RESETLOGS. Only a RESETLOGS open can
- be used after executing CREATE
- CONTROLFILE...RESETLOGS. Use of CREATE
- CONTROLFILE...RESETLOGS is indicated if (all copies of) an
- online redo log were lost in addition to (all copies of) the control
- file.
- By contrast, CREATE CONTROLFILE...NORESETLOGS makes
- a controlfile that is "current"; i.e. it has knowledge of the current
- state of the online logfiles and log sequence numbers. A backup
- controlfile recovery is not necessary following CREATE
- CONTROLFILE...NORESETLOGS. Indeed, no recovery at all is
- required if there was a clean shutdown, and if no datafile backups
- have been restored. A normal or NORESETLOGS open may
- follow CREATE CONTROLFILE ...NORESETLOGS.
- A backup controlfile lacks valid information about the current
- online logs and datafile stop SCNs. Hence, recovery cannot look
- for online logs to automatically apply. Moreover, recovery must
- assume infinite stop SCNs. A RESETLOGS open corrects this
- information. The backup controlfile may have a different set of
- threads enabled than did the original controlfile. That set will be the
- effective enabled thread set following RESETLOGS open.
- The BACKUP CONTROLFILE option may be used either alone or
- in conjunction with an incomplete recovery option. Unless an
- incomplete recovery option is included, all threads must be applied
- to the end-of-thread. This is validated at open resetlogs time.
- It is currently required that a RESETLOGS open follow execution
- of backup controlfile recovery, even if no incomplete recovery
- option was used. The following procedure could be used to avoid a
- backup controlfile recovery and resetlogs in case the only problem
- is a lost current controlfile (and a backup controlfile exists):
- 1. Copy the backup controlfile to the current control file and do a
- STARTUP MOUNT.
- 2. Issue ALTER DATABASE BACKUP CONTROLFILE TO
- TRACE NORESETLOGS.
- 3. Issue the CREATE CONTROLFILE...NORESETLOGS command
- from the SQL script output by Step 2.
- It is important to ensure that the CREATE CONTROLFILE
- command issued in Step 3 creates a controlfile reflecting a database
- structure equivalent to that of the lost current controlfile. For
- example, if a datafile was added since the backup controlfile was
- saved, then the CREATE CONTROLFILE command should be
- modified to declare the added datafile.
- Failure to specify the BACKUP CONTROLFILE option on the
- RECOVER DATABASE command when the controlfile is indeed a
- backup can frequently be detected. One indication of a restored
- backup controlfile would be a datafile header checkpoint count that
- is greater than the checkpoint count in the datafile's controlfile
- record. However, this test may not catch the backup controlfile if
- the datafiles are also backups. Another test validates the online
- logfile headers against their corresponding controlfile records, but
- this too may not always catch an old controlfile.
- 6.14 CREATE DATAFILE: Recover a Datafile Without a Backup
- If a datafile is lost or damaged and no backup of the file is
- available, it can be recovered using only information in the redo
- logs and control file. The following conditions must be met:
- 1. All redo logs written since the datafile was originally created
- must be available.
- 2. A control file in which the datafile is declared (i.e. name and
- size information) must be available or re-creatable.
- The CREATE DATAFILE clause of the ALTER DATABASE
- command is first used to create a new, empty replacement for the
- lost datafile. RECOVER DATAFILE is then used to apply all redo
- generated for the file from the time of its original creation until the
- time it was lost. After all redo logs written since the datafile was
- originally created have been applied, the file will have been
- restored to its state at the time it was lost. This mechanism is useful
- for recovering a recently-created datafile for which no backup has
- yet been taken. The original datafiles of the SYSTEM tablespace
- cannot be recovered by this means, however, since relevant redo
- data is not saved at database creation time.
- 6.15 Point-in-Time Recovery Using Export/Import
- Occasionally, it may become necessary to reverse the effect of an
- erroneous user action (e.g. table drop or batch run). One approach
- would be to perform an incomplete media recovery to a point-in-
- time before the corruption, then open the database with the
- RESETLOGS option. Using this approach, the entire database -
- not just the affected schema objects - would be set backwards in
- time.
- This approach has an undesirable side-effect: it discards committed
- transactions. Any updates that occurred subsequent to the resetlogs
- SCN are lost and must be re-entered. Resetlogs has another
- undesirable side-effect: it renders all pre-existing backups unusable
- for future recovery.
- Setting a mission-critical database globally back in time is often
- not an acceptable solution. The following procedure is an
- alternative whose effect on the mission-critical database is to set
- just the affected schema objects - termed the recovery-objects -
- backwards in time.
- Point-in-time incomplete media recovery is run against a side-copy
- of the production database, called the recovery-database. The
- initial version of the recovery-database is created using backups of
- the production database that were taken before the corruption
- occurred. Non-relevant objects in the recovery-database can be
- taken offline in order to avoid unnecessarily recovering them.
- However, the SYSTEM tablespace and all tablespaces containing
- rollback segments must participate in the media recovery in order
- to allow a clean open. (Note that this is a good reason to place
- rollback segments and data segments into separate tablespaces.)
- After it has undergone point-in-time incomplete media recovery,
- the recovery-database is opened with the RESETLOGS option.
- The recovery-database is now set backwards to a point-in-time
- before the recovery-objects were corrupted. This effectively
- creates pre-corruption versions of the recovery-objects in the
- recovery-database. These objects can then be exported from the
- recovery-database and imported back into the production database.
- Prior to importing the recovery-objects, the production database is
- prepared as follows:
- * In the case of recovering an erroneously updated schema
- object, the copy of the object in the production database is
- prepared by discarding just the data; e.g. the table is truncated.
- * In the case of recovering an erroneously dropped schema
- object, the object is re-created (empty) in the production database.
- The import operation is then executed, using the data-only option
- as appropriate. Since export/import can be a lengthy process, it
- may be desirable to postpone it until a time when recovery-object
- unavailability can be tolerated. In the meantime, the recovery-
- objects can be made available, albeit at degraded performance, via
- a database link between the production database and the recovery-
- database.
- An undesirable side-effect of this approach is that transaction
- consistency across objects is lost. This side-effect can be avoided
- by widening the recovery-object set to include all objects that must
- be kept transaction-consistent.
- 7 Block Recovery
- Block recovery is the simplest type of recovery. It is performed
- automatically by the system during normal operation of the
- database, and is transparent to the user.
- 7.1 Block Recovery Initiation and Operation
- Block recovery is used to clean up the state of a buffer whose
- modification by a foreground process (in the middle of invoking a
- redo application callback to apply a change vector to the buffer)
- was interrupted by the foreground process dying or signalling an
- error. Recovery involves (i) reading the block from disk; (ii) using
- the current thread's online redo logs to reconstruct the buffer to a
- state consistent with the redo already generated; and (iii) writing
- the recovered block back to disk. If block recovery fails, then after
- a second attempt, the block is marked logically corrupt (by setting
- the block sequence number to zero) and a corrupt block error is
- signalled.
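- The three-step procedure, including the retry and the corruption
- marking, can be sketched in Python. This is a minimal model with
- hypothetical names and structures (blocks and disks are plainly not
- dicts in the real kernel); it only illustrates the control flow described
- above:

```python
# Illustrative sketch of block recovery; names and data layouts are
# hypothetical, not Oracle internals.

class CorruptBlockError(Exception):
    pass

def block_recover(disk, addr, redo, max_attempts=2):
    """Rebuild the on-disk block from current-thread redo.

    disk: dict mapping block address -> {'seq': n, 'data': value}
    redo: list of (seq, new_data) change vectors for this block, in log order.
    """
    for _ in range(max_attempts):
        block = dict(disk[addr])                 # (i) read the block from disk
        try:
            for seq, data in redo:
                if seq > block['seq']:           # change not yet on disk: apply it
                    block['seq'], block['data'] = seq, data
            disk[addr] = block                   # (iii) write the recovered block back
            return block
        except Exception:
            continue                             # one retry before giving up
    disk[addr]['seq'] = 0                        # mark the block logically corrupt
    raise CorruptBlockError(addr)
```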
- Block recovery is guaranteed doable using only the current thread's
- online redo logs, since:
- 1. Block recovery cannot require redo from another thread or
- from before the last thread checkpoint.
- 2. Online logs are not reused until the current thread checkpoint
- is beyond the log.
- 3. No buffer currently in the cache can need recovery from
- before the last thread checkpoint.
- 7.2 Buffer Header RBA Fields
- The buffer header (an in-memory data structure) contains the
- following fields pertaining to block recovery:
- Low-RBA and High-RBA: Delineate the range of redo (from the
- current thread) that needs to be applied to the disk version of the
- block in order to make it consistent with redo already generated.
- Recovery-RBA: A place marker for recording progress in case the
- invoker of block recovery is PMON and complete recovery in
- one invocation would take too long (see next section).
- 7.3 PMON vs. Foreground Invocation
- If an error is signalled while a foreground process is in a redo
- application callback, then the process itself executes block
- recovery. If foreground process death is detected during a redo
- application callback, on the other hand, PMON executes block
- recovery.
- Block recovery may require an unbounded amount of time and I/O.
- However, PMON cannot be allowed to spend an inordinate amount
- of time working on the recovery of one block while neglecting
- other necessary time-critical tasks. Therefore, a limit is placed on
- the amount of redo applied by one PMON call to block recovery.
- (A port-specific constant specifies the maximum number of redo
- log blocks applied per invocation). As PMON applies redo during
- invocations of block recovery, it updates the recovery-RBA in the
- buffer header to record its progress. When a PMON call to block
- recovery causes the recovery-RBA to reach the high-RBA, then
- block recovery for that block is complete.
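- PMON's bounded, resumable application of redo can be sketched as
- follows. The model is hypothetical: RBAs are represented as plain
- integers, and the per-call limit stands in for the port-specific
- constant mentioned above:

```python
# Sketch of PMON's bounded block recovery (illustrative model only).

PMON_MAX_LOG_BLOCKS = 4   # stand-in for the port-specific per-call limit

def pmon_block_recover_step(hdr, apply_redo):
    """Apply at most PMON_MAX_LOG_BLOCKS of redo, advancing recovery-RBA.

    hdr: buffer header dict with 'low_rba', 'high_rba', 'recovery_rba'.
    apply_redo(from_rba, to_rba): applies that redo range to the block.
    Returns True once recovery_rba has reached high_rba (recovery done).
    """
    start = max(hdr['recovery_rba'], hdr['low_rba'])
    end = min(start + PMON_MAX_LOG_BLOCKS, hdr['high_rba'])
    apply_redo(start, end)
    hdr['recovery_rba'] = end       # record progress in the buffer header
    return hdr['recovery_rba'] >= hdr['high_rba']
```

- Each invocation picks up where the recovery-RBA left off, so PMON
- can interleave block recovery with its other time-critical tasks.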
- 8 Resetlogs
- The RESETLOGS option is needed on the first database open
- following:
- * Incomplete recovery
- * Backup controlfile recovery
- * CREATE CONTROLFILE...RESETLOGS.
- The primary function of resetlogs is to discard the redo that was not
- applied during incomplete recovery, ensuring that the skipped redo
- can never be accidentally applied by a subsequent recovery. To
- accomplish this, resetlogs effectively invalidates all existing redo
- in all online and archived redo logfiles. This has the side effect of
- making any existing datafile backups unusable for future recovery
- operations.
- Resetlogs also reinitializes the controlfile information about online
- logs and redo threads, clears the contents of any existing online
- redo log files, creates the online redo log files if they do not
- currently exist, and resets the log sequence number in all threads to
- one.
- 8.1 Fuzzy Files
- The most important requirement when doing a RESETLOGS open
- is that all datafiles be validated as recovered to the same point-in-
- time. This is what ensures that all the changes in a single redo
- record are done atomically. It is also important for other
- consistency reasons. If all threads of redo have been applied
- through end-of-thread to all online datafiles, then we can be sure
- that the database is consistent.
- If incomplete recovery was done, there is the possibility that a file
- was not restored from a sufficiently old backup. In the general case,
- this is detectable if the file has a different checkpoint than the other
- files (exceptions: offline or read-only files).
- The other possibility is that the file is fuzzy - i.e. it may contain
- changes in the future of its checkpoint. As seen earlier, the
- following "fuzzy bits" are maintained in the file header to
- determine if a file is fuzzy:
- * online-fuzzy bit (see 3.5, 6.7.2)
- * hotbackup-fuzzy bit (see 4, 6.7.3)
- * media-recovery-fuzzy bit (see 6.7.1)
- Open with resetlogs following incomplete media recovery will fail
- if any online datafile has any of the three fuzzy bits set.
- Redo records are created at the end of a hot backup (the end-
- backup "marker") and after crash recovery (the end-crash-recovery
- "marker") to enable media recovery to determine when it can clear
- the fuzzy bits.
- Except in the following special circumstances, resetlogs signals an
- error if any of the datafiles is recovered to a checkpoint SCN
- different from the one at which the other files are checkpointed (i.e.
- the resetlogs SCN: see 8.2):
- 1. A file recovered to an SCN earlier than the resetlogs SCN is
- tolerated if no redo was generated for the file between its
- checkpoint SCN and the resetlogs SCN. For example, this would
- be the case if the file were read-only and its offline range
- spanned the checkpoint SCN and resetlogs SCN. In this case,
- resetlogs allows the file but sets it offline.
- 2. A file checkpointed at an SCN later than the resetlogs SCN is
- tolerated if its creation SCN (allocated at file creation time and
- stored in the file header) shows it to have been created after the
- resetlogs SCN. During the data dictionary vs. controlfile check
- performed by RESETLOGS open (see 8.7), such a file would be
- found to be missing from the data dictionary but present in the
- controlfile. As a consequence, it would be eliminated from the
- controlfile.
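- The per-file validation just described can be sketched as follows.
- This is an illustrative model: the field names are hypothetical
- stand-ins for the file header contents, and the offline-range test is
- one plausible way to establish that no redo exists for the file in the
- relevant SCN interval:

```python
# Sketch of the resetlogs datafile checks (illustrative only).

class ResetlogsError(Exception):
    pass

def check_datafile(f, resetlogs_scn):
    """Validate one online datafile for RESETLOGS open.

    f: dict with 'fuzzy_bits' (set of bit names), 'checkpoint_scn',
       'creation_scn', and 'offline_range' ((low, high) or None).
    Returns 'ok', 'set_offline', or 'drop_from_controlfile'.
    """
    if f['fuzzy_bits']:
        raise ResetlogsError('fuzzy bits set: %s' % sorted(f['fuzzy_bits']))
    ckpt = f['checkpoint_scn']
    if ckpt == resetlogs_scn:
        return 'ok'
    if ckpt < resetlogs_scn:
        # Exception 1: no redo for the file between its checkpoint SCN
        # and the resetlogs SCN (e.g. its offline range spans both).
        rng = f.get('offline_range')
        if rng and rng[0] <= ckpt and resetlogs_scn <= rng[1]:
            return 'set_offline'
        raise ResetlogsError('file not recovered far enough')
    # Exception 2: file created after the resetlogs SCN; the dictionary
    # check (8.7) will find it missing from FILE$ and drop it.
    if f['creation_scn'] > resetlogs_scn:
        return 'drop_from_controlfile'
    raise ResetlogsError('file recovered past the resetlogs SCN')
```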
- 8.2 Resetlogs SCN and Counter
- A resetlogs SCN and resetlogs timestamp - known together as the
- resetlogs data - are kept in the database info record of the
- controlfile. The resetlogs data is intended to uniquely identify each
- execution of a RESETLOGS open. The resetlogs data is also stored
- in each datafile header and in each logfile header. A redo log cannot
- be applied by recovery if its resetlogs data does not match that in
- the database info record of the controlfile. Except for some very
- special circumstances (e.g. offline normal or read-only
- tablespaces), a datafile cannot be recovered or accessed if its
- resetlogs data does not match that of the database info record of the
- controlfile. This ensures that changes discarded by resetlogs do not
- get back into the database. It also renders previous backups
- unusable for future recovery operations, making it prudent to take a
- database backup immediately after a resetlogs.
- 8.3 Effect of Resetlogs on Threads
- Each thread's controlfile record is updated to clear the thread-open
- flag and to set the thread-checkpoint SCN to the resetlogs SCN.
- Thus, the thread appears to have been closed at the resetlogs SCN.
- The set of enabled threads from the enabled thread bitvec of the
- database info controlfile record is used as is. It does not matter
- which threads were enabled at the end of recovery, since none of
- the old redo can ever be applied to the database again. The log
- sequence numbers in all threads are also reset to one. The
- checkpoint of one of the enabled threads is picked as the database
- checkpoint.
- 8.4 Effect of Resetlogs on Redo Logs
- The redo is thrown away by zeroing all the online logs. Note that
- this means that redo in the online logs would be lost forever - and
- there would be no way to undo the resetlogs in an emergency - if
- the online logs were not backed up prior to executing resetlogs.
- Note that ensuring the ability to undo an erroneous resetlogs is the
- only valid rationale for making backups of online logs. Undoing an
- erroneous resetlogs requires re-running the entire recovery
- operation from the beginning, after restoring backups of all
- datafiles, controlfile, and online logs.
- One log is picked to be the current log for every enabled thread.
- That log header is written as log sequence number one. Note that
- the set of logs and their thread association is picked up from the
- controlfile (i.e. using the thread number and log list fields of the
- logfile records). If it is a backup controlfile, this may be different
- from what was current the last time the database was open.
- 8.5 Effect of Resetlogs on Online Datafiles
- The headers of all the online datafiles are updated to be
- checkpointed at the new database checkpoint. The new resetlogs
- data is also written to the header.
- 8.6 Effect of Resetlogs on Offline Datafiles
- The controlfile record for an offline file is set to indicate the file
- needs media recovery. However that will not be possible because it
- would be necessary to apply redo from logs with the wrong
- resetlogs data. This means that the tablespace containing the file
- will have to be dropped. There is one important exception to this
- rule. When a tablespace is taken offline normal or set read-only, the
- checkpoint SCN written to the headers of the tablespace's
- constituent datafiles is saved in the data dictionary TS$ table as the
- tablespace-clean-stop SCN (see 2.17). No recovery is ever needed
- to bring a tablespace and its files online if the files are not fuzzy
- and are checkpointed at exactly the tablespace-clean-stop SCN.
- Even the resetlogs data in the offline file header is ignored in this
- case. Thus a tablespace that is offline normal is unaffected by any
- resetlogs that leaves the database at a time when the tablespace is
- offline.
- 8.7 Checking Dictionary vs. Controlfile on Resetlogs Open
- After the rollback phase of RESETLOGS open, the datafiles listed
- in the data dictionary FILE$ table are compared with the datafiles
- listed in the controlfile. This is also done on the first open after a
- CREATE CONTROLFILE. There is the possibility that incomplete
- recovery ended at a time when the files in the database were
- different from those in the controlfile used for the recovery. Using a
- backup controlfile or creating one can have the same problem.
- Checking the dictionary does not do any harm, so it could be done
- on every database open; however there is no point in wasting the
- time under normal circumstances.
- The entry in FILE$ is compared with the entry in the controlfile
- for every file number. Since FILE$ reflects the space allocation
- information in the database, it is correct, and the controlfile might
- be wrong. If the file does not exist in FILE$ but the controlfile
- record says the file exists, then the file is simply dropped from the
- controlfile.
- If a file exists in FILE$ but not in the controlfile, a placeholder
- entry is created in the controlfile under the name MISSINGnnnn
- (where nnnn is the file number in decimal). MISSINGnnnn is
- flagged in the controlfile as being offline and needing media
- recovery. The actual file corresponding to MISSINGnnnn (with
- respect to the file header contents as opposed to the file name) can
- be made accessible by renaming MISSINGnnnn to point to it.
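- The reconciliation in both directions can be sketched as follows,
- with FILE$ and the controlfile modeled as simple in-memory
- structures (an illustrative simplification, of course):

```python
# Sketch of the FILE$-vs-controlfile comparison (illustrative only).

def check_dictionary(file_dollar, controlfile):
    """Reconcile the controlfile's datafile list with FILE$.

    file_dollar: set of file numbers known to the data dictionary.
    controlfile: dict file# -> filename; modified in place.
    Returns the list of MISSINGnnnn placeholders created.
    """
    placeholders = []
    # A controlfile entry with no FILE$ row: drop it from the controlfile.
    for fno in list(controlfile):
        if fno not in file_dollar:
            del controlfile[fno]
    # A FILE$ row with no controlfile entry: create an offline
    # placeholder flagged as needing media recovery.
    for fno in sorted(file_dollar):
        if fno not in controlfile:
            name = 'MISSING%04d' % fno
            controlfile[fno] = name
            placeholders.append(name)
    return placeholders
```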
- In the RESETLOGS open case however, rename can succeed in
- making the file usable only in case the file was read-only or offline
- normal. If, on the other hand, MISSINGnnnn corresponds to a file
- that was not read-only or offline normal, then the rename operation
- cannot be used to make it accessible, since bringing it online would
- require media recovery with redo from before the resetlogs. In this
- case, the tablespace containing the datafile must be dropped.
- When the dictionary check is due to open after CREATE
- CONTROLFILE...NORESETLOGS rather than to open resetlogs,
- media recovery may be used to make the file current.
- Another option is to repeat the entire operation that led up to the
- dictionary check with a controlfile that lists the same datafiles as
- the data dictionary. For incomplete recovery, this would involve
- restoring all backups and repeating the recovery.
- 9 Recovery-Related V$ Fixed-Views
- The V$ fixed-views contain columns that extract information from
- data structures dynamically maintained in memory by the kernel.
- These "views" make this information accessible to the DBA under
- SYS. The following is a summary of recovery-related information
- that is viewable via V$ views:
- 9.1 V$LOG
- Contains log group information from the controlfile:
- GROUP#
- THREAD#
- SEQUENCE#
- SIZE_IN_BYTES
- MEMBERS_IN_GROUP
- ARCHIVED_FLAG
- STATUS_OF_GROUP (unused, current, active, inactive)
- LOW_SCN
- LOW_SCN_TIME
- 9.2 V$LOGFILE
- Contains log file (i.e. group member) information from the
- controlfile:
- GROUP#
- STATUS_OF_MEMBER (invalid, stale, deleted)
- NAME_OF_MEMBER
- 9.3 V$LOG_HISTORY
- Contains log history information from the controlfile:
- THREAD#
- SEQUENCE#
- LOW_SCN
- LOW_SCN_TIME
- NEXT_SCN
- 9.4 V$RECOVERY_LOG
- Contains information (from the controlfile log history) about
- archived logs needed to complete media recovery:
- THREAD#
- SEQUENCE#
- LOW_SCN_TIME
- ARCHIVED_NAME
- 9.5 V$RECOVER_FILE
- Contains information on the status of files needing media recovery:
- FILE#
- ONLINE_FLAG
- REASON_MEDIA_RECOVERY_NEEDED
- RECOVERY_START_SCN
- RECOVERY_START_SCN_TIME
- 9.6 V$BACKUP
- Contains status information relative to datafiles in hot backup:
- FILE#
- FILE_STATUS (no-backup-active, backup-active, offline-normal,
- error)
- BEGIN_BACKUP_SCN
- BEGIN_BACKUP_TIME
- 10 Miscellaneous Recovery Features
- 10.1 Parallel Recovery (v7.1)
- The goal of the parallel recovery feature is to use compute and I/O
- parallelism to reduce the elapsed time required to perform crash
- recovery, single-instance recovery, or media recovery. Parallel
- recovery is most effective at reducing recovery time when several
- datafiles on several disks are being recovered concurrently.
- 10.1.1 Parallel Recovery Architecture
- Parallel recovery partitions recovery processing into two
- operations:
- 1. Reading the redo log.
- 2. Applying the change vectors.
- Operation #1 does not easily lend itself to parallelization. The redo
- log(s) must be read sequentially, and merged in the case of
- media recovery. Thus, this task is assigned to one process: the
- redo-reading-process.
- Operation #2, on the other hand, easily lends itself to
- parallelization. Thus, the task of change vector application is
- delegated to some number of redo-application-slave-processes.
- The redo-reading-process sends change vectors to the redo-
- application-slave-processes using the same IPC (inter-process-
- communication) mechanism used by parallel query. The change
- vectors are distributed based on a hash function of the block
- address (i.e. DBA modulo the number of redo-application-slave-
- processes). Thus, each redo-application-slave-process
- handles only change vectors for blocks whose DBAs hash to its
- "bucket" number. The redo-application-slave-processes are
- responsible for reading the datablocks into cache, checking
- whether or not the change vectors need to be applied, and applying
- the change vectors if needed.
- This architecture achieves parallelism in log read I/O, datablock
- read I/O, and change vector processing. It allows overlap of log
- read I/Os with datablock read I/Os. Moreover, it allows overlap of
- datablock read I/Os for different hash "buckets." Recovery elapsed
- time is reduced as long as the benefits of compute and I/O
- parallelism outweigh the costs of process management and inter-
- process-communication.
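- The bucketing performed by the redo-reading-process can be
- sketched as follows. The real IPC mechanism is replaced here by
- in-memory lists, and the names are illustrative:

```python
# Sketch of change-vector distribution to redo-application slaves
# (illustrative; the parallel-query IPC is modeled as simple lists).

from collections import defaultdict

def distribute(change_vectors, n_slaves):
    """Route each change vector to the slave owning its block's hash bucket.

    change_vectors: iterable of (dba, change) pairs in log order.
    Returns bucket -> ordered list of (dba, change); all changes for a
    given block land in the same bucket, preserving their log order.
    """
    buckets = defaultdict(list)
    for dba, change in change_vectors:
        buckets[dba % n_slaves].append((dba, change))  # DBA modulo #slaves
    return buckets
```

- Because the hash depends only on the DBA, all redo for one block is
- applied by a single slave, in log order, with no cross-slave
- coordination on individual blocks.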
- 10.1.2 Parallel Recovery System Initialization Parameters
- PARALLEL_RECOVERY_MAX_THREADS
- PARALLEL_RECOVERY_MIN_THREADS
- These initialization parameters control the number of redo-
- application-slave-processes used during crash recovery or
- media recovery of all datafiles.
- PARALLEL_INSTANCE_RECOVERY_THREADS
- This initialization parameter controls the number of redo-
- application-slave-processes used during instance recovery.
- 10.1.3 Media Recovery Command Syntax Changes
- RECOVER DATABASE has a new optional parameter for
- specifying the number of redo-application-slave-processes. If
- specified, it overrides PARALLEL_RECOVERY_MAX_THREADS.
- RECOVER TABLESPACE has a new optional parameter for
- specifying the number of redo-application-slave-processes. If
- specified, it overrides PARALLEL_RECOVERY_MIN_THREADS.
- RECOVER DATAFILE has a new optional parameter for
- specifying the number of redo-application-slave-processes. If
- specified, it overrides PARALLEL_RECOVERY_MIN_THREADS.
- 10.2 Redo Log Checksums (v7.2)
- The log checksum feature allows a potential corruption in an online
- redo log to be detected when the log is read for archiving. The goal
- is to prevent the corruption from being propagated, undetected, to
- the archive log copy. This feature is intended to be used in
- conjunction with a new command, CLEAR LOGFILE, that allows
- a corrupted online redo log to be discarded without having to
- archive it.
- A new initialization parameter, LOG_BLOCK_CHECKSUM,
- controls activation of log checksums. If it is set, a log block
- checksum is computed and placed in the header of each log block
- as it is written out of the redo log buffer. If present, checksums are
- validated whenever log blocks are read for archiving or recovery. If
- a checksum is detected as invalid, an attempt is made to read
- another member of the log group (if any). If an irrecoverable
- checksum error is detected - i.e. the checksum is invalid in all
- members - then the log read operation fails.
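- The validate-then-fall-back behavior can be sketched as follows.
- This is an illustrative model: CRC-32 stands in for whatever
- checksum the kernel actually uses, and log members are modeled as
- simple maps from block number to (stored checksum, payload):

```python
# Sketch of log block checksum validation across group members
# (illustrative; real checksums live in the log block header).

import zlib

class ChecksumError(Exception):
    pass

def checksum(payload):
    # CRC-32 as a stand-in for the actual log block checksum.
    return zlib.crc32(payload) & 0xffffffff

def read_log_block(members, block_no):
    """Return the first member's copy of the block whose checksum verifies.

    members: list of member "files", each a dict
             block# -> (stored_checksum, payload).
    Raises ChecksumError only if the checksum is invalid in every member.
    """
    for member in members:
        stored, payload = member[block_no]
        if stored == checksum(payload):
            return payload               # good copy found in this member
    raise ChecksumError(block_no)        # irrecoverable: bad in all members
```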
- Note that a rudimentary mechanism for detecting log block header
- corruption was added, along with log group support, in v7.1. The
- log checksum feature extends corruption detection to the whole
- block.
- If an irrecoverable checksum error prevents a log from being read
- for archiving, then the log cannot be reused. Eventually log switch
- - and redo generation - will stall. If no action is taken, the
- database will hang. The CLEAR LOGFILE command provides a
- way to obviate the requirement that the log be archived before it
- can be reused.
- 10.3 Clear Logfile (v7.2)
- If all members of an online redo log group are "lost" or "corrupted"
- (e.g. due to checksum error, media error, etc.), redo generation may
- proceed normally until it becomes necessary to reuse the logfile.
- Once the thread checkpoints of all threads are beyond the log, it is a
- potential candidate for reuse. Possible scenarios preventing reuse
- are the following:
- 1. The log cannot be archived due to a checksum error; it cannot
- be reused because it needs archiving.
- 2. A log switch attempt fails because the log is inaccessible (e.g.
- due to a media error). The log may or may not have been
- archived.
- The ALTER DATABASE CLEAR LOGFILE command is
- provided as an aid to recovering from such scenarios involving an
- inactive online redo log group (i.e. one that is not needed for crash
- recovery). CLEAR LOGFILE allows an inactive online logfile to
- be "cleared": i.e. discarded and reinitialized, in a manner analogous
- to DROP LOGFILE followed by ADD LOGFILE. In many cases,
- use of this command obviates the need for database shutdown or
- resetlogs.
- Note: CLEAR LOGFILE cannot be used to clear a log needed for
- crash recovery (i.e. a "current" or "active" log of an open thread).
- Instead, if such a log becomes lost or corrupted, shutdown abort
- followed by incomplete recovery and open resetlogs will be
- necessary.
- Use of the UNARCHIVED option allows the log clear operation to
- proceed even if the log needs archiving: an operation that would be
- disallowed by DROP LOGFILE. Furthermore, CLEAR LOGFILE
- allows the log clear operation to proceed in the following cases:
- * There are only two logfile groups in the thread.
- * All log group members have been lost through media failure.
- * The logfile being cleared is the current log of a closed thread.
- All of these operations would be disallowed in the case of DROP
- LOGFILE.
- Clearing an unarchived log makes unusable any existing backup
- whose recovery would require applying redo from the cleared log.
- Therefore, it is recommended that the database be immediately
- backed up following use of CLEAR LOGFILE with the
- UNARCHIVED option. Furthermore, the UNRECOVERABLE
- DATAFILE option must be used if there is a datafile that is offline,
- and whose recovery prior to onlining requires application of redo
- from the cleared logfile. Following use of CLEAR LOGFILE with
- the UNRECOVERABLE DATAFILE option, the offline datafile,
- together with its entire tablespace, will have to be dropped from the
- database. This is due to the fact that redo necessary to bring it
- online has been cleared, and there is no other copy of it.
- The foreground process executing CLEAR LOGFILE processes
- the command in several steps:
- * It checks that the logfile is not needed for crash recovery and
- is clearable.
- * It sets the "being cleared" and "archiving not needed" flags in
- the logfile controlfile record. While the "being cleared" flag is
- set, the logfile is ineligible for reuse by log switch.
- * It recreates a new logfile, and performs multiple writes to clear
- it to zeroes (a lengthy process).
- * It resets the "being cleared" flag.
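- The steps above can be sketched as a small state machine. The flag
- names and the log representation are hypothetical stand-ins for bits
- in the logfile's controlfile record:

```python
# Sketch of the CLEAR LOGFILE steps (illustrative state machine only).

class ClearError(Exception):
    pass

def clear_logfile(log, zero_out):
    """Discard and reinitialize an inactive online log.

    log: dict with 'active' (needed for crash recovery) and 'flags' (set).
    zero_out(log): rewrites the log members with zeroes (may be lengthy).
    """
    # Step 1: the log must not be needed for crash recovery.
    if log['active']:
        raise ClearError('log needed for crash recovery')
    # Step 2: while 'being_cleared' is set, log switch skips this log.
    log['flags'] |= {'being_cleared', 'archiving_not_needed'}
    # Step 3: recreate the log and write zeroes over it.
    zero_out(log)
    # Step 4: clear the flag; the log is again eligible for reuse.
    log['flags'].discard('being_cleared')
    return log
```

- If the process dies between steps 2 and 4, the "being cleared" flag
- stays set, which is why the partially-cleared log cannot be used
- until the command is reissued or the log is dropped.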
- If the foreground process executing CLEAR LOGFILE dies while
- execution is in progress, the log will not be usable as the current
- log. Redo generation may stall and the database may hang, much as
- would happen if log switch had to wait for checkpoint completion,
- or for log archive completion. Should the process executing
- CLEAR LOGFILE die, the operation should be completed by
- reissuing the same command. Another option would be to drop the
- partially-cleared log. CLEAR LOGFILE could also fail due to an
- I/O error encountered while writing zeros to a log group member.
- An option for recovering would be to drop that member and add
- another to replace it.