CMU Database - 15445 - 2025 Spring

lecture-20-slides.pdf

Carnegie Mellon University

Database Systems

Database Logging

ADMINISTRIVIA

Project #4 is due Sunday April 20th @ 11

$\longrightarrow$ Recitation: Friday, April 11th in GHC 4303 from 3 - 4

Final Exam is on Monday, April 28, 2025, from 05 - 08

$\longrightarrow$ Early exam will not be offered. Do not make travel plans.

No OH on April 7th (I have to teach another class)

UPCOMING DATABASE TALKS

0xide (DB Seminar)

$\rightarrow$ Monday, April 7 @ 4

$\rightarrow$ OxQL: Oximeter Query Language

$\rightarrow$ Speaker: Ben Naecker

$\rightarrow$ https://cmu.zoom.us/j/93441451665

LAST CLASS

We discussed multi-version concurrency control (MVCC) and how it affects the design of the entire DBMS architecture.

A DBMS's concurrency control protocol gives it Atomicity + Consistency + Isolation.

We now need to ensure Atomicity + Durability...

MOTIVATION

Schedule

T1 BEGIN R(A) W(A) COMMIT

Buffer Pool

CRASH RECOVERY

Recovery algorithms are techniques to ensure database consistency, transaction atomicity, and durability despite failures.

Recovery algorithms have two parts:

$\rightarrow$ Actions during normal txn processing to ensure that the DBMS can recover from a failure.

$\rightarrow$ Actions after a failure to recover the database to a state that ensures atomicity, consistency, and durability.

Today's lecture covers the first part: actions during normal txn processing that ensure the DBMS can recover from a failure.

TODAY'S AGENDA

Buffer Pool Policies

Shadow Paging

Write-Ahead Log

Logging Schemes

Checkpoints

DB Flash Talk: Firebolt

OBSERVATION

The database's primary storage location is on non-volatile storage, but this is slower than volatile storage.

Use volatile memory for faster access:

$\rightarrow$ First copy target record into memory.

$\rightarrow$ Perform the writes in memory.

$\rightarrow$ Write dirty records back to disk.

The DBMS needs to ensure the following:

$\rightarrow$ The changes for any txn are durable once the DBMS has told somebody that it committed.

$\rightarrow$ No partial changes are durable if the txn aborted.

UNDO VS. REDO

Undo: The process of removing the effects of an incomplete or aborted txn.

Redo: The process of re-applying the effects of a committed txn for durability.

How the DBMS supports this functionality depends on how it manages the buffer pool ...
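As a rough illustration of the two definitions, the sketch below assumes the DBMS has somehow kept a record of each change with its before and after value. The `Change` record and the `undo`/`redo` helpers are hypothetical names made up for this example, not part of any real system.

```python
from dataclasses import dataclass


@dataclass
class Change:
    txn: str      # txn that made the change
    key: str      # object that was modified
    before: int   # value prior to the write
    after: int    # value written by the txn


def undo(disk, changes, txn):
    """Remove the effects of `txn` by restoring before-values, newest first."""
    for c in reversed(changes):
        if c.txn == txn:
            disk[c.key] = c.before


def redo(disk, changes, txn):
    """Re-apply the effects of `txn` by installing after-values, oldest first."""
    for c in changes:
        if c.txn == txn:
            disk[c.key] = c.after


changes = [Change("T1", "A", 1, 3), Change("T2", "B", 9, 8)]

# Assumed state after a crash: T1 never committed but its write to A reached
# disk; T2 committed but its write to B never reached disk.
disk = {"A": 3, "B": 9}
undo(disk, changes, "T1")   # A goes back to 1
redo(disk, changes, "T2")   # B becomes 8
```

Whether either helper is ever needed depends on which dirty pages the buffer pool is allowed to write out, and when.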

BUFFER POOL

Schedule

STEAL POLICY

Whether the DBMS can evict a dirty object in the buffer pool modified by an uncommitted txn and overwrite the most recent committed version of that object in non-volatile storage.

STEAL: Eviction + overwriting is allowed.

NO-STEAL: Eviction + overwriting is not allowed.

FORCE POLICY

Whether the DBMS requires that all updates made by a txn are written back to non-volatile storage before the txn can commit.

FORCE: Write-back is required.

NO-FORCE: Write-back is not required.
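The two policies can be pictured as two flags on a buffer pool. The following is a minimal sketch, not a real buffer pool manager: the `BufferPool` class and its methods are hypothetical, pages are plain dictionary entries, and it assumes at most one txn dirties a given page at a time.

```python
class BufferPool:
    """Toy buffer pool; assumes at most one txn dirties a page at a time."""

    def __init__(self, disk, steal, force):
        self.disk = disk      # page id -> value on non-volatile storage
        self.steal = steal    # may uncommitted dirty pages be evicted?
        self.force = force    # must dirty pages be flushed at commit?
        self.frames = {}      # page id -> value cached in memory
        self.dirty = {}       # page id -> owning txn, or None once it committed

    def write(self, txn, page, value):
        self.frames[page] = value
        self.dirty[page] = txn

    def evict(self, page):
        """Return True if the page was evicted, False if the policy forbids it."""
        if page in self.dirty:
            uncommitted = self.dirty[page] is not None
            if uncommitted and not self.steal:
                return False                      # NO-STEAL: page must stay in memory
            self.disk[page] = self.frames[page]   # write back before dropping the frame
            del self.dirty[page]
        self.frames.pop(page, None)
        return True

    def commit(self, txn):
        for page, owner in list(self.dirty.items()):
            if owner != txn:
                continue
            if self.force:
                self.disk[page] = self.frames[page]   # FORCE: flush at commit
                del self.dirty[page]
            else:
                self.dirty[page] = None               # NO-FORCE: flush later


strict = BufferPool({"A": 1}, steal=False, force=True)
strict.write("T1", "A", 3)
blocked = not strict.evict("A")   # eviction is refused while T1 is uncommitted
strict.commit("T1")               # commit flushes A=3 to disk

loose = BufferPool({"A": 1}, steal=True, force=False)
loose.write("T2", "A", 5)
loose.evict("A")                  # uncommitted A=5 overwrites the committed value
```

In the `loose` configuration the disk now holds a value from an uncommitted txn, which is the situation that would require undo if T2 aborted.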

NO-STEAL + FORCE

This approach is the easiest to implement:

$\rightarrow$ Never have to undo changes of an aborted txn because the changes were not written to disk.

$\rightarrow$ Never have to redo changes of a committed txn because all the changes are guaranteed to be written to disk at commit time (assuming atomic hardware writes).

The previous example cannot support write sets that exceed the amount of physical memory available.
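The memory limitation follows directly from NO-STEAL: every page a txn dirties has to stay in a frame until commit. A small sketch under that assumption is shown here; `run_txn_no_steal` and the `capacity` parameter are hypothetical names used only for this illustration.

```python
def run_txn_no_steal(write_set, capacity):
    """Simulate one txn under NO-STEAL with `capacity` buffer pool frames.

    Every dirtied page is held in memory until commit, so the txn fails
    once it needs more distinct pages than there are frames.
    """
    frames = {}
    for page, value in write_set:
        if page not in frames and len(frames) >= capacity:
            raise MemoryError(
                f"cannot dirty page {page!r}: all {capacity} frames hold uncommitted pages"
            )
        frames[page] = value
    return frames


# Two frames are enough for a txn that touches two pages.
fits = run_txn_no_steal([("A", 1), ("B", 2), ("A", 3)], capacity=2)
```

A third distinct page in the write set would raise `MemoryError` here, because no dirty frame can be evicted to make room.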

SHADOW PAGING

Instead of copying the entire database, the DBMS copies pages on write to create two versions:

$\rightarrow$ Master: Contains only changes from committed txns.

$\rightarrow$ Shadow: Temporary database with changes made from uncommitted txns.

To install updates when a txn commits, overwrite the root so it points to the shadow, thereby swapping the master and shadow.

Buffer Pool Policy: NO-STEAL + FORCE

SHADOW PAGING - EXAMPLE

SHADOW PAGING - UNDO/REDO

Supporting rollbacks and recovery is easy with shadow paging.

Undo: Remove the shadow pages. Leave the master and the DB root pointer alone.

Redo: Not needed at all.
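The commit and rollback behavior described above can be sketched with a page table that maps logical pages to physical slots. This is a simplified in-memory model under several assumptions: the `ShadowPagedDB` class is hypothetical, there is a single writer txn, and the dictionary reassignment in `commit` stands in for the atomic overwrite of the on-disk root.

```python
class ShadowPagedDB:
    """Toy shadow paging with one writer txn; all names are illustrative."""

    def __init__(self, pages):
        self.store = dict(enumerate(pages))        # physical id -> page contents
        self.master = {i: i for i in self.store}   # page table: logical -> physical
        self.root = self.master                    # DB root points at the master table
        self.shadow = None
        self.next_id = len(self.store)

    def begin(self):
        self.shadow = dict(self.master)   # copy the page table, not the pages

    def write(self, logical, value):
        # Copy-on-write: the first update to a page allocates a new physical slot.
        if self.shadow[logical] == self.master[logical]:
            self.shadow[logical] = self.next_id
            self.next_id += 1
        self.store[self.shadow[logical]] = value

    def read_committed(self, logical):
        return self.store[self.root[logical]]

    def commit(self):
        # A real system would flush the shadow pages and page table first,
        # then overwrite the root atomically.
        old = self.master
        self.master = self.root = self.shadow
        self.shadow = None
        for pid in set(old.values()) - set(self.master.values()):
            del self.store[pid]           # garbage-collect replaced pages

    def abort(self):
        # Undo: drop the shadow pages; master and root are left alone.
        for pid in set(self.shadow.values()) - set(self.master.values()):
            del self.store[pid]
        self.shadow = None


db = ShadowPagedDB(["a0", "b0", "c0"])
db.begin()
db.write(1, "b1")
seen_during_txn = db.read_committed(1)   # still "b0": root points at master
db.abort()

db.begin()
db.write(1, "b2")
db.commit()
seen_after_commit = db.read_committed(1)  # "b2": root now points at the new table
```

Nothing in this model ever needs redo, since a committed txn's pages are already in place by the time the root is swapped.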

SHADOW PAGING - DISADVANTAGES

Copying the entire page table is expensive:

$\rightarrow$ Use a page table structured like a B+tree (LMDB).

$\rightarrow$ No need to copy entire tree, only need to copy paths in the tree that lead to updated leaf nodes.

Commit overhead is high:

$\rightarrow$ Flush every updated page, page table, and root.

$\rightarrow$ Data gets fragmented (bad for sequential scans).

$\rightarrow$ Need garbage collection.

$\rightarrow$ Only supports one writer txn at a time or txns in a batch.
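The path-copying idea for a tree-structured page table can be sketched with nested tuples standing in for tree nodes. This is only an illustration of copying the path to an updated leaf; it is not LMDB's implementation, and `cow_set` is a hypothetical helper.

```python
def cow_set(node, path, value):
    """Return a new tree with the entry at `path` replaced.

    Only the nodes along `path` are copied; every untouched subtree is
    shared between the old and the new tree.
    """
    if not path:
        return value
    i = path[0]
    children = list(node)                        # copy just this one node
    children[i] = cow_set(node[i], path[1:], value)
    return tuple(children)


# A fixed-shape tree of depth 3 used as the master page table.
master = ((("p0", "p1"), ("p2", "p3")), (("p4", "p5"), ("p6", "p7")))

# Update the leaf holding "p2"; this builds the shadow version.
shadow = cow_set(master, (0, 1, 0), "p2-new")

shares_right_subtree = shadow[1] is master[1]        # untouched subtree is reused
shares_left_sibling = shadow[0][0] is master[0][0]   # sibling off the path is reused
```

The master tree is never modified, so readers holding the old root still see `"p2"` until the root pointer is switched to `shadow`.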

SQLITE (PRE-2010)

When a txn modifies a page, the DBMS copies the original page to a separate journal file before overwriting the master version.

$\rightarrow$ Called rollback mode.
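A rough sketch of the journaling step is given below. It models the database file and the journal file as dictionaries, so it says nothing about SQLite's actual file formats or APIs; the `JournaledDB` class is a hypothetical name, and treating commit as discarding the journal is a simplification.

```python
class JournaledDB:
    """Toy rollback journal: originals are saved before pages are overwritten."""

    def __init__(self, pages):
        self.pages = dict(pages)   # master version of each page
        self.journal = {}          # page id -> original contents

    def write(self, page_id, value):
        # Save the original only once, before the first overwrite.
        if page_id not in self.journal:
            self.journal[page_id] = self.pages[page_id]
        self.pages[page_id] = value

    def commit(self):
        # Discarding the journal makes the overwritten pages permanent.
        self.journal.clear()

    def rollback(self):
        # Also what recovery would do if a journal is found after a crash.
        self.pages.update(self.journal)
        self.journal.clear()


db = JournaledDB({1: "a", 2: "b"})
db.write(1, "x")
db.write(1, "y")      # the journal still holds the original "a"
db.rollback()         # page 1 is restored to "a"

db.write(2, "z")
db.commit()           # page 2 stays "z"
```

Unlike shadow paging, the master version is overwritten in place here, so this scheme relies on undo from the journal rather than on a root pointer swap.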