Carnegie Mellon University
Database Systems
Optimistic Concurrency Control
LAST CLASS
We discussed concurrency control protocols for generating conflict serializable schedules without needing to know what queries a txn will execute.
The two-phase locking (2PL) protocol requires txns to acquire locks on database objects before they are allowed to access them.
OBSERVATION
If you assume that conflicts between txns are rare and that most txns are short-lived, then forcing txns to acquire locks adds unnecessary overhead.
A better concurrency control protocol could be one that is optimized for the no-conflict case...
T/O CONCURRENCY CONTROL
Use timestamps to determine the serializability order of txns.
If $\mathsf{TS}(\mathsf{T}_i) < \mathsf{TS}(\mathsf{T}_j)$, then the DBMS must ensure that the execution schedule is equivalent to the serial schedule where $\mathsf{T}_i$ appears before $\mathsf{T}_j$.
Each database object (e.g., tuple) will include additional fields to keep track of timestamp(s) of the txns that last accessed/modified them.
TIMESTAMP ALLOCATION
Each txn $\mathsf{T}_i$ is assigned a unique fixed timestamp that is monotonically increasing.
$\rightarrow$ Let $\mathsf{TS}(\mathsf{T}_i)$ be the timestamp allocated to txn $\mathsf{T}_i$.
$\rightarrow$ Different schemes assign timestamps at different times during the txn.
Multiple implementation strategies:
$\rightarrow$ System/Wall Clock.
$\rightarrow$ Logical Counter.
$\rightarrow$ Hybrid.
TODAY'S AGENDA
Optimistic Concurrency Control
Phantom Reads
Isolation Levels
DB Flash Talk: StarTree
OPTIMISTIC CONCURRENCY CONTROL (OCC)
T/O protocol where the DBMS creates a private workspace for each txn.
$\rightarrow$ Any object read is copied into the workspace.
$\rightarrow$ Modifications are applied to the workspace.
When a txn commits, the DBMS compares the txn's workspace write set against other txns to detect conflicts.
If there are no conflicts, the write set is installed into the "global" database.
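As a rough sketch of this flow (illustrative Python, not any real system's code; the `Database`/`Txn` classes and the version-number check are assumptions that simplify the paper's set-intersection validation), a txn reads into a private workspace, buffers writes, and installs them only if validation passes:

```python
class Database:
    """Global database: values plus a write timestamp per object."""
    def __init__(self, data):
        self.data = dict(data)                   # object -> value
        self.version = {k: 0 for k in data}      # object -> last W-TS
        self.next_ts = 1

class Txn:
    """One txn with a private workspace for its reads and writes."""
    def __init__(self, db):
        self.db = db
        self.reads = {}     # object -> version observed at first read
        self.writes = {}    # object -> new value (private workspace)

    def read(self, key):
        if key in self.writes:                   # read-your-own-writes
            return self.writes[key]
        self.reads.setdefault(key, self.db.version[key])
        return self.db.data[key]

    def write(self, key, value):
        self.writes[key] = value                 # buffered, invisible to others

    def commit(self):
        # Validation: abort if anything we read was overwritten since.
        for key, seen in self.reads.items():
            if self.db.version[key] != seen:
                return False                     # conflict -> abort
        # Write phase: install the workspace into the global database.
        ts = self.db.next_ts
        self.db.next_ts += 1
        for key, value in self.writes.items():
            self.db.data[key] = value
            self.db.version[key] = ts
        return True

db = Database({"A": 100})
t1, t2 = Txn(db), Txn(db)
t1.read("A"); t2.read("A")
t1.write("A", 150); t2.write("A", 175)
print(t1.commit())   # True: nothing T1 read has changed
print(t2.commit())   # False: T1's install bumped A's version
```

Note how the loser (T2) did all of its work before discovering the conflict, which previews the abort-cost discussion later in the lecture.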
On Optimistic Methods for Concurrency Control
H.T. KUNG and JOHN T. ROBINSON
Carnegie-Mellon University
Most current approaches to concurrency control in database systems rely on locking of data objects as a control mechanism. In this paper, two families of nonlocking concurrency controls are presented. The methods used are "optimistic" in the sense that they rely mainly on transaction backup as a control mechanism, "hoping" that conflicts between transactions will not occur. Applications for which these methods should be more efficient than locking are discussed.
Key Words and Phrases: databases, concurrency controls, transaction processing CR Categories: 4.32, 4.33
1. INTRODUCTION
Consider the problem of providing shared access to a database organized as a collection of objects. We assume that certain distinguished objects, called the roots, are always present and access to any object other than a root is gained only by first accessing a root and then following pointers to that object. Any sequence of accesses to the database that preserves the integrity constraints of the data is called a transaction (see, e.g., [4]).
If our goal is to maximize the throughput of accesses to the database, then there are at least two cases where highly concurrent access is desirable.
(1) The amount of data is sufficiently great that at any given time only a fraction of the database can be present in primary memory, so that it is necessary to swap parts of the database from secondary memory as needed.
(2) Even if the entire database can be present in primary memory, there may be multiple processors.
In both cases the hardware will be underutilized if the degree of concurrency is too low.
However, as is well known, unrestricted concurrent access to a shared database will, in general, cause the integrity of the database to be lost. Most current...
OCC PHASES
Phase #1 - Read
$\rightarrow$ Track the read/write sets of txns and store their writes in a private workspace.
$\rightarrow$ The DBMS copies every tuple that the txn accesses from the shared database to its workspace to ensure repeatable reads.
Phase #2 - Validation
$\rightarrow$ Assign the txn a unique timestamp (TS) and then check whether it conflicts with other txns.
Phase #3 - Write
$\rightarrow$ If validation succeeds, set the write timestamp (W-TS) on all modified objects in the private workspace and install them into the global database. Otherwise, abort the txn.
OCC EXAMPLE
[Figure: step-by-step schedule of two txns showing the database state and each txn's private workspace; the table contents did not survive extraction.]
OCC: READ PHASE
Track the read/write sets of txns and store their writes in a private workspace.
The DBMS copies every tuple that the txn accesses from the shared database to its workspace to ensure repeatable reads.
$\rightarrow$ We can ignore for now what happens if a txn reads/writes tuples via indexes.
OCC: VALIDATION PHASE
When txn $\mathsf{T}_i$ invokes COMMIT, the DBMS checks whether it conflicts with other txns.
$\rightarrow$ The original OCC algorithm uses serial validation.
$\rightarrow$ Parallel validation requires each txn to check the read/write sets of the other txns that are trying to validate at the same time.
The DBMS needs to guarantee that only serializable schedules are permitted.
$\rightarrow$ Approach #1: Backward Validation
$\rightarrow$ Approach #2: Forward Validation
OCC: VALIDATION PHASE
Forward Validation: Check whether the committing txn's read/write sets intersect with those of any active txns that have not yet committed.
Backward Validation: Check whether the committing txn's read/write sets intersect with those of any txns that have already committed.
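The difference between the two directions can be sketched as simple set checks (illustrative Python; the `reads`/`writes` dictionaries are assumed structures, not a real DBMS API):

```python
def forward_validate(committing, active):
    """Fail if our writes touch what a still-running txn has read."""
    return all(committing["writes"].isdisjoint(t["reads"]) for t in active)

def backward_validate(committing, committed):
    """Fail if an already-committed txn's writes overlap what we read."""
    return all(committing["reads"].isdisjoint(t["writes"]) for t in committed)

t_commit = {"reads": {"A", "B"}, "writes": {"B"}}
t_active = [{"reads": {"B"}, "writes": set()}]    # still reading B
t_done   = [{"reads": set(), "writes": {"C"}}]    # already installed C

print(forward_validate(t_commit, t_active))   # False: both touch B
print(backward_validate(t_commit, t_done))    # True: {A, B} vs {C} disjoint
```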
OCC: FORWARD VALIDATION
Each txn's timestamp is assigned at the beginning of the validation phase.
Check the timestamp ordering of the committing txn with all other active txns.
If $\mathsf{TS}(\mathsf{T}_1)< \mathsf{TS}(\mathsf{T}_2)$ , then one of the following three conditions must hold...
OCC: FORWARD VALIDATION CASE #1
Schedule
If $\mathsf{TS}(\mathsf{T}_1) < \mathsf{TS}(\mathsf{T}_2)$, check whether $\mathsf{T}_1$ completes its Write phase before $\mathsf{T}_2$ begins its Read phase.
No conflict, as all of $\mathsf{T}_1$'s actions happen before $\mathsf{T}_2$'s.
$\rightarrow$ This just means that there is a serial ordering.
OCC: FORWARD VALIDATION CASE #2
If $\mathsf{TS}(\mathsf{T}_1) < \mathsf{TS}(\mathsf{T}_2)$, check whether $\mathsf{T}_1$ completes its Write phase before $\mathsf{T}_2$ starts its Write phase and $\mathsf{T}_1$ does not modify any object read by $\mathsf{T}_2$.
$\rightarrow$ $\mathsf{WriteSet}(\mathsf{T}_1) \cap \mathsf{ReadSet}(\mathsf{T}_2) = \emptyset$
OCC: FORWARD VALIDATION CASE #2
[Figure: step-by-step schedule with database and workspace tables illustrating Case #2; the table contents did not survive extraction.]
OCC: FORWARD VALIDATION CASE #3
If $\mathsf{TS}(\mathsf{T}_1) < \mathsf{TS}(\mathsf{T}_2)$, check whether $\mathsf{T}_1$ completes its Read phase before $\mathsf{T}_2$ completes its Read phase and $\mathsf{T}_1$ does not modify any object either read or written by $\mathsf{T}_2$.
$\rightarrow$ $\mathsf{WriteSet}(\mathsf{T}_1) \cap \mathsf{ReadSet}(\mathsf{T}_2) = \emptyset$
$\rightarrow$ $\mathsf{WriteSet}(\mathsf{T}_1) \cap \mathsf{WriteSet}(\mathsf{T}_2) = \emptyset$
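The three conditions together can be sketched as one check (illustrative Python; the phase-boundary fields and read/write sets are assumptions made for the example, not a real system's representation):

```python
def validate(t1, t2):
    """True if TS(T1) < TS(T2) is serializable under one of the 3 cases."""
    # Condition 1: T1 finishes its Write phase before T2 starts reading.
    if t1["write_end"] < t2["read_start"]:
        return True
    # Condition 2: T1 finishes writing before T2 starts its Write phase,
    # and T1 writes nothing that T2 reads.
    if (t1["write_end"] < t2["write_start"]
            and t1["writes"].isdisjoint(t2["reads"])):
        return True
    # Condition 3: T1 finishes its Read phase before T2 does, and T1
    # writes nothing that T2 reads or writes.
    if (t1["read_end"] < t2["read_end"]
            and t1["writes"].isdisjoint(t2["reads"])
            and t1["writes"].isdisjoint(t2["writes"])):
        return True
    return False

t1 = {"read_end": 2, "write_start": 3, "write_end": 4,
      "reads": {"A"}, "writes": {"A"}}
t2 = {"read_start": 1, "read_end": 5, "write_start": 6, "write_end": 7,
      "reads": {"B"}, "writes": {"B"}}
print(validate(t1, t2))  # True via Condition 2: T1 writes first, sets disjoint
```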
OCC: FORWARD VALIDATION CASE #3
Schedule
Database

| Object | Value | W-TS |
|--------|-------|------|
| A      | 123   | 0    |
| B      | XYZ   | 0    |
Database (after the validated txn's Write phase installs its update to A):

| Object | Value | W-TS |
|--------|-------|------|
| A      | 456   | 1    |
| B      | XYZ   | 0    |
OCC: FORWARD VALIDATION
Check whether the committing txn intersects its read/write sets with any active txns that have not yet committed.
OCC: BACKWARD VALIDATION
Check whether the committing txn intersects its read/write sets with those of any txns that have already committed.
OCC: WRITE PHASE
Propagate the changes in the txn's write set to the database to make them visible to other txns.
Serial Commits:
$\rightarrow$ Use a global latch so that only a single txn is in the Validation/Write phases at a time.
Parallel Commits:
$\rightarrow$ Use fine-grained write latches to support parallel Validation/Write phases.
$\rightarrow$ Txns acquire latches in sequential key order to avoid deadlocks.
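The latch-ordering idea can be sketched as follows (illustrative Python; the `committed` store and per-key `latches` table are assumed structures). Because every committing txn acquires its latches in the same global key order, no cycle of waiting txns can form:

```python
import threading

committed = {"A": 1, "B": 1, "C": 1}
latches = {k: threading.Lock() for k in committed}

def install(writes):
    """Install a txn's write set under fine-grained per-key latches."""
    keys = sorted(writes)                 # global total order on keys
    for k in keys:                        # acquire in that order: no cycles
        latches[k].acquire()
    try:
        for k in keys:
            committed[k] = writes[k]      # write phase under latches
    finally:
        for k in reversed(keys):
            latches[k].release()

# T1 writes {A, B}, T2 writes {B, C}: both latch B only after any
# earlier-ordered keys, so they can never deadlock on each other.
t1 = threading.Thread(target=install, args=({"B": 2, "A": 2},))
t2 = threading.Thread(target=install, args=({"C": 3, "B": 3},))
t1.start(); t2.start(); t1.join(); t2.join()
print(committed["A"], committed["C"])  # 2 3 (B is 2 or 3, depending on order)
```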
OCC: OBSERVATIONS
OCC works well when the # of conflicts is low:
$\rightarrow$ All txns are read-only (ideal).
$\rightarrow$ Txns access disjoint subsets of data.
But OCC has its own problems:
$\rightarrow$ High overhead for copying data locally.
$\rightarrow$ The Validation/Write phase is a bottleneck.
$\rightarrow$ Aborts are more wasteful than in 2PL because they occur only after a txn has already executed.
OBSERVATION
We have only dealt with transactions that read and update existing objects in the database.
But now if txns perform insertions, updates, and deletions, we have new problems...
THE PHANTOM PROBLEM
Schedule
CREATE TABLE people (
  id SERIAL,
  name VARCHAR,
  status VARCHAR
);
OOPS?
How did this happen?
$\rightarrow$ Because $\mathsf{T}_1$ locked only existing records and not ones that other txns are adding to the database!
Conflict serializability on reads and writes of individual items guarantees serializability only if the set of objects is fixed.
This is known as a phantom read.
$\rightarrow$ A txn scans a range more than once and another txn inserts/removes tuples that fall within that range in between the scans.
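A toy simulation of the anomaly (plain Python standing in for a DBMS; the table and scan are illustrative, not real SQL execution):

```python
# T1 scans a range twice while T2 inserts a matching row in between,
# so T1's two "repeatable" scans disagree: a phantom read.
table = [{"id": 1, "status": "lit"}, {"id": 2, "status": "dead"}]

def scan_lit():
    """T1's range query: count rows with status = 'lit'."""
    return sum(1 for row in table if row["status"] == "lit")

first = scan_lit()                        # T1: first scan sees 1 row
table.append({"id": 3, "status": "lit"})  # T2: insert inside T1's range
second = scan_lit()                       # T1: second scan sees 2 rows
print(first, second)  # 1 2
```

Tuple-level locks on the two existing rows would not have prevented this, because the inserted row did not exist when T1 took its locks.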
SOLUTIONS TO THE PHANTOM PROBLEM
Approach #1: Lock Everything! (less common)
$\rightarrow$ Lock the entire table or every page.
Approach #2: Re-Execute Scans (rare)
$\rightarrow$ Run queries again at commit to see whether they produce a different result, identifying missed changes.
Approach #3: Predicate Locking (very rare)
$\rightarrow$ Logically determine the overlap of predicates before queries start running.
Approach #4: Index Locking (common)
$\rightarrow$ Use keys in indexes to protect ranges.
RE-EXECUTE SCANS
The DBMS tracks the WHERE clause for all queries that the txn executes.
$\rightarrow$ Retain the scan set for every range query in a txn.
Upon commit, re-execute just the scan portion of each query and check whether it generates the same result.
$\rightarrow$ Example: Run the scan for an UPDATE query but do not modify the matching tuples.
Silo (from MIT)
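A minimal sketch of this check (illustrative Python; the `scan_set` structure and predicate representation are assumptions, not Silo's actual implementation):

```python
table = [{"id": 1, "status": "lit"}, {"id": 2, "status": "dead"}]
scan_set = []   # (predicate, ids seen) recorded during the txn's reads

def scan(pred):
    """Run a range scan and remember its predicate and result set."""
    ids = sorted(r["id"] for r in table if pred(r))
    scan_set.append((pred, ids))
    return ids

scan(lambda r: r["status"] == "lit")          # txn's original range query

table.append({"id": 3, "status": "lit"})      # another txn inserts and commits

def validate():
    """At commit: rerun each recorded scan and compare result sets."""
    return all(sorted(r["id"] for r in table if pred(r)) == ids
               for pred, ids in scan_set)

print(validate())  # False -> a phantom appeared, so the txn must abort
```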
PREDICATE LOCKING
Proposed locking scheme from System R.
$\rightarrow$ Shared lock on the predicate in the WHERE clause of a SELECT query.
$\rightarrow$ Exclusive lock on the predicate in the WHERE clause of any UPDATE, INSERT, or DELETE query.
This is difficult to implement efficiently. Some systems approximate it via precision locking.
Examples: HyPer, DuckDB, Umbra, CedarDB
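Predicate overlap can be illustrated as a satisfiability check over a small domain (illustrative Python only; real systems cannot enumerate domains like this, which is exactly why predicate locking is hard and gets approximated by precision locking):

```python
def overlaps(pred_a, pred_b, domain):
    """Two predicate locks conflict if some value satisfies both."""
    return any(pred_a(v) and pred_b(v) for v in domain)

def select_pred(status):
    # T1's shared predicate lock: WHERE status = 'lit'
    return status == "lit"

def insert_pred(status):
    # T2's exclusive predicate lock: INSERT of a row with status = 'lit'
    return status == "lit"

# The INSERT's row falls under the SELECT's predicate -> conflict.
print(overlaps(select_pred, insert_pred, ["lit", "dead"]))  # True
```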
PREDICATE LOCKING
SELECT COUNT(*) AS cnt FROM people WHERE status='lit'
INSERT INTO people VALUES (101, 'Andy', 'lit')
Records in Table "people"
INDEX LOCKING SCHEMES
Key-Value Locks
Gap Locks
Key-Range Locks
Hierarchical Locking
KEY-VALUE LOCKS
Locks that cover a single key-value in an index. Need "virtual keys" for non-existent values.
B+Tree Leaf Node
GAP LOCKS
Each txn acquires a key-value lock on the single key that it wants to access. Then it gets a gap lock on the next key gap.
B+Tree Leaf Node
KEY-RANGE LOCKS
Locks that cover a key value and the gap to the next key value in a single index.
$\longrightarrow$ Need "virtual keys" for artificial values (e.g., infinity).
B+Tree Leaf Node
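A sketch of how a probed key maps to the existing key whose range covers it (illustrative Python using `bisect`; the leaf keys and lock table are assumptions, not a real B+Tree implementation):

```python
import bisect

leaf_keys = [10, 12, 14]          # keys present in the B+Tree leaf node
range_locks = set()               # start keys of held key-range locks

def covering_key(k):
    """Existing key whose range [key, next_key) contains k.

    Returns a -infinity sentinel for keys before the first entry."""
    i = bisect.bisect_right(leaf_keys, k) - 1
    return leaf_keys[i] if i >= 0 else float("-inf")

def lock_range(k):
    """Lock the key range that covers k (including the gap after it)."""
    range_locks.add(covering_key(k))

lock_range(12)                          # protects [12, 14)
print(covering_key(13))                 # 12: key 13 falls in 12's gap
print(covering_key(13) in range_locks)  # True: inserting 13 must wait
```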
HIERARCHICAL LOCKING
Allow a txn to hold wider key-range locks with different locking modes.
$\rightarrow$ Reduces the number of visits to the lock manager.
B+Tree Leaf Node
WEAKER LEVELS OF ISOLATION
Serializable is useful because it allows programmers to ignore concurrency issues.
But enforcing it may allow too little concurrency and limit performance.
We may want to use a weaker level of consistency to improve scalability.
ISOLATION LEVELS
Controls the extent that a txn is exposed to the actions of other concurrent txns.
Provides for greater concurrency at the cost of exposing txn to uncommitted changes:
$\rightarrow$ Dirty Reads
$\rightarrow$ Unrepeatable Reads
$\rightarrow$ Lost Updates
$\rightarrow$ Phantom Reads
ISOLATION LEVELS
SERIALIZABLE: No phantoms, all reads repeatable, no dirty reads.
REPEATABLE READS: Phantoms may happen.
READ COMMITTED: Phantoms, unrepeatable reads, and lost updates may happen.
READ UNCOMMITTED: All anomalies may happen.
ISOLATION LEVELS
| Level | Dirty Read | Unrepeatable Read | Lost Updates | Phantom |
|---|---|---|---|---|
| SERIALIZABLE | No | No | No | No |
| REPEATABLE READ | No | No | No | Maybe |
| READ COMMITTED | No | Maybe | Maybe | Maybe |
| READ UNCOMMITTED | Maybe | Maybe | Maybe | Maybe |
ISOLATION LEVELS
SERIALIZABLE: Strong Strict 2PL with phantom protection (e.g., index locks).
REPEATABLE READS: Same as above, but without phantom protection.
READ COMMITTED: Same as above, but S locks are released immediately.
READ UNCOMMITTED: Same as above but allows dirty reads (no S locks).
SQL-92 ISOLATION LEVELS
The application can set a txn's isolation level before it executes any queries in that txn.
Not all DBMSs support all isolation levels in all execution scenarios.
$\longrightarrow$ Example: replicated environments.
The default depends on implementation...
SET TRANSACTION ISOLATION LEVEL <isolation-level>;
BEGIN TRANSACTION ISOLATION LEVEL <isolation-level>;
ISOLATION LEVELS
| DBMS | Default | Maximum |
|---|---|---|
| Actian Ingres | SERIALIZABLE | SERIALIZABLE |
| IBM DB2 | CURSOR STABILITY | SERIALIZABLE |
| CockroachDB | SERIALIZABLE | SERIALIZABLE |
| Google Spanner | STRICT SERIALIZABLE | STRICT SERIALIZABLE |
| MSFT SQL Server | READ COMMITTED | SERIALIZABLE |
| MySQL | REPEATABLE READ | SERIALIZABLE |
| Oracle | READ COMMITTED | SNAPSHOT ISOLATION |
| PostgreSQL | READ COMMITTED | SERIALIZABLE |
| SAP HANA | READ COMMITTED | SERIALIZABLE |
| VoltDB | SERIALIZABLE | SERIALIZABLE |
| YugaByte | SNAPSHOT ISOLATION | SERIALIZABLE |
DATABASE ADMIN SURVEY
What isolation level do transactions execute at on this DBMS?
CONCLUSION
Every concurrency control protocol can be broken down into the basic concepts that have been described in the last two lectures.
$\longrightarrow$ Pessimistic: Locking
$\longrightarrow$ Optimistic: Timestamps
There is no one protocol that is always better than all others...
NEXT CLASS
Multi-Version Concurrency Control