Carnegie Mellon University
Database Systems Two- Phase Locking
ADMINISTRIVIA
Project #3 is due Sunday March 30th @ 11
$\rightarrow$ Recitation: slides, recording
UPCOMING DATABASE TALKS
Monday @ 4
on
https://cmu.zoom.us/j/93441451665
LAST CLASS
Conflict Serializable
$\rightarrow$ Verify using either the "swapping" method or dependency graphs. $\rightarrow$ Any DBMS that says that they support "serializable" isolation does this.
View Serializable
$\rightarrow$ No efficient way to verify. $\rightarrow$ No DBMS that supports this.
OBSERVATION
We need a way to guarantee that all execution schedules are correct (i.e., serializable) without knowing the entire schedule ahead of time.
Solution: Use locks to protect database objects.
LOCKS VS. LATCHES
Table (html):
| Locks | Latches |
| Separate... | Transactions | Workers (threads, processes) |
| Protect... | Database Contents | In-Memory Data Structures |
| During... | Entire Transactions | Critical Sections |
| Modes... | Shared, Exclusive, Update, Intention | Read, Write |
| Deadlock | Detection & Resolution | Avoidance |
| ...by... | Waits-for, Timeout, Aborts | Coding Discipline |
| Kept in... | Lock Manager | Protected Data Structure |
LOCKS VS. LATCHES
Table (html):
| Locks | Latches |
| Separate... | Transactions | Workers (threads, processes) |
| Protect... | Database Contents | In-Memory Data Structures |
| During... | Entire Transactions | Critical Sections |
| Modes... | Shared, Exclusive, Update, Intention | Read, Write |
| Deadlock
...by... | Detection & Resolution | Avoidance |
| Waits-for, Timeout, Aborts | Coding Discipline |
| Kept in... | Lock Manager | Protected Data Structure |
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
TODAY'S AGENDA
Lock TypesTwo- Phase LockingDeadlock Detection + PreventionHierarchical Locking
BASIC LOCK TYPES
S- LOCK: Shared locks for reads. X- LOCK: Exclusive locks for writes.
Table (html):
| Shared S-LOCK | Exclusive X-LOCK |
| Shared S-LOCK | ✓ | × |
| Exclusive X-LOCK | × | × |
[TableCaption: Compatibility Matrix]
EXECUTING WITH LOCKS
Transactions request locks (or upgrades). Lock manager grants or blocks requests. Transactions release locks. Lock manager updates its internal lock- table. $\rightarrow$ It keeps track of what transactions hold what locks and what transactions are waiting to acquire any locks.
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
EXECUTING WITH LOCKS
Schedule
Lock Manager
Granted $(T_{1} \Rightarrow A)$
Released $(T_{1} \Rightarrow A)$
Granted $(T_{2} \Rightarrow A)$
Released $(T_{2} \Rightarrow A)$
Granted $(T_{1} \Rightarrow A)$
Released $(T_{1} \Rightarrow A)$
CONCURRENCY CONTROL PROTOCOL
Two- phase locking (2PL) is a concurrency control protocol that determines whether a txn can access an object in the database at runtime.
The protocol does not need to know all the queries that a txn will execute ahead of time.
TWO-PHASE LOCKING
Phase #1: Growing
$\rightarrow$ Each txn requests the locks that it needs from the DBMS's lock manager. $\rightarrow$ The lock manager grants/denies lock requests.
Phase #2: Shrinking
$\rightarrow$ The txn is allowed to only release/downgrade locks that it previously acquired. It cannot acquire new locks.
TWO-PHASE LOCKING
The txn is not allowed to acquire/upgrade locks after the growing phase finishes.
[ImageCaption: Transaction Lifetime]
TWO-PHASE LOCKING
The txn is not allowed to acquire/upgrade locks after the growing phase finishes.
[ImageCaption: Transaction Lifetime]
TWO-PHASE LOCKING
The txn is not allowed to acquire/upgrade locks after the growing phase finishes.
EXECUTING WITH 2PL
Schedule
Lock Manager
EXECUTING WITH 2PL
Schedule
Lock Manager
EXECUTING WITH 2PL
Schedule
Lock Manager
EXECUTING WITH 2PL
Schedule
Lock Manager
EXECUTING WITH 2PL
Schedule
Lock Manager
TWO-PHASE LOCKING
2PL on its own is sufficient to guarantee conflict serializability because it generates schedules whose precedence graph is acyclic.
But it is subject to cascading aborts.
2PL: CASCADING ABORTS
Schedule
2PL: CASCADING ABORTS
Schedule
2PL: CASCADING ABORTS
Schedule
2PL: CASCADING ABORTS
Schedule
This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
2PL: CASCADING ABORTS
Schedule
This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
Any information about $T_{1}$ cannot be "leaked" to the outside world.
2PL: CASCADING ABORTS
Schedule
This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
Any information about $T_{1}$ cannot be "leaked" to the outside world.
Any computation performed must be rolled back.
2PL: CASCADING ABORTS
Schedule
This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
Any information about $T_{1}$ cannot be "leaked" to the outside world.
Any computation performed must led back.
2PL OBSERVATIONS
There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
$\rightarrow$ Most DBMSs prefer correctness before performance.
May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention
2PL OBSERVATIONS
There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
$\rightarrow$ Most DBMSs prefer correctness before performance.
May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention
STRONG STRICT TWO-PHASE LOCKING
The txn is only allowed to release locks after it has ended (i.e., committed or aborted).
Allows only conflict serializable schedules, but it is often stronger than needed for some apps.
STRONG STRICT TWO-PHASE LOCKING
The txn is only allowed to release locks after it has ended (i.e., committed or aborted).
Allows only conflict serializable schedules, but it is often stronger than needed for some apps.
STRONG STRICT TWO-PHASE LOCKING
A schedule is strict if a value written by a txn is not read or overwritten by other txns until that txn finishes.
Advantages:
$\rightarrow$ Does not incur cascading aborts. $\rightarrow$ Aborted txns can be undone by just restoring original values of modified tuples.
EXAMPLES
T1 - Move $100 from Andy's account (A) to his bookie's account (B).
T2 - Compute the total amount in all accounts and return it to the application.
EXAMPLES
T1 - Move $100 from Andy's account (A) to his bookie's account (B).
T2 - Compute the total amount in all accounts and return it to the application.
NON-2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
NON-2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
NON-2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
NON-2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
T₂ Output
A+B=1900
2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
T₂ Output
A+B=2000
STRONG STRICT 2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
STRONG STRICT 2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
STRONG STRICT 2PL EXAMPLE
Schedule
Initial Database State
A=1000, B=1000
T Output
A+B=2000
UNIVERSE OF SCHEDULES
All Schedules
UNIVERSE OF SCHEDULES
All Schedules
UNIVERSE OF SCHEDULES
All Schedules
UNIVERSE OF SCHEDULES
All Schedules
View Serializable
Conflict Serializable
Serial
UNIVERSE OF SCHEDULES
UNIVERSE OF SCHEDULES
2PL OBSERVATIONS
There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
$\rightarrow$ Most DBMSs prefer correctness before performance.
May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention
2PL OBSERVATIONS
There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
$\rightarrow$ Most DBMSs prefer correctness before performance.
May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention
IT JUST GOT REAL
Schedule
A Lock Manager
IT JUST GOT REAL
IT JUST GOT REAL
IT JUST GOT REAL
IT JUST GOT REAL
IT JUST GOT REAL
2PL DEADLOCKS
A deadlock is a cycle of transactions waiting for locks to be released by each other.
Two ways of dealing with deadlocks: $\rightarrow$ Approach #1: Deadlock Detection $\rightarrow$ Approach #2: Deadlock Prevention
DEADLOCK DETECTION
The DBMS creates a waits- for graph to keep track of what locks each txn is waiting to acquire: $\rightarrow$ Nodes are transactions $\rightarrow$ Edge from $\mathsf{T}{\mathrm{i}}$ to $\mathsf{T}{\mathrm{j}}$ if $\mathsf{T}{\mathrm{i}}$ is waiting for $\mathsf{T}{\mathrm{j}}$ to release a lock.
The system periodically checks for cycles in waits- for graph and then decides how to break it.
DEADLOCK DETECTION
Schedule
Waits-For Graph
DEADLOCK DETECTION
Schedule
Waits-For Graph
DEADLOCK DETECTION
Schedule
Waits-For Graph
DEADLOCK DETECTION
Schedule
Waits-For Graph
DEADLOCK HANDLING
When the DBMS detects a deadlock, it will select a "victim" txn to rollback to break the cycle.
The victim txn will either restart or abort (more common) depending on how it was invoked.
There is a trade- off between the frequency of checking for deadlocks and how long txn's wait before deadlocks are broken.
DEADLOCK HANDLING: VICTIM SELECTION
Selecting the proper victim depends on a lot of different variables....
$\longrightarrow$ By age (lowest timestamp) $\longrightarrow$ By progress (least/most queries executed) $\longrightarrow$ By the # of items already locked $\longrightarrow$ By the # of txns that we have to rollback with it
DEADLOCK HANDLING: VICTIM SELECTION
Selecting the proper victim depends on a lot of different variables....
$\rightarrow$ By age (lowest timestamp) $\rightarrow$ By progress (least/most queries executed) $\rightarrow$ By the # of items already locked $\rightarrow$ By the # of txns that we have to rollback with it
We also should consider the # of times a txn has been restarted in the past to prevent starvation.
DEADLOCK HANDLING: ROLLBACK LENGTH
After selecting a victim txn to abort, the DBMS can also decide on how far to rollback the txn's changes.
Approach #1: Completely
$\rightarrow$ Rollback entire txn and tell the application it was aborted.
Approach #2: Partial (Savepoints)
$\rightarrow$ DBMS rolls back a portion of a txn (to break deadlock) and then attempts to re- execute the undone queries.
DEADLOCK PREVENTION
When a txn tries to acquire a lock that is held by another txn, the DBMS kills one of them to prevent a deadlock.
This approach does not require a waits- for graph or detection algorithm.
DEADLOCK PREVENTION
Assign priorities based on timestamps:
$\rightarrow$ Older Timestamp = Higher Priority (e.g., $\mathsf{T}_1 > \mathsf{T}_2$ )
Wait- Die ("Old Waits for Young")
$\rightarrow$ If requesting txn has higher priority than holding txn, then requesting txn waits for holding txn. $\rightarrow$ Otherwise requesting txn aborts.
Wound- Wait ("Young Waits for Old")
$\rightarrow$ If requesting txn has higher priority than holding txn, then holding txn aborts and releases lock. $\rightarrow$ Otherwise requesting txn waits.
DEADLOCK PREVENTION
DEADLOCK PREVENTION
DEADLOCK PREVENTION
DEADLOCK PREVENTION
DEADLOCK PREVENTION
DEADLOCK PREVENTION
Why do these schemes guarantee no deadlocks?
When a txn restarts, what is its (new) priority?
DEADLOCK PREVENTION
Why do these schemes guarantee no deadlocks?
Only one "type" of direction allowed when waiting for a lock.
When a txn restarts, what is its (new) priority?
DEADLOCK PREVENTION
Why do these schemes guarantee no deadlocks?
Only one "type" of direction allowed when waiting for a lock.
When a txn restarts, what is its (new) priority?
Its original timestamp to prevent it from getting starved for resources like an old man at a corrupt senior center.
OBSERVATION
All these examples have a one- to- one mapping from database objects to locks.
If a txn wants to update one billion tuples, then it must acquire one billion locks.
Acquiring locks is a more expensive operation than acquiring a latch even if that lock is available.
LOCK GRANULARITIES
When a txn wants to acquire a "lock", the DBMS can decide the granularity (i.e., scope) of that lock. $\rightarrow$ Attribute? Tuple? Page? Table?
The DBMS should ideally obtain fewest number of locks that a txn needs.
Trade- off between parallelism versus overhead. $\rightarrow$ Fewer Locks, Larger Granularity vs. More Locks, Smaller Granularity.
DATABASE LOCK HIERARCHY
DATABASE LOCK HIERARCHY
DATABASE LOCK HIERARCHY
DATABASE LOCK HIERARCHY
INTENTION LOCKS
An intention lock allows a higher- level node to be locked in shared or exclusive mode without having to check all descendent nodes.
If a node is locked in an intention mode, then some. txn is doing explicit locking at a lower level in the tree.
INTENTION LOCKS
Intention-Shared (IS)
$\rightarrow$ Indicates explicit locking at lower level with S locks. $\rightarrow$ Intent to get S lock(s) at finer granularity.
Intention-Exclusive (IX)
$\rightarrow$ Indicates explicit locking at lower level with X locks. $\rightarrow$ Intent to get X lock(s) at finer granularity.
Shared+Intention-Exclusive (SIX)
$\rightarrow$ The subtree rooted by that node is locked explicitly in S mode and explicit locking is being done at a lower level with X locks.
COMPATIBILITY MATRIX
Table (html):
| T2 Wants |
| IS | IX | S | SIX | X |
| IS | ✓ | ✓ | ✓ | ✓ | × |
| IX | ✓ | ✓ | × | × | × |
| S | ✓ | × | ✓ | × | × |
| SIX | ✓ | × | × | × | × |
| X | × | × | × | × | × |
LOCKING PROTOCOL
Each txn obtains appropriate lock at highest level of the database hierarchy.
To get S or IS lock on a node, the txn must hold at least IS on parent node.
To get X, IX, or SIX on a node, must hold at least IX on parent node.
EXAMPLE
T1 - Get the balance of Andy's off- shore bank account. T2 - Increase bookie's account balance by 1%. What locks should these txns obtain?
EXAMPLE
T1 - Get the balance of Andy's off- shore bank account.
T2 - Increase bookie's account balance by 1%.
What locks should these txns obtain?
$\rightarrow$ Exclusive + Shared for leaf nodes of lock tree. $\rightarrow$ Special Intention locks for higher levels.
EXAMPLE - TWO-LEVEL HIERARCHY
EXAMPLE - TWO-LEVEL HIERARCHY
Read Andy's record in R.
EXAMPLE - TWO-LEVEL HIERARCHY
Read Andy's record in R.
EXAMPLE - TWO-LEVEL HIERARCHY
Read Andy's record in R.
EXAMPLE - TWO-LEVEL HIERARCHY
Read Andy's record in R.
EXAMPLE - TWO-LEVEL HIERARCHY
Read Andy's record in R.
EXAMPLE - TWO-LEVEL HIERARCHY
EXAMPLE - TWO-LEVEL HIERARCHY
EXAMPLE - TWO-LEVEL HIERARCHY
EXAMPLE - THREE TXNS
Assume three txns execute at same time: $\rightarrow \mathsf{T}_1$ - Scan all tuples in R and update one tuple. $\rightarrow \mathsf{T}_2$ - Read a single tuple in R. $\rightarrow \mathsf{T}_3$ - Scan all tuples in R.
EXAMPLE - THREE TXNS
EXAMPLE - THREE TXNS
Scan all tuples in R and update one tuple.
EXAMPLE - THREE TXNS
Scan all tuples in R and update one tuple.
EXAMPLE - THREE TXNS
Scan all tuples in R and update one tuple.
EXAMPLE - THREE TXNS
Scan all tuples in R and update one tuple.
EXAMPLE - THREE TXNS
Scan all tuples in R and update one tuple.
EXAMPLE - THREE TXNS
EXAMPLE - THREE TXNS
EXAMPLE - THREE TXNS
EXAMPLE - THREE TXNS
Read a single tuple in R.
Table (html):
| IS | IX | S | SIX | X |
| IS | ✓ | ✓ | ✓ | ✓ | × |
| IX | ✓ | ✓ | × | × | × |
| S | ✓ | × | ✓ | × | × |
| SIX | ✓ | × | × | × | × |
| X | × | × | × | × | × |
EXAMPLE - THREE TXNS
EXAMPLE - THREE TXNS
[ImageCaption: Scan all tuples in R.]
EXAMPLE - THREE TXNS
[ImageCaption: Scan all tuples in R.]
EXAMPLE - THREE TXNS
[ImageCaption: Scan all tuples in R.]
EXAMPLE - THREE TXNS
Scan all tuples in R.
EXAMPLE - THREE TXNS
Scan all tuples in R.
LOCK ESCALATION
The DBMS can automatically switch to coarser- grained locks when a txn acquires too many low- level locks.
This reduces the number of requests that the lock manager must process.
LOCKING IN PRACTICE
Applications typically do not acquire a txn's locks manually (i.e., explicit SQL commands).
Sometimes you need to provide the DBMS with hints to help it to improve concurrency. $\rightarrow$ Update a tuple after reading it. $\rightarrow$ Skip any tuple that is locked.
Explicit locks are also useful when doing major changes to the database.
SELECT...FOR UPDATE
Perform a SELECT and then sets an exclusive lock on the matching tuples.
Can also set shared locks:
$\longrightarrow$ Postgres: FOR SHARE $\longrightarrow$ MySQL: LOCK IN SHARE MODE
Table (html):
| Requested Lock Mode | Current Lock Mode |
| FOR KEY SHARE | FOR SHARE | FOR NO KEY UPDATE | FOR UPDATE |
| FOR KEY SHARE | | | | X |
| FOR SHARE | | | X | X |
| FOR NO KEY UPDATE | | X | X | X |
| FOR UPDATE | X | X | X | X |
[TableCaption: Table 13.3.Conflicting Row-Level Locks]
SELECT * FROM
WHERE
FOR UPDATE;
SELECT...SKIP LOCKED
Perform a SELECT and automatically ignore any tuples that are already locked in an incompatible mode.
$\longrightarrow$ Useful for maintaining queues inside of a DBMS.
SELECT * FROM
WHERE
SKIP LOCKED;
CONCLUSION
2PL is used in almost every DBMS.
Automatically generates correct interleaving:
$\rightarrow$ Locks + protocol (2PL, SS2PL ...) $\rightarrow$ Deadlock detection + handling $\rightarrow$ Deadlock prevention
Many more things not discussed... $\rightarrow$ Nested Transactions $\rightarrow$ Savepoints
NEXT CLASS
Timestamp Ordering Concurrency ControlIsolation Levels