GithubDocuments
    1. Marketplace
    2. CMU Database - 15445 - 2025 Spring
    Collections
    CMU Database - 15445 - 2025 Spring
    cmu_database
    This collection of documents, "CMU Database - 15445 - 2025 Spring," provides a comprehensive overview of database systems, primarily focusing on the design, implementation, and ...
    DocumentsKnowledge Graph
    lecture-17-slides.pdf
    Carnegie Mellon University
    Database Systems Two- Phase Locking

    ADMINISTRIVIA

    Project #3 is due Sunday March 30th @ 11
    $\rightarrow$ Recitation: slides, recording

    UPCOMING DATABASE TALKS

    Monday @ 4
    on https://cmu.zoom.us/j/93441451665

    LAST CLASS

    Conflict Serializable

    $\rightarrow$ Verify using either the "swapping" method or dependency graphs. $\rightarrow$ Any DBMS that says that they support "serializable" isolation does this.

    View Serializable

    $\rightarrow$ No efficient way to verify. $\rightarrow$ No DBMS that supports this.

    OBSERVATION

    We need a way to guarantee that all execution schedules are correct (i.e., serializable) without knowing the entire schedule ahead of time.
    Solution: Use locks to protect database objects.

    LOCKS VS. LATCHES

    Table (html):
    LocksLatches
    Separate...TransactionsWorkers (threads, processes)
    Protect...Database ContentsIn-Memory Data Structures
    During...Entire TransactionsCritical Sections
    Modes...Shared, Exclusive, Update, IntentionRead, Write
    DeadlockDetection & ResolutionAvoidance
    ...by...Waits-for, Timeout, AbortsCoding Discipline
    Kept in...Lock ManagerProtected Data Structure

    LOCKS VS. LATCHES

    Table (html):
    LocksLatches
    Separate...TransactionsWorkers (threads, processes)
    Protect...Database ContentsIn-Memory Data Structures
    During...Entire TransactionsCritical Sections
    Modes...Shared, Exclusive, Update, IntentionRead, Write
    Deadlock ...by...Detection & ResolutionAvoidance
    Waits-for, Timeout, AbortsCoding Discipline
    Kept in...Lock ManagerProtected Data Structure

    EXECUTING WITH LOCKS

    Schedule

    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    TODAY'S AGENDA

    Lock TypesTwo- Phase LockingDeadlock Detection + PreventionHierarchical Locking

    BASIC LOCK TYPES

    S- LOCK: Shared locks for reads. X- LOCK: Exclusive locks for writes.
    Table (html):
    Shared S-LOCKExclusive X-LOCK
    Shared S-LOCK✓×
    Exclusive X-LOCK××
    [TableCaption: Compatibility Matrix]

    EXECUTING WITH LOCKS

    Transactions request locks (or upgrades). Lock manager grants or blocks requests. Transactions release locks. Lock manager updates its internal lock- table. $\rightarrow$ It keeps track of what transactions hold what locks and what transactions are waiting to acquire any locks.

    EXECUTING WITH LOCKS

    Schedule

    Lock Manager

    EXECUTING WITH LOCKS

    Schedule

    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule
    Lock Manager

    EXECUTING WITH LOCKS

    Schedule

    Lock Manager

    Granted $(T_{1} \Rightarrow A)$
    Released $(T_{1} \Rightarrow A)$
    Granted $(T_{2} \Rightarrow A)$
    Released $(T_{2} \Rightarrow A)$
    Granted $(T_{1} \Rightarrow A)$
    Released $(T_{1} \Rightarrow A)$

    CONCURRENCY CONTROL PROTOCOL

    Two- phase locking (2PL) is a concurrency control protocol that determines whether a txn can access an object in the database at runtime.
    The protocol does not need to know all the queries that a txn will execute ahead of time.

    TWO-PHASE LOCKING

    Phase #1: Growing

    $\rightarrow$ Each txn requests the locks that it needs from the DBMS's lock manager. $\rightarrow$ The lock manager grants/denies lock requests.

    Phase #2: Shrinking

    $\rightarrow$ The txn is allowed to only release/downgrade locks that it previously acquired. It cannot acquire new locks.

    TWO-PHASE LOCKING

    The txn is not allowed to acquire/upgrade locks after the growing phase finishes.
    [ImageCaption: Transaction Lifetime]

    TWO-PHASE LOCKING

    The txn is not allowed to acquire/upgrade locks after the growing phase finishes.
    [ImageCaption: Transaction Lifetime]

    TWO-PHASE LOCKING

    The txn is not allowed to acquire/upgrade locks after the growing phase finishes.

    EXECUTING WITH 2PL

    Schedule

    Lock Manager

    EXECUTING WITH 2PL

    Schedule
    Lock Manager

    EXECUTING WITH 2PL

    Schedule
    Lock Manager

    EXECUTING WITH 2PL

    Schedule
    Lock Manager

    EXECUTING WITH 2PL

    Schedule
    Lock Manager

    TWO-PHASE LOCKING

    2PL on its own is sufficient to guarantee conflict serializability because it generates schedules whose precedence graph is acyclic.
    But it is subject to cascading aborts.

    2PL: CASCADING ABORTS

    Schedule

    2PL: CASCADING ABORTS

    Schedule

    2PL: CASCADING ABORTS

    Schedule

    2PL: CASCADING ABORTS

    Schedule

    This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.

    2PL: CASCADING ABORTS

    Schedule

    This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
    Any information about $T_{1}$ cannot be "leaked" to the outside world.

    2PL: CASCADING ABORTS

    Schedule

    This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
    Any information about $T_{1}$ cannot be "leaked" to the outside world.
    Any computation performed must be rolled back.

    2PL: CASCADING ABORTS

    Schedule

    This is a permissible schedule in 2PL, but the DBMS has to also abort $T_{2}$ when $T_{1}$ aborts.
    Any information about $T_{1}$ cannot be "leaked" to the outside world.
    Any computation performed must led back.

    2PL OBSERVATIONS

    There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
    $\rightarrow$ Most DBMSs prefer correctness before performance.
    May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
    May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention

    2PL OBSERVATIONS

    There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
    $\rightarrow$ Most DBMSs prefer correctness before performance.
    May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
    May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention

    STRONG STRICT TWO-PHASE LOCKING

    The txn is only allowed to release locks after it has ended (i.e., committed or aborted).
    Allows only conflict serializable schedules, but it is often stronger than needed for some apps.

    STRONG STRICT TWO-PHASE LOCKING

    The txn is only allowed to release locks after it has ended (i.e., committed or aborted).
    Allows only conflict serializable schedules, but it is often stronger than needed for some apps.

    STRONG STRICT TWO-PHASE LOCKING

    A schedule is strict if a value written by a txn is not read or overwritten by other txns until that txn finishes.
    Advantages:
    $\rightarrow$ Does not incur cascading aborts. $\rightarrow$ Aborted txns can be undone by just restoring original values of modified tuples.

    EXAMPLES

    T1 - Move $100 from Andy's account (A) to his bookie's account (B).
    T2 - Compute the total amount in all accounts and return it to the application.

    EXAMPLES

    T1 - Move $100 from Andy's account (A) to his bookie's account (B).
    T2 - Compute the total amount in all accounts and return it to the application.

    NON-2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    NON-2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    NON-2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    NON-2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000
    T₂ Output
    A+B=1900

    2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000
    T₂ Output
    A+B=2000

    STRONG STRICT 2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    STRONG STRICT 2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000

    STRONG STRICT 2PL EXAMPLE

    Schedule

    Initial Database State

    A=1000, B=1000
    T Output
    A+B=2000

    UNIVERSE OF SCHEDULES

    All Schedules

    UNIVERSE OF SCHEDULES

    All Schedules

    UNIVERSE OF SCHEDULES

    All Schedules

    UNIVERSE OF SCHEDULES

    All Schedules
    View Serializable
    Conflict Serializable
    Serial

    UNIVERSE OF SCHEDULES

    UNIVERSE OF SCHEDULES

    2PL OBSERVATIONS

    There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
    $\rightarrow$ Most DBMSs prefer correctness before performance.
    May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
    May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention

    2PL OBSERVATIONS

    There are potential schedules that are serializable but would not be allowed by 2PL because locking limits concurrency.
    $\rightarrow$ Most DBMSs prefer correctness before performance.
    May still have "dirty reads". $\rightarrow$ Solution: Strong Strict 2PL (aka Rigorous 2PL)
    May lead to deadlocks. $\rightarrow$ Solution: Detection or Prevention

    IT JUST GOT REAL

    Schedule

    A Lock Manager

    IT JUST GOT REAL

    IT JUST GOT REAL

    IT JUST GOT REAL

    IT JUST GOT REAL

    IT JUST GOT REAL

    2PL DEADLOCKS

    A deadlock is a cycle of transactions waiting for locks to be released by each other.
    Two ways of dealing with deadlocks: $\rightarrow$ Approach #1: Deadlock Detection $\rightarrow$ Approach #2: Deadlock Prevention

    DEADLOCK DETECTION

    The DBMS creates a waits- for graph to keep track of what locks each txn is waiting to acquire: $\rightarrow$ Nodes are transactions $\rightarrow$ Edge from $\mathsf{T}{\mathrm{i}}$ to $\mathsf{T}{\mathrm{j}}$ if $\mathsf{T}{\mathrm{i}}$ is waiting for $\mathsf{T}{\mathrm{j}}$ to release a lock.
    The system periodically checks for cycles in waits- for graph and then decides how to break it.

    DEADLOCK DETECTION

    Schedule

    Waits-For Graph

    DEADLOCK DETECTION

    Schedule

    Waits-For Graph

    DEADLOCK DETECTION

    Schedule

    Waits-For Graph

    DEADLOCK DETECTION

    Schedule

    Waits-For Graph

    DEADLOCK HANDLING

    When the DBMS detects a deadlock, it will select a "victim" txn to rollback to break the cycle.
    The victim txn will either restart or abort (more common) depending on how it was invoked.
    There is a trade- off between the frequency of checking for deadlocks and how long txn's wait before deadlocks are broken.

    DEADLOCK HANDLING: VICTIM SELECTION

    Selecting the proper victim depends on a lot of different variables....
    $\longrightarrow$ By age (lowest timestamp) $\longrightarrow$ By progress (least/most queries executed) $\longrightarrow$ By the # of items already locked $\longrightarrow$ By the # of txns that we have to rollback with it

    DEADLOCK HANDLING: VICTIM SELECTION

    Selecting the proper victim depends on a lot of different variables....
    $\rightarrow$ By age (lowest timestamp) $\rightarrow$ By progress (least/most queries executed) $\rightarrow$ By the # of items already locked $\rightarrow$ By the # of txns that we have to rollback with it
    We also should consider the # of times a txn has been restarted in the past to prevent starvation.

    DEADLOCK HANDLING: ROLLBACK LENGTH

    After selecting a victim txn to abort, the DBMS can also decide on how far to rollback the txn's changes.
    Approach #1: Completely
    $\rightarrow$ Rollback entire txn and tell the application it was aborted.
    Approach #2: Partial (Savepoints)
    $\rightarrow$ DBMS rolls back a portion of a txn (to break deadlock) and then attempts to re- execute the undone queries.

    DEADLOCK PREVENTION

    When a txn tries to acquire a lock that is held by another txn, the DBMS kills one of them to prevent a deadlock.
    This approach does not require a waits- for graph or detection algorithm.

    DEADLOCK PREVENTION

    Assign priorities based on timestamps:
    $\rightarrow$ Older Timestamp = Higher Priority (e.g., $\mathsf{T}_1 > \mathsf{T}_2$ )
    Wait- Die ("Old Waits for Young")
    $\rightarrow$ If requesting txn has higher priority than holding txn, then requesting txn waits for holding txn. $\rightarrow$ Otherwise requesting txn aborts.
    Wound- Wait ("Young Waits for Old")
    $\rightarrow$ If requesting txn has higher priority than holding txn, then holding txn aborts and releases lock. $\rightarrow$ Otherwise requesting txn waits.

    DEADLOCK PREVENTION

    DEADLOCK PREVENTION

    DEADLOCK PREVENTION

    DEADLOCK PREVENTION

    DEADLOCK PREVENTION

    DEADLOCK PREVENTION

    Why do these schemes guarantee no deadlocks?
    When a txn restarts, what is its (new) priority?

    DEADLOCK PREVENTION

    Why do these schemes guarantee no deadlocks?
    Only one "type" of direction allowed when waiting for a lock.
    When a txn restarts, what is its (new) priority?

    DEADLOCK PREVENTION

    Why do these schemes guarantee no deadlocks?
    Only one "type" of direction allowed when waiting for a lock.
    When a txn restarts, what is its (new) priority?
    Its original timestamp to prevent it from getting starved for resources like an old man at a corrupt senior center.

    OBSERVATION

    All these examples have a one- to- one mapping from database objects to locks.
    If a txn wants to update one billion tuples, then it must acquire one billion locks.
    Acquiring locks is a more expensive operation than acquiring a latch even if that lock is available.

    LOCK GRANULARITIES

    When a txn wants to acquire a "lock", the DBMS can decide the granularity (i.e., scope) of that lock. $\rightarrow$ Attribute? Tuple? Page? Table?
    The DBMS should ideally obtain fewest number of locks that a txn needs.
    Trade- off between parallelism versus overhead. $\rightarrow$ Fewer Locks, Larger Granularity vs. More Locks, Smaller Granularity.

    DATABASE LOCK HIERARCHY

    DATABASE LOCK HIERARCHY

    DATABASE LOCK HIERARCHY

    DATABASE LOCK HIERARCHY

    INTENTION LOCKS

    An intention lock allows a higher- level node to be locked in shared or exclusive mode without having to check all descendent nodes.
    If a node is locked in an intention mode, then some. txn is doing explicit locking at a lower level in the tree.

    INTENTION LOCKS

    Intention-Shared (IS)

    $\rightarrow$ Indicates explicit locking at lower level with S locks. $\rightarrow$ Intent to get S lock(s) at finer granularity.

    Intention-Exclusive (IX)

    $\rightarrow$ Indicates explicit locking at lower level with X locks. $\rightarrow$ Intent to get X lock(s) at finer granularity.

    Shared+Intention-Exclusive (SIX)

    $\rightarrow$ The subtree rooted by that node is locked explicitly in S mode and explicit locking is being done at a lower level with X locks.

    COMPATIBILITY MATRIX

    Table (html):
    T2 Wants
    ISIXSSIXX
    IS✓✓✓✓×
    IX✓✓×××
    S✓×✓××
    SIX✓××××
    X×××××

    LOCKING PROTOCOL

    Each txn obtains appropriate lock at highest level of the database hierarchy.
    To get S or IS lock on a node, the txn must hold at least IS on parent node.
    To get X, IX, or SIX on a node, must hold at least IX on parent node.

    EXAMPLE

    T1 - Get the balance of Andy's off- shore bank account. T2 - Increase bookie's account balance by 1%. What locks should these txns obtain?

    EXAMPLE

    T1 - Get the balance of Andy's off- shore bank account.
    T2 - Increase bookie's account balance by 1%.
    What locks should these txns obtain?
    $\rightarrow$ Exclusive + Shared for leaf nodes of lock tree. $\rightarrow$ Special Intention locks for higher levels.

    EXAMPLE - TWO-LEVEL HIERARCHY

    EXAMPLE - TWO-LEVEL HIERARCHY

    Read Andy's record in R.

    EXAMPLE - TWO-LEVEL HIERARCHY

    Read Andy's record in R.

    EXAMPLE - TWO-LEVEL HIERARCHY

    Read Andy's record in R.

    EXAMPLE - TWO-LEVEL HIERARCHY

    Read Andy's record in R.

    EXAMPLE - TWO-LEVEL HIERARCHY

    Read Andy's record in R.

    EXAMPLE - TWO-LEVEL HIERARCHY

    EXAMPLE - TWO-LEVEL HIERARCHY

    EXAMPLE - TWO-LEVEL HIERARCHY

    EXAMPLE - THREE TXNS

    Assume three txns execute at same time: $\rightarrow \mathsf{T}_1$ - Scan all tuples in R and update one tuple. $\rightarrow \mathsf{T}_2$ - Read a single tuple in R. $\rightarrow \mathsf{T}_3$ - Scan all tuples in R.

    EXAMPLE - THREE TXNS

    EXAMPLE - THREE TXNS

    Scan all tuples in R and update one tuple.

    EXAMPLE - THREE TXNS

    Scan all tuples in R and update one tuple.

    EXAMPLE - THREE TXNS

    Scan all tuples in R and update one tuple.

    EXAMPLE - THREE TXNS

    Scan all tuples in R and update one tuple.

    EXAMPLE - THREE TXNS

    Scan all tuples in R and update one tuple.

    EXAMPLE - THREE TXNS

    EXAMPLE - THREE TXNS

    EXAMPLE - THREE TXNS

    EXAMPLE - THREE TXNS

    Read a single tuple in R.
    Table (html):
    ISIXSSIXX
    IS✓✓✓✓×
    IX✓✓×××
    S✓×✓××
    SIX✓××××
    X×××××

    EXAMPLE - THREE TXNS

    EXAMPLE - THREE TXNS

    [ImageCaption: Scan all tuples in R.]

    EXAMPLE - THREE TXNS

    [ImageCaption: Scan all tuples in R.]

    EXAMPLE - THREE TXNS

    [ImageCaption: Scan all tuples in R.]

    EXAMPLE - THREE TXNS

    Scan all tuples in R.

    EXAMPLE - THREE TXNS

    Scan all tuples in R.

    LOCK ESCALATION

    The DBMS can automatically switch to coarser- grained locks when a txn acquires too many low- level locks.
    This reduces the number of requests that the lock manager must process.

    LOCKING IN PRACTICE

    Applications typically do not acquire a txn's locks manually (i.e., explicit SQL commands).
    Sometimes you need to provide the DBMS with hints to help it to improve concurrency. $\rightarrow$ Update a tuple after reading it. $\rightarrow$ Skip any tuple that is locked.
    Explicit locks are also useful when doing major changes to the database.

    SELECT...FOR UPDATE

    Perform a SELECT and then sets an exclusive lock on the matching tuples.
    Can also set shared locks:
    $\longrightarrow$ Postgres: FOR SHARE $\longrightarrow$ MySQL: LOCK IN SHARE MODE
    Table (html):
    Requested Lock ModeCurrent Lock Mode
    FOR KEY SHAREFOR SHAREFOR NO KEY UPDATEFOR UPDATE
    FOR KEY SHAREX
    FOR SHAREXX
    FOR NO KEY UPDATEXXX
    FOR UPDATEXXXX
    [TableCaption: Table 13.3.Conflicting Row-Level Locks]
    SELECT * FROM
    WHERE FOR UPDATE;

    SELECT...SKIP LOCKED

    Perform a SELECT and automatically ignore any tuples that are already locked in an incompatible mode.
    $\longrightarrow$ Useful for maintaining queues inside of a DBMS.
    SELECT * FROM
    WHERE SKIP LOCKED;

    CONCLUSION

    2PL is used in almost every DBMS.
    Automatically generates correct interleaving:
    $\rightarrow$ Locks + protocol (2PL, SS2PL ...) $\rightarrow$ Deadlock detection + handling $\rightarrow$ Deadlock prevention
    Many more things not discussed... $\rightarrow$ Nested Transactions $\rightarrow$ Savepoints

    NEXT CLASS

    Timestamp Ordering Concurrency ControlIsolation Levels