GithubDocuments
    1. Marketplace
    2. CMU Database - 15445 - 2025 Spring
    Collections
    CMU Database - 15445 - 2025 Spring
    cmu_database
    This collection of documents, "CMU Database - 15445 - 2025 Spring," provides a comprehensive overview of database systems, primarily focusing on the design, implementation, and ...
    DocumentsKnowledge Graph
    lecture-03-slides.pdf
    Carnegie Mellon University

    Database Systems

    Database Storage: Files & Pages

    ADMINISTRIVIA

    Project #0 is due January 26th @ 11
    Homework #1 is due January 29th @ 11
    Project #1 will be released on January 22nd

    LAST CLASS

    We now understand what a database looks like at a logical level and how to write queries to read/write data (e.g., using SQL).
    We will next learn how to build software that manages a database (i.e., a DBMS).

    COURSE OUTLINE

    Relational Databases Storage Query Execution Concurrency Control Database Recovery Distributed Databases Potpourri

    COURSE OUTLINE

    Relational DatabasesStorageQuery ExecutionConcurrency ControlDatabase RecoveryDistributed DatabasesPotpourri
    Table (html):
    Query Planning
    Operator Execution
    Access Methods
    Buffer Pool Manager
    Disk Manager

    COURSE OUTLINE

    Relational DatabasesStorageQuery ExecutionConcurrency ControlDatabase RecoveryDistributed DatabasesPotpourri

    COURSE OUTLINE

    Relational Databases
    Storage
    Query Execution
    Concurrency Control
    Database Recovery
    Distributed Databases
    Potpourri
    Query Planning
    Operator Execution
    Access Methods
    Buffer Pool Manager
    Disk Manager

    TODAY'S AGENDA

    BackgroundFile StoragePage LayoutTuple Layout

    DISK-BASED ARCHITECTURE

    The DBMS assumes that the primary storage location of the database is on non- volatile disk.
    The DBMS's components manage the movement of data between non- volatile and volatile storage.

    STORAGE HIERARCHY

    STORAGE HIERARCHY

    STORAGE HIERARCHY

    STORAGE HIERARCHY

    STORAGE HIERARCHY

    STORAGE HIERARCHY

    STORAGE HIE

    Intel kills the remnants of Optane memory

    The speed- boosting storage tech was already on the ropes.
    By Michael Crider Staff Writer, PCWorld |JUL29,20226
    PDT
    If you haven't built a super- high- end workstation in a while, you might not have heard of Intel's Optane memory caching tech. Optane also powered ultra- fast SSDs for consumers and businesses alike. Not that it matters much now. After a disastrous second- quarter earnings call in which it missed expected revenue by billions of dollars, the company announced its plans to end its Optane memory business entirely.

    STORAGE HIERARCHY

    STORAGE HIERARCHY

    ACCESS TIMES

    Latency Numbers Every Programmer Should Know

    Table (html):
    1 ns L1 Cache Ref
    4 ns L2 Cache Ref
    100 ns DRAM
    16,000 ns SSD
    2,000,000 ns HDD
    ~50,000,000 ns Network Storage
    1,000,000,000 ns Tape Archives

    ACCESS TIMES

    Latency Numbers Every Programmer Should Know

    Table (html):
    1 nsL1 Cache Ref1 sec
    4 nsL2 Cache Ref4 sec
    100 nsDRAM100 sec
    16,000 nsSSD4.4 hours
    2,000,000 nsHDD3.3 weeks
    ~50,000,000 nsNetwork Storage1.5 years
    1,000,000,000 nsTape Archives31.7 years

    SEQUENTIAL VS. RANDOM ACCESS

    Random access on non- volatile storage is almost always much slower than sequential access.
    DBMS will want to maximize sequential access. $\rightarrow$ Algorithms try to reduce number of writes to random pages so that data is stored in contiguous blocks. $\rightarrow$ Allocating multiple pages at the same time is called an extent.

    SYSTEM DESIGN GOALS

    Allow the DBMS to manage databases that exceed the amount of memory available.
    Reading/writing to disk is expensive, so it must be managed carefully to avoid large stalls and performance degradation.
    Random access on disk is usually much slower than sequential access, so the DBMS will want to maximize sequential access.

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DISK-ORIENTED DBMS

    DATABASE STORAGE

    Problem #1: How the DBMS represents the database in files on disk.
    ← Today
    Problem #2: How the DBMS manages its memory and moves data back- and- forth from disk.

    FILE STORAGE

    The DBMS stores a database as one or more files on disk typically in a proprietary format.
    $\rightarrow$ The OS does not know anything about the contents of these files. $\rightarrow$ We will discuss portable file formats next week...
    Early systems in the 1980s used custom filesystems on raw block storage.
    $\rightarrow$ Some enterprise DBMSs still support this. $\rightarrow$ Most newer DBMSs do not do this.

    FILE STORAGE

    STORAGE MANAGER

    The storage manager is responsible for maintaining a database's files.
    $\rightarrow$ Some do their own scheduling for reads and writes to improve spatial and temporal locality of pages.
    It organizes the files as a collection of pages.
    $\rightarrow$ Tracks data read/written to pages. $\rightarrow$ Tracks the available space.
    A DBMS typically does not maintain multiple copies of a page on disk.
    $\rightarrow$ Assume this happens above/below storage manager.

    DATABASE PAGES

    A page is a fixed- size block of data. $\longrightarrow$ It can contain tuples, meta- data, indexes, log records... $\longrightarrow$ Most systems do not mix page types. $\longrightarrow$ Some systems require a page to be self- contained.
    Each page is given a unique identifier (page ID). $\longrightarrow$ A page ID could be unique per DBMS instance, per database, or per table. $\longrightarrow$ The DBMS uses an indirection layer to map page IDs to physical locations.

    DATABASE PAGES

    There are three different notions of "pages" in a DBMS:
    $\longrightarrow$ Hardware Page (usually 4KB) $\longrightarrow$ OS Page (usually 4KB, x64 2MB/1GB) $\longrightarrow$ Database Page (512B- 32KB)
    A hardware page is the largest block of data that the storage device can guarantee failsafe writes.
    DBMSs that specialize in read- only workloads have larger page sizes.

    DATABASE PAGES

    There are three different notions of "pages" in a DBMS:
    $\longrightarrow$ Hardware Page (usually 4KB) $\longrightarrow$ OS Page (usually 4KB, x64 2MB/1GB) $\longrightarrow$ Database Page (512B- 32KB)
    A hardware page is the largest block of data that the storage device can guarantee failsafe writes.
    DBMSs that specialize in read- only workloads have larger page sizes.

    DATABASE PAGES

    There are three different notions of "pages" in a DBMS:
    $\longrightarrow$ Hardware Page (usually 4KB) $\longrightarrow$ OS Page (usually 4KB, x64 2MB/1GB) $\longrightarrow$ Database Page (512B- 32KB)
    A hardware page is the largest block of data that the storage device can guarantee failsafe writes.
    DBMSs that specialize in read- only workloads have larger page sizes.

    Default DB Page Sizes

    PAGE STORAGE ARCHITECTURE

    Different DBMSs manage pages in files on disk in different ways.
    $\longrightarrow$ Heap File Organization $\longrightarrow$ Tree File Organization $\longrightarrow$ Sequential / Sorted File Organization (ISAM) $\longrightarrow$ Hashing File Organization
    At this point in the hierarchy, we do not need to know anything about what is inside of the pages.

    PAGE STORAGE ARCHITECTURE

    Different DBMSs manage pages in files on disk in different ways.
    $\longrightarrow$ Heap File Organization $\longrightarrow$ Tree File Organization $\longrightarrow$ Sequential / Sorted File Organization (ISAM) $\longrightarrow$ Hashing File Organization
    At this point in the hierarchy, we do not need to know anything about what is inside of the pages.

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    Get Page #23

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    HEAP FILE

    A heap file is an unordered collection of pages with tuples that are stored in random order. $\rightarrow$ Create / Get / Write / Delete Page $\rightarrow$ Must also support iterating over all pages.
    Need additional meta- data to track location of files and free space availability.

    HEAP FILE: PAGE DIRECTORY

    The DBMS maintains special pages that tracks the location of data pages in the database files.
    $\longrightarrow$ One entry per database object. $\longrightarrow$ Must make sure that the directory pages are in sync with the data pages.
    DBMS also keeps meta- data about pages' contents:
    $\longrightarrow$ Amount of free space per page. $\longrightarrow$ List of free / empty pages. $\longrightarrow$ Page type (data vs. meta- data).

    HEAP FILE: PAGE DIRECTORY

    The DBMS maintains special pages that tracks the location of data pages in the database files.
    $\longrightarrow$ One entry per database object. $\longrightarrow$ Must make sure that the directory pages are in sync with the data pages.
    DBMS also keeps meta- data about pages' contents:
    $\longrightarrow$ Amount of free space per page. $\longrightarrow$ List of free / empty pages. $\longrightarrow$ Page type (data vs. meta- data).

    TODAY'S AGENDA

    File Storage Page Layout Tuple Layout

    PAGE HEADER

    Every page contains a header of meta- data about the page's contents.
    $\longrightarrow$ Page Size $\longrightarrow$ Checksum $\longrightarrow$ DBMS Version $\longrightarrow$ Transaction Visibility $\longrightarrow$ Compression / Encoding Meta- data $\longrightarrow$ Schema Information $\longrightarrow$ Data Summary / Sketches
    Some systems require pages to be self- contained (e.g., Oracle).

    Page

    Table (html):
    Header
    Data

    PAGE LAYOUT

    For any page storage architecture, we now need to decide how to organize the data inside of the page. $\rightarrow$ We are still assuming that we are only storing tuples in a row- oriented storage model.
    Approach #1: Tuple- oriented Storage Approach #2: Log- structured Storage Approach #3: Index- organized Storage

    PAGE LAYOUT

    For any page storage architecture, we now need to decide how to organize the data inside of the page. $\rightarrow$ We are still assuming that we are only storing tuples in a row- oriented storage model.
    Approach #1: Tuple- oriented StorageApproach #2: Log- structured StorageApproach #3: Index- organized Storage

    PAGE LAYOUT

    For any page storage architecture, we now need to decide how to organize the data inside of the page. $\rightarrow$ We are still assuming that we are only storing tuples in a row- oriented storage model.
    Approach #1: Tuple- oriented Storage Approach #2: Log- structured Storage Approach #3: Index- organized Storage

    PAGE LAYOUT

    For any page storage architecture, we now need to decide how to organize the data inside of the page. $\rightarrow$ We are still assuming that we are only storing tuples in a row- oriented storage model.
    Approach #1: Tuple- oriented Storage
    Today
    Approach #2: Log- structured Storage
    Approach #3: Index- organized Storage

    PAGE LAYOUT

    For any page storage architecture, we now need to decide how to organize the data inside of the page. $\rightarrow$ We are still assuming that we are only storing tuples in a row- oriented storage model.
    Approach #1: Tuple- oriented Storage
    Approach #2: Log- structured Storage Approach #3: Index- organized Storage
    Lecture #4

    TUPLE-ORIENTED STORAGE

    How to store tuples in a page?
    Strawman Idea: Keep track of the number of tuples in a page and then just append a new tuple to the end.

    Page

    Table (html):
    Num Tuples = 0

    TUPLE-ORIENTED STORAGE

    How to store tuples in a page?
    Strawman Idea: Keep track of the number of tuples in a page and then just append a new tuple to the end.

    Page

    Table (html):
    Num Tuples = 3
    Tuple #1
    Tuple #2
    Tuple #3

    TUPLE-ORIENTED STORAGE

    How to store tuples in a page?
    Strawman Idea: Keep track of the number of tuples in a page and then just append a new tuple to the end. $\rightarrow$ What happens if we delete a tuple?

    Page

    Table (html):
    Num Tuples = 3
    Tuple #1
    Tuple #2
    Tuple #3

    TUPLE-ORIENTED STORAGE

    How to store tuples in a page?
    Strawman Idea: Keep track of the number of tuples in a page and then just append a new tuple to the end. $\rightarrow$ What happens if we delete a tuple?

    Page

    Table (html):
    Num Tuples = 2
    Tuple #1
    Tuple #3

    TUPLE-ORIENTED STORAGE

    How to store tuples in a page?
    Strawman Idea: Keep track of the number of tuples in a page and then just append a new tuple to the end. $\rightarrow$ What happens if we delete a tuple?

    Page

    Table (html):
    Num Tuples = 3
    Tuple #1
    Tuple #4
    Tuple #3

    TUPLE-ORIENTED STORAGE

    How to store tuples in a page?
    Strawman Idea: Keep track of the number of tuples in a page and then just append a new tuple to the end.
    $\rightarrow$ What happens if we delete a tuple? $\rightarrow$ What happens if we have a variable- length attribute?

    Page

    Table (html):
    Num Tuples = 3
    Tuple #1
    Tuple #4
    Tuple #3

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\rightarrow$ The # of used slots $\rightarrow$ The offset of the starting location of the last slot used.
    Table (html):
    Header
    Tuple #4Tuple #3
    Tuple #2Tuple #1

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    SLOTTED PAGES

    The most common layout scheme is called slotted pages.
    The slot array maps "slots" to the tuples' starting position offsets.
    The header keeps track of: $\longrightarrow$ The # of used slots $\longrightarrow$ The offset of the starting location of the last slot used.
    Fixed- and Var- length Tuple Data

    RECORD IDS

    The DBMS assigns each logical tuple a unique record identifier that represents its physical location in the database.
    $\longrightarrow$ File Id, Page Id, Slot # $\longrightarrow$ Most DBMSs do not store ids in tuple. $\longrightarrow$ SQLite uses ROWID as the true primary key and stores them as a hidden attribute.
    Applications should never rely on these IDs to mean anything.

    RECORD IDS

    The DBMS assigns each logical tuple a unique record identifier that represents its physical location in the database.
    $\longrightarrow$ File Id, Page Id, Slot # $\longrightarrow$ Most DBMSs do not store ids in tuple. $\longrightarrow$ SQLite uses ROWID as the true primary key and stores them as a hidden attribute.
    Applications should never rely on these IDs to mean anything.
    PostgreSQL CTID (6- bytes)
    Microsoft SQLServer %%physloc%% (8- bytes)
    ORACLE ROWID (10- bytes)

    TODAY'S AGENDA

    File Storage Page Layout Tuple Layout

    TUPLE LAYOUT

    A tuple is essentially a sequence of bytes. $\rightarrow$ These bytes do not have to be contiguous.
    It is the job of the DBMS to interpret those bytes into attribute types and values.

    TUPLE HEADER

    Each tuple is prefixed with a header that contains meta- data about it. $\rightarrow$ Visibility info (concurrency control) $\rightarrow$ Bit Map for NULL values.
    We do not need to store meta- data about the schema.

    Tuple

    Table (html):
    HeaderAttribute Data

    TUPLE DATA

    Attributes are typically stored in the order that you specify them when you create the table.
    This is done for software engineering reasons (i.e., simplicity).
    However, it might be more efficient to lay them out differently.

    Tuple

    Table (html):
    Headerabcde
    CREATE TABLE foo ( a INT PRIMARY KEY, b INT NOT NULL, c INT, d DOUBLE, e FLOAT);

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\rightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\rightarrow$ Can make updates more expensive.
    CREATE TABLE foo ( ) a INT PRIMARY KEY, b INT NOT NULL, ) ; CREATE TABLE bar ( ) c INT PRIMARY KEY, a INT REFERENCES foo (a)

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\rightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\rightarrow$ Can make updates more expensive.

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\longrightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\longrightarrow$ Can make updates more expensive.

    foo

    Table (html):
    Headerab

    bar

    Table (html):
    Headerca
    Headerca
    Headerca

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\longrightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\longrightarrow$ Can make updates more expensive.
    SELECT * FROM foo JOIN bar ON foo.a = bar.a;

    foo

    Table (html):
    Headerab

    bar

    Table (html):
    Headerca
    Headerca
    Headerca

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\longrightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\longrightarrow$ Can make updates more expensive.

    foo

    SELECT * FROM foo JOIN bar ON foo.a = bar.a;

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\rightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\rightarrow$ Can make updates more expensive.
    SELECT * FROM foo JOIN bar ON foo.a = bar.a;

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\longrightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\longrightarrow$ Can make updates more expensive.
    Not a new idea.
    $\longrightarrow$ IBM System R did this in the 1970s. $\longrightarrow$ Several NoSQL DBMSs do this without calling it physical denormalization.

    DENORMALIZED TUPLE DATA

    DBMS can physically denormalize (e.g., "pre- join") related tuples and store them together in the same page. $\longrightarrow$ Potentially reduces the amount of I/O for common workload patterns. $\longrightarrow$ Can make updates more expensive.
    Not a new idea.
    $\longrightarrow$ IBM System R did this in the 1970s. $\longrightarrow$ Several NoSQL DBMSs do this without calling it physical denormalization.

    CONCLUSION

    Database is organized in pages. Different ways to track pages. Different ways to store pages. Different ways to store tuples.

    NEXT CLASS

    Log- Structured Storage Index- Organized Storage Value Representation Catalogs