Carnegie Mellon University
Database Systems
Query Planning & Optimization
ADMINISTRIVIA
Project #3 is due Sunday March 30th @ 11
$\rightarrow$ Recitation: slides, recording
Mid- term exam:
$\rightarrow$ Exam viewing for the next 3 OH, including today. $\rightarrow$ The last OH for exam viewing is on March 24.
UPCOMING DATABASE TALKS
MALLOY Mar 17 Lloyd Tabb Malloy: A Modern Open Source Language for Analyzing, Transforming, and Modeling Data
Table (html):
| PRQL | Mar 24 | Tobias Brandt | PRQL: Pipelined Relational Query Language |
| StarRocks | Mar 31 | Kaisen Kang | StarRocks Query Optimizer |
| Oxide | Apr 7 | Ben Naecker | OxQL: Oximeter Query Language |
| MariaDB | Apr 14 | Michael
Widenius | MariaDB's New Query Optimizer |
| Apr 21 | Michael
Sullivan | EdgeQL with Gel |
LAST CLASS
We talked about how to design the DBMS's architecture to execute queries in parallel.
The query plan is comprised of physical operators that specify the algorithm to invoke at each step of the plan.
But how do we go from SQL to a query plan?
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445(645 (Spring 2025)
(50 + 50,000) reads + 1,000,000 writes Write temp file T1 5 tuples per page in T1
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
[ImageFootnote: CMU-DB 15-445(645 (Spring 2025)]
1,000,000 reads + 2,000 writes (FK join, 10k tuples in temp T2)
(50 + 50,000) reads + 1,000,000 writes Write temp file T1 5 tuples per page in T1
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
[ImageFootnote: CMU-DB 15-445(645 (Spring 2025)]
2,000 reads + 4 writes (10K/500 = 20 emps per dept)
1,000,000 reads + 2,000 writes (FK join, 10k tuples in temp T2)
(50 + 50,000) reads + 1,000,000 writes Write temp file T1 5 tuples per page in T1
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
[ImageFootnote: CMU-DB 15-445(645 (Spring 2025)]
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
[ImageFootnote: CMU-DB 15-445/645 (Spring 2025)]
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
(50 + 50,000) reads + 2,000 writes Page Nested- Loop Join Write Temp T1 Emp Dept
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445(645 (Spring 2025)
2,000 reads + 4 writes Read temp T1, Write temp T2 $(50 + 50,000)$ reads + 2,000 writes Page Nested- Loop Join Write Temp T1 Emp
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
[ImageFootnote: CMU-DB 15-445/645 (Spring 2025)]
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
[ImageFootnote: CMU-DB 15-445/645 (Spring 2025)]
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
Materialization Model Total: 7,159 l/Os
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
Total: 7,159 l/Os
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
Total: 7,159 l/Os
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445(645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
CMU- DB 15- 445 (645 (Spring 2025)
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
3 reads + 1 writes Access: Index(dname) Dept
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
1+3 (idx) + 20 (ptr chase) reads + 4 writes Emp.did = Dept.did Index Nested- Loop Join Emp.dname = 'Toy' 3 reads + 1 writes Access: Index(dname) Dept
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
MOTIVATION
SELECT DISTINCT ename FROM Emp E JOIN Dept D ON E.did = D.did WHERE D.dname = 'Toy'
Catalog
TODAY'S AGENDA
BackgroundHeuristic / Ruled- based OptimizationCost- based OptimizationCost Model Estimation
ARCHITECTURE OVERVIEW
LOGICAL VS. PHYSICAL PLANS
The optimizer generates a mapping of a logical algebra expression to the optimal equivalent physical algebra expression.
Physical operators define a specific execution strategy using an access path. $\rightarrow$ They can depend on the physical format of the data that they process (i.e., sorting, compression). $\rightarrow$ Not always a 1
mapping from logical to physical.
Annotated RA Tree a.k.a. The Physical Plan
QUERY OPTIMIZATION (QO)
- Identify candidate equivalent trees (logical). It is an NP-hard problem, so the space is large.
- For each candidate, find the execution plan (physical). Estimate the cost of each plan.
- Choose the best (physical) plan.
QUERY OPTIMIZATION (QO)
- Identify candidate equivalent trees (logical). It is an NP-hard problem, so the space is large.
- For each candidate, find the execution plan (physical). Estimate the cost of each plan.
- Choose the best (physical) plan.
QUERY OPTIMIZATION (QO)
- Identify candidate equivalent trees (logical). It is an NP-hard problem, so the space is large.
- For each candidate, find the execution plan (physical). Estimate the cost of each plan.
- Choose the best (physical) plan.
QUERY OPTIMIZATION (QO)
- Identify candidate equivalent trees (logical). It is an NP-hard problem, so the space is large.
- For each candidate, find the execution plan (physical). Estimate the cost of each plan.
- Choose the best (physical) plan.
Practically: Choose from a subset of all possible plans.
QUERY OPTIMIZATION
Heuristics / Rules
$\rightarrow$ Rewrite the query to remove (guessed) inefficiencies. $\rightarrow$ Examples: always do selections first or push down projections as early as possible. $\rightarrow$ These techniques may need to examine catalog, but they do not need to examine data.
Cost-based Search
$\rightarrow$ Use a model to estimate the cost of executing a plan. $\rightarrow$ Enumerate multiple equivalent plans for a query and pick the one with the lowest cost.
QUERY OPTIMIZATION
Heuristics / Rules
$\rightarrow$ Rewrite the query to remove (guessed) inefficiencies. $\rightarrow$ Examples: always do selections first or push down projections as early as possible. $\rightarrow$ These techniques may need to examine catalog, but they do not need to examine data.
Cost-based Search
$\rightarrow$ Use a model to estimate the cost of executing a plan. $\rightarrow$ Enumerate multiple equivalent plans for a query and pick the one with the lowest cost.
LOGICAL PLAN OPTIMIZATION
Transform a logical plan into an equivalent logical plan using pattern matching rules.
The goal is to increase the likelihood of enumerating the optimal plan in the search. $\rightarrow$ Many equivalence rules for relational algebra!
Cannot compare plans because there is no cost model but can "direct" a transformation to a preferred side.
PREDICATE PUSHDOWN
$\pi_{\mathrm{name}}\left(\sigma_{\mathrm{dname} = '\mathrm{Toy}}(\mathrm{Dept}\ltimes \mathrm{Emp})\right)$
PREDICATE PUSHDOWN
$\pi_{\mathrm{name}}\left(\sigma_{\mathrm{dname} = '\mathrm{Toy}}\left(\mathrm{Dept}\ltimes \mathrm{Emp}\right)\right)$
PREDICATE PUSHDOWN
$\pi_{\mathrm{name}}\left(\sigma_{\mathrm{dname} = '\mathrm{Toy}}\left(\mathrm{Dept}\ltimes \mathrm{Emp}\right)\right)$
PREDICATE PUSHDOWN
$\pi_{\mathrm{ename}}\left(\sigma_{\mathrm{dname} = '\mathrm{Toy}}(\mathrm{Dept}\bowtie \mathrm{Emp})\right)$
PREDICATE PUSHDOWN
REPLACE CARTESIAN PRODUCT
$$
\ldots \left(\sigma_{\mathrm{Dept.did} = \mathrm{Emp.did}}(\mathrm{Dept}\times \mathrm{Emp})\right)
$$
REPLACE CARTESIAN PRODUCT
O Dept.did = Emp.did (Dept x Emp)
REPLACE CARTESIAN PRODUCT
PROJECTION PUSHDOWN
Emp. did = Dept. did
PROJECTION PUSHDOWN
Emp.ename
PROJECTION PUSHDOWN
Emp.ename
Emp.ename
QUERY OPTIMIZATION
Heuristics / Rules
$\rightarrow$ Rewrite the query to remove (guessed) inefficiencies. $\rightarrow$ Examples: always do selections first or push down projections as early as possible. $\rightarrow$ These techniques may need to examine catalog, but they do not need to examine data.
Cost-based Search
$\rightarrow$ Use a model to estimate the cost of executing a plan. $\rightarrow$ Enumerate multiple equivalent plans for a query and pick the one with the lowest cost.
COST-BASED QUERY OPTIMIZATION
We will start with cost- based, bottom- up QO → a.k.a. the "classic" IBM System R optimizer
Approach: Enumerate different plans for the query and estimate their costs.
→ Single relation. → Multiple relations. → Nested sub- queries.
It chooses the best plan it has seen for the query after exhausting all plans or some timeout.
SINGLE-RELATION QUERY PLANNING
Pick the best access method.
$\longrightarrow$ Sequential Scan $\longrightarrow$ Binary Search (clustered indexes) $\longrightarrow$ Index Scan
Predicate evaluation ordering.
Simple heuristics are often good enough for this.
MULTI-RELATION QUERY PLANNING
Approach #1: Generative / Bottom-Up
$\rightarrow$ Start with nothing and then iteratively assemble and add building blocks to generate a query plan. $\rightarrow$ Examples: System R, Starburst
Approach #2: Transformation / Top-Down
$\rightarrow$ Start with the outcome that the query wants, and then transform it to equivalent alternative sub- plans to find the optimal plan that gets to that goal. $\rightarrow$ Examples: Volcano, Cascades
BOTTOM-UP OPTIMIZATION
Use static rules to perform initial optimization. Then use dynamic programming to determine the best join order for tables using a divide- and- conquer search method
Examples: IBM System R, DB2, MySQL, Postgres, most open- source DBMSs.
SYSTEM R OPTIMIZER
Break query into blocks and generate logical operators for each block.
For each logical operator, generate a set of physical operators that implement it.
$\rightarrow$ All combinations of join algorithms and access paths
Then, iteratively construct a "left- deep" join tree that minimizes the estimated amount of work to execute the plan.
Left-Deep Tree
Bushy Tree
SYSTEM R OPTIMIZER
Break query into blocks and generate logical operators for each block.
For each logical operator, generate a set of physical operators that implement it.
$\rightarrow$ All combinations of join algorithms and access paths
Then, iteratively construct a "left- deep" join tree that minimizes the estimated amount of work to execute the plan.
Left-Deep Tree
Bushy Tree
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME FROM ARTIST, APPEARS, ALBUM WHERE ARTIST.ID=APPEARS.ARTIST_ID AND APPEARS.ALBUM_ID=ALBUM.ID AND ALBUM.NAME="Andy's OG Remix" ORDER BY ARTIST.ID
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME FROM ARTIST, APPEARS, ALBUM WHERE ARTIST.ID=APPEARS.ARTIST_ID AND APPEARS.ALBUM_ID=ALBUM.ID AND ALBUM.NAME="Andy's OG Remix" ORDER BY ARTIST.ID
Step #1: Choose the best access paths to each table
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME FROM ARTIST, APPEARS, ALBUM WHERE ARTIST.ID=APPEARS.ARTIST_ID AND APPEARS.ALBUM_ID=ALBUM.ID AND ALBUM.NAME="Andy's OG Remix" ORDER BY ARTIST.ID
Step #1: Choose the best access paths to each table
ARTIST: Sequential Scan APPEARS: Sequential Scan ALBUM: Index Look- up on NAME
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME FROM ARTIST, APPEARS, ALBUM WHERE ARTIST.ID=APPEARS.ARTIST_ID AND APPEARS.ALBUM_ID=ALBUM.ID AND ALBUM.NAME="Andy's OG Remix" ORDER BY ARTIST.ID
Step #1: Choose the best access paths to each table Step #2: Enumerate all possible join orderings for tables
ARTIST: Sequential Scan APPEARS: Sequential Scan ALBUM: Index Look- up on NAME
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME FROM ARTIST, APPEARS, ALBUM WHERE ARTIST.ID=APPEARS.ARTIST_ID AND APPEARS.ALBUM_ID=ALBUM.ID AND ALBUM.NAME="Andy's OG Remix" ORDER BY ARTIST.ID
Step #1: Choose the best access paths to each table
Step #2: Enumerate all possible join orderings for tables
ARTIST: Sequential Scan
APPEARS: Sequential Scan
ALBUM: Index Look- up on NAME
ARTIST APPEARS ALBUM APPEARS ALBUM ARTIST ALBUM APPEARS ARTIST APPEARS ARTIST ALBUM ARTIST X ALBUM APPEARS ALBUM X ARTIST APPEARS : : : :
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME FROM ARTIST, APPEARS, ALBUM WHERE ARTIST.ID=APPEARS.ARTIST_ID AND APPEARS.ALBUM_ID=ALBUM.ID AND ALBUM.NAME="Andy's OG Remix" ORDER BY ARTIST.ID
Step #1: Choose the best access paths to each table
Step #2: Enumerate all possible join orderings for tables
Step #3: Determine the join ordering with the lowest cost
ARTIST: Sequential Scan
APPEARS: Sequential Scan
ALBUM: Index Look- up on NAME
ARTIST APPEARS ALBUM APPEARS ALBUM ARTIST ALBUM APPEARS ARTIST APPEARS ARTIST ALBUM ARTIST X ALBUM APPEARS ALBUM X ARTIST APPEARS : : : :
Logical Op Physical Op
SYSTEM R OPTIMIZER
ARTIST APPEARS ALBUM
Logical Op Physical Op
SYSTEM R OPTIMIZER
ARTIST APPEARS ALBUM
Logical Op Physical Op
SYSTEM R OPTIMIZER
ARTIST APPEARS ALBUM
Logical Op Physical Op
SYSTEM R OPTIMIZER
Logical Op Physical Op
SYSTEM R OPTIMIZER
SYSTEM R OPTIMIZER
SYSTEM R OPTIMIZER
SYSTEM R OPTIMIZER
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be. Perform a branch- and- bound search to traverse the plan tree by converting logical operators into physical operators. $\rightarrow$ Keep track of global best plan during search. $\rightarrow$ Treat physical properties of data as first- class entities during planning.
Examples: MSSQL, Greenplum, CockroachDB
Example Logical Rules
P1 (P2(R)) = P2 (P1(R)) (commutativity)
P1AP2 ...APn (R) = P1(P2( ... Pn(R))) (cascading o)
$\begin{array}{r}\prod_{\mathrm{a}1}(\mathrm{R})\equiv \prod_{\mathrm{a}1}(\prod_{\mathrm{a}2}(\dots \prod_{\mathrm{ak}}(\mathrm{R})\dots)),\mathsf{a}{\mathrm{i}}\subseteq \mathsf{a}{\mathrm{i} + 1} \end{array}$ (cascading I)
R S S R (join commutativity)
R (S T) = (R S) T (join associativity)
$\sigma_{\mathrm{p}}(\mathrm{RXS})\equiv (\mathrm{R}\bowtie_{\mathrm{p}}\mathrm{S})$ , if P is a join predicate
$\sigma_{\mathrm{p}}(\mathrm{RXS})\equiv \sigma_{\mathrm{p1}}(\sigma_{\mathrm{p2}}(\mathrm{R})\bowtie_{\mathrm{p4}}\sigma_{\mathrm{p3}}(\mathrm{S}))$ , where $\mathrm{P} = \mathrm{p}1\wedge \mathrm{p}2\wedge \mathrm{p}3\wedge \mathrm{p}4$
$\begin{array}{r}\prod_{\mathrm{A1,A2,\ldots An}}(\sigma_{\mathrm{P}}(\mathrm{R}))\equiv \prod_{\mathrm{A1,A2,\ldots An}}(\sigma_{\mathrm{P}}(\prod_{\mathrm{A1,\ldots An,B1,\ldots BM}}\mathrm{R})), \end{array}$ where B1 ... BM are columns in P
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
ARTIST $\bowtie$ APPEARS $\bowtie$ ALBUM ORDER- BY(ARTIST.ID)
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A, B) to JOIN(B, A) $\rightarrow$ Logical $\rightarrow$ Physical
(A, B) to HASH_JOIN(A, B)
ARTIST $\bowtie$ APPEARS $\bowtie$ ALBUMORDER- BY(ARTIST.ID)
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
ARTIST APPEARS ALBUM ORDER- BY(ARTIST.ID)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical
(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical
(A,B) to HASH_JOIN(A,B)
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
LogicalOp PhysicalOp
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\longrightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\longrightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
Logical Op
Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\longrightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\longrightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
Logical Op
Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\longrightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\longrightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
Logical Op
Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\longrightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\longrightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
Logical Op Physical Op
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\rightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\rightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
LogicalOp PhysicalOp
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query to be.
Invoke rules to create new nodes and traverse tree.
$\longrightarrow$ Logical $\rightarrow$ Logical: JOIN(A,B) to JOIN(B,A) $\longrightarrow$ Logical $\rightarrow$ Physical: JOIN(A,B) to HASH_JOIN(A,B)
Can create "enforcer" rules that require input to have certain properties.
OBSERVATION
Applications often execute nested queries.
$\rightarrow$ We could optimize each block using the methods we have discussed. $\rightarrow$ However, this may be inefficient since we optimize each block separately without a global approach.
What if we could flatten a nested query into a single block and optimize it?
$\rightarrow$ Then, apply single- block query optimization methods. $\rightarrow$ Even if one cannot flatten to a single block, flattening to fewer blocks is still beneficial.
NESTED SUB-QUERIES
The DBMS treats nested sub- queries in the where clause as functions that take parameters and return a single value or set of values.
Approach #1: Rewrite to de- correlate and/or flatten them.
Approach #2: Decompose nested query and store results in a temporary table.
NESTED SUB-QUERIES: REWRITE
SELECT name FROM sailors AS S WHERE EXISTS (SELECT * FROM reserves AS R WHERE S.sid = R.sid AND R.day = '2024- 10- 25')
NESTED SUB-QUERIES: REWRITE
SELECT name FROM sailors AS S WHERE EXISTS (SELECT * FROM reserves AS R WHERE S.sid = R.sid AND R.day = '2024- 10- 25')
NESTED SUB-QUERIES: REWRITE
DECOMPOSING QUERIES
For harder queries, the optimizer breaks up queries into blocks and then concentrates on one block at a time.
Sub- queries are written to temporary tables that are discarded after the query finishes.
DECOMPOSING QUERIES
SELECT S.sid, MIN(R.day) FROM sailors S, reserves R, boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red' AND S.rating = (SELECT MAX(S2. rating) FROM sailors S2) GROUP BY S.sid HAVING COUNT(*) > 1
DECOMPOSING QUERIES
DECOMPOSING QUERIES
SELECT MAX(rating) FROM sailors
SELECT S.sid, MIN(R.day)
FROM sailors S, reserves R, boats B
WHERE S.sid = R.sid
AND R.bid = B.bid
AND B.color = 'red'
AND S.rating = (SELECT MAX(S2. rating) FROM sailors S2)
GROUP BY S.sid HAVING COUNT(*) > 1
Nested Block
DECOMPOSING QUERIES
SELECT MAX(rating) FROM sailors
SELECT S.sid, MIN(R.day)
FROM sailors S, reserves R, boats B
WHERE S.sid = R.sid
AND R.bid = B.bid
AND B.color = 'red'
AND S.rating = ###
GROUP BY S.sidHAVING COUNT(*) > 1
Nested Block
DECOMPOSING QUERIES
SELECT MAX(rating) FROM sailors
SELECT S.sid, MIN(R.day)
FROM sailors S, reserves R, boats B
WHERE S.sid = R.sid
AND R.bid = B.bid
AND B.color = 'red'
AND S.rating = ###
GROUP BY S.sidHAVING COUNT(*) > 1
DECOMPOSING QUERIES
Inner Block
SELECT MAX(rating) FROM sailors
SELECT S.sid, MIN(R.day)
FROM sailors S, reserves R, boats B
WHERE S.sid = R.sid
AND R.bid = B.bid
AND B.color = 'red'
AND S.rating = ###
GROUP BY S.sidHAVING COUNT(*) > 1
Outer Block
EXPRESSION REWRITING
An optimizer transforms a query's expressions (e.g., WHERE/ON clause predicates) into the minimal set of expressions.
Implemented using if/then/else clauses or a pattern- matching rule engine.
$\rightarrow$ Search for expressions that match a pattern. $\rightarrow$ When a match is found, rewrite the expression. $\rightarrow$ Halt if there are no more rules that match.
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE 1 = 0;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE 1 = 0
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false;
SELECT * FROM A WHERE NOW() IS NULL;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false;
SELECT * FROM A WHERE NOW() IS NULL;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false;
SELECT * FROM A WHERE false;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false; SELECT * FROM A WHERE false;
Merging Predicates
SELECT * FROM A WHERE val BETWEEN 1 AND 100 OR val BETWEEN 50 AND 150;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false; SELECT * FROM A WHERE false;
Merging Predicates
SELECT * FROM A WHERE val BETWEEN 1 AND 100 OR val BETWEEN 50 AND 150;
EXPRESSION REWRITING
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE false;
SELECT * FROM A WHERE false;
Merging Predicates
SELECT * FROM A WHERE val BETWEEN 1 AND 150;
OBSERVATION
We have formulas for the operator algorithms (e.g. the cost formulas for hash join, sort merge join,...), but we also need to estimate the size of the output that an operator produces.
OBSERVATION
We have formulas for the operator algorithms (e.g. the cost formulas for hash join, sort merge join,...), but we also need to estimate the size of the output that an operator produces.
OBSERVATION
We have formulas for the operator algorithms (e.g. the cost formulas for hash join, sort merge join,...), but we also need to estimate the size of the output that an operator produces.
This is hard because the output of each operators depends on its input.
COST ESTIMATION
The DBMS uses a cost model to predict the behavior of a query plan given a database state. $\rightarrow$ This is an internal cost that allows the DBMS to compare one plan with another.
It is too expensive to run every possible plan to determine this information, so the DBMS need a way to derive this information.
COST MODEL COMPONENTS
Choice #1: Physical Costs
$\rightarrow$ Predict CPU cycles, I/O, cache misses, RAM consumption, network messages... $\rightarrow$ Depends heavily on hardware.
Choice #2: Logical Costs
$\rightarrow$ Estimate output size per operator. $\rightarrow$ Independent of the operator algorithm. $\rightarrow$ Need estimations for operator result sizes.
POSTGRES COST MODEL
Uses a combination of CPU and I/O costs that are weighted by "magic" constant factors.
Default settings are obviously for a disk- resident database without a lot of memory: $\rightarrow$ Processing a tuple in memory is 400x faster than reading a tuple from disk. $\rightarrow$ Sequential I/O is 4x faster than random I/O.
19.7.2. Planner Cost Constants
The cost variables described in this section are measured on an arbitrary scale. Only their relative values matter, hence scaling them all up or down by the same factor will result in no change in the planner's choices. By default, these cost variables are based on the cost of sequential page fetches; that is, seq_page_cost is conventionally set to 1.0 and the other cost variables are set with reference to that. But you can use a different scale if you prefer, such as actual execution times in milliseconds on a particular machine.
Note: Unfortunately, there is no well- defined method for determining ideal values for the cost variables. They are best treated as averages over the entire mix of queries that a particular installation will receive. This means that changing them on the basis of just a few experiments is very risky.
seq_page_cost (floating point)
Sets the planner's estimate of the cost of a disk page fetch that is part of a series of sequential fetches. The default is 1.0. This value can be overridden for tables and indexes in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE).
random_page_cost (floating point)
STATISTICS
The DBMS stores internal statistics about tables, attributes, and indexes in its internal catalog. Different systems update them at different times.
Manual invocations:
$\rightarrow$ Postgres/SQLite: ANALYZE $\rightarrow$ Oracle/MySQL: ANALYZE TABLE $\rightarrow$ SQL Server: UPDATE STATISTICS $\rightarrow$ DB2: RUNSTATS
SELECTION CARDINALITY
The selectivity (sel) of a predicate P is the fraction of tuples that qualify.
Equality Predicate: A=constant $\rightarrow$ sel(A=constant) = #occurences/|R|
SELECT * FROM people WHERE age = 9
SELECTION CARDINALITY
The selectivity (sel) of a predicate P is the fraction of tuples that qualify.
Equality Predicate: A=constant $\rightarrow$ sel(A=constant) = #occurences/|R|
SELECT * FROM people WHERE age = 9
[ImageCaption: age]
SELECTION CARDINALITY
The selectivity (sel) of a predicate P is the fraction of tuples that qualify.
Equality Predicate: A=constant $\rightarrow$ sel(A=constant) = #occurrences/|R|
SELECT × FROM people WHERE age = 9
SELECTION CARDINALITY
The selectivity (sel) of a predicate P is the fraction of tuples that qualify.
Equality Predicate: A=constant $\rightarrow$ sel(A=constant) = #occurences/|R| $\rightarrow$ Example: sel(age=9) =
SELECT × FROM people WHERE age = 9
SELECTION CARDINALITY
The selectivity (sel) of a predicate P is the fraction of tuples that qualify.
Equality Predicate: A=constant $\rightarrow$ sel(A=constant) = #occurences/|R| $\rightarrow$ Example: sel(age=9) =
SELECT × FROM people WHERE age = 9
SELECTION CARDINALITY
The selectivity (sel) of a predicate P is the fraction of tuples that qualify.
Equality Predicate: A=constant $\rightarrow$ sel(A=constant) = #occurences/|R| $\rightarrow$ Example: sel(age=9) = 4/45
SELECT × FROM people WHERE age = 9
SELECTION CARDINALITY
Assumption #1: Uniform Data
$\rightarrow$ The distribution of values (except for the heavy hitters) is the same.
Assumption #2: Independent Predicates
$\rightarrow$ The predicates on attributes are independent
Assumption #3: Inclusion Principle
$\rightarrow$ The domain of join keys overlap such that each key in the inner relation will also exist in the outer table.
CORRELATED ATTRUTES
Consider a database of automobiles:
$\rightarrow$ # of Makes = 10, # of Models = 100
And the following query: $\rightarrow$ (make="Honda" AND model="Accord")
With the independence and uniformity assumptions, the selectivity is:
$$
\rightarrow 1 / 10\times 1 / 100 = 0.001
$$
But since only Honda makes Accords the real selectivity is $1 / 100 = 0.01$
STATISTICS
Choice #1: Histograms
$\rightarrow$ Maintain an occurrence count per value (or range of values) in a column.
Choice #2: Sketches
$\rightarrow$ Probabilistic data structure that gives an approximate count for a given value.
Choice #3: Sampling
$\rightarrow$ DBMS maintains a small subset of each table that it then uses to evaluate expressions to compute selectivity.
HISTOGRAMS
Our formulas are nice, but we assume that data values are uniformly distributed.
HISTOGRAMS
Our formulas are nice, but we assume that data values are uniformly distributed.
HISTOGRAMS
Our formulas are nice, but we assume that data values are uniformly distributed.
EQUI-WIDTH HISTOGRAM
Maintain counts for a group of values instead of each unique key. All buckets have the same width (i.e., same # of value).
[ImageCaption: Non-Uniform Approximation]
EQUI-WIDTH HISTOGRAM
Maintain counts for a group of values instead of each unique key. All buckets have the same width (i.e., same # of value).
[ImageCaption: Non-Uniform Approximation]
EQUI-WIDTH HISTOGRAM
Maintain counts for a group of values instead of each unique key. All buckets have the same width (i.e., same # of value).
[ImageCaption: Equi-Width Histogram]
EQUI-DEPTH HISTOGRAMS
Vary the width of buckets so that the total number of occurrences for each bucket is roughly the same.
EQUI-DEPTH HISTOGRAMS
Vary the width of buckets so that the total number of occurrences for each bucket is roughly the same.
EQUI-DEPTH HISTOGRAMS
Vary the width of buckets so that the total number of occurrences for each bucket is roughly the same.
SKETCHES
Probabilistic data structures that generate approximate statistics about a data set.
Cost- model can replace histograms with sketches to improve its selectivity estimate accuracy.
Most common examples:
$\rightarrow$ Count- Min Sketch (1988): Approximate frequency count of elements in a set. $\rightarrow$ HyperLogLog (2007): Approximate the number of distinct elements in a set.
SAMPLING
Modern DBMSs also collect samples from tables to estimate selectivities.
Update samples when the underlying tables changes significantly.
SELECT AVG(age) FROM people WHERE age > 50
Table (html):
| id | name | age | status |
| 1001 | Obama | 63 | Rested |
| 1002 | Swift | 34 | Paid |
| 1003 | Tupac | 25 | Dead |
| 1004 | Bieber | 30 | Crunk |
| 1005 | Andy | 43 | Illin |
| 1006 | TigerKing | 61 | Jailed |
1 billion tuples
SAMPLING
Modern DBMSs also collect samples from tables to estimate selectivities.
Update samples when the underlying tables changes significantly.
SELECT AVG(age) FROM people WHERE age > 50
Table (html):
| id | name | age | status |
| 1001 | Obama | 63 | Rested |
| 1002 | Swift | 34 | Paid |
| 1003 | Tupac | 25 | Dead |
| 1004 | Bieber | 30 | Crunk |
| 1005 | Andy | 43 | Illin |
| 1006 | TigerKing | 61 | Jailed |
1 billion tuples
SAMPLING
Modern DBMSs also collect samples from tables to estimate selectivities.
Update samples when the underlying tables changes significantly.
Table Sample
Table (html):
| 1001 | Obama | 63 | Rested |
| 1003 | Tupac | 25 | Dead |
| 1005 | Andy | 43 | Illin |
SELECT AVG(age) FROM people WHERE age > 50
Table (html):
| id | name | age | status |
| 1001 | Obama | 63 | Rested |
| 1002 | Swift | 34 | Paid |
| 1003 | Tupac | 25 | Dead |
| 1004 | Bieber | 30 | Crunk |
| 1005 | Andy | 43 | Illin |
| 1006 | TigerKing | 61 | Jailed |
1 billion tuples
SAMPLING
Modern DBMSs also collect samples from tables to estimate selectivities.
Update samples when the underlying tables changes significantly.
Table Sample
sel(age>50) =
Table (html):
| 1001 | Obama | 63 | Rested |
| 1003 | Tupac | 25 | Dead |
| 1005 | Andy | 43 | Illin |
SELECT AVG(age) FROM people WHERE age > 50
Table (html):
| id | name | age | status |
| 1001 | Obama | 63 | Rested |
| 1002 | Swift | 34 | Paid |
| 1003 | Tupac | 25 | Dead |
| 1004 | Bieber | 30 | Crunk |
| 1005 | Andy | 43 | Illin |
| 1006 | TigerKing | 61 | Jailed |
1 billion tuples
SAMPLING
Modern DBMSs also collect samples from tables to estimate selectivities.
Update samples when the underlying tables changes significantly.
Table Sample
sel(age>50) =
Table (html):
| 1001 | Obama | 63 | Rested |
| 1003 | Tupac | 25 | Dead |
| 1005 | Andy | 43 | Illin |
SELECT AVG(age) FROM people WHERE age > 50
Table (html):
| id | name | age | status |
| 1001 | Obama | 63 | Rested |
| 1002 | Swift | 34 | Paid |
| 1003 | Tupac | 25 | Dead |
| 1004 | Bieber | 30 | Crunk |
| 1005 | Andy | 43 | Illin |
| 1006 | TigerKing | 61 | Jailed |
1 billion tuples
SAMPLING
Modern DBMSs also collect samples from tables to estimate selectivities.
Update samples when the underlying tables changes significantly.
Table Sample
sel(age>50) = 1/3
Table (html):
| 1001 | Obama | 63 | Rested |
| 1003 | Tupac | 25 | Dead |
| 1005 | Andy | 43 | Illin |
SELECT AVG(age) FROM people WHERE age > 50
Table (html):
| id | name | age | status |
| 1001 | Obama | 63 | Rested |
| 1002 | Swift | 34 | Paid |
| 1003 | Tupac | 25 | Dead |
| 1004 | Bieber | 30 | Crunk |
| 1005 | Andy | 43 | Illin |
| 1006 | TigerKing | 61 | Jailed |
1 billion tuples
CONCLUSION
Query optimization is critical for a database system. $\rightarrow$ SQL $\rightarrow$ Logical Plan $\rightarrow$ Physical Plan $\rightarrow$ Flatten queries before going to the optimization part. Expression handling is also important. $\rightarrow$ Estimate costs using models based on summarizations.
QO enumeration can be bottom- up or top- down.
NEXT CLASS
Transactions! $\rightarrow$ aka the second hardest part about database systems