    lecture-07-slides.pdf
    Carnegie Mellon University Database Systems Hash Tables

    ADMINISTRIVIA

    Project #1 is due on February 9th @ 11
    Homework #2 is due February 9th @ 11

    COURSE OUTLINE

We are now going to talk about the data structures that the DBMS uses to support its execution engine when reading/writing data from pages.
    Two types of data structures: $\rightarrow$ Hash Tables (Unordered) $\rightarrow$ Trees (Ordered)
    Query Planning
    Operator Execution
    Access Methods
    Buffer Pool Manager
    Disk Manager

    TODAY'S AGENDA

Background
Hash Functions
Static Hashing Schemes
Dynamic Hashing Schemes
DB Flash Talk: DataStax

    DATA STRUCTURES

Internal Meta-data
Core Data Storage
Temporary Data Structures
Table Indexes

    DESIGN DECISIONS

    Data Organization

$\rightarrow$ How we lay out the data structure in memory/pages and what information to store to support efficient access.

    Concurrency

    $\rightarrow$ How to enable multiple threads to access the data structure at the same time without causing problems.

    HASH TABLES

    A hash table implements an unordered associative array that maps keys to values.
    It uses a hash function to compute an offset into this array for a given key, from which the desired value can be found.
Space Complexity: O(n)
Time Complexity:
$\rightarrow$ Average: O(1) $\leftarrow$ Databases care about constants!
$\rightarrow$ Worst: O(n)

    STATIC HASH TABLE

    Allocate a giant array that has one slot for every element you need to store.
    To find an entry, mod the key by the number of elements to find the offset in the array.
    hash(key) % N


    UNREALISTIC ASSUMPTIONS

    Assumption #1: Number of elements is known ahead of time and fixed.
    Assumption #2: Each key is unique.
    Assumption #3: Perfect hash function guarantees no collisions. $\rightarrow$ If key1≠key2, then hash(key1)≠hash(key2)
    hash(key)% N
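A minimal C++ sketch (not from the slides) of the static scheme above, leaning on the same simplifying Assumptions #1-#3: a fixed number of slots, unique keys, and a hash treated as collision-free. The class and method names are made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Static hash table sketch: one slot per element, chosen by hash(key) % N.
// Relies on Assumptions #1-#3: fixed capacity, unique keys, no collisions.
// K and V are assumed to be default-constructible for simplicity.
template <typename K, typename V>
class StaticHashTable {
 public:
  explicit StaticHashTable(std::size_t n) : slots_(n) {}

  void Put(const K &key, const V &value) {
    slots_[std::hash<K>{}(key) % slots_.size()] = {key, value, true};
  }

  // Returns nullptr if the slot is empty or (on a collision) holds another key.
  const V *Get(const K &key) const {
    const Slot &s = slots_[std::hash<K>{}(key) % slots_.size()];
    return (s.occupied && s.key == key) ? &s.value : nullptr;
  }

 private:
  struct Slot { K key{}; V value{}; bool occupied{false}; };
  std::vector<Slot> slots_;
};
```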

    HASH TABLE

Design Decision #1: Hash Function

$\rightarrow$ How to map a large key space into a smaller domain.
$\rightarrow$ Trade-off between being fast vs. collision rate.

Design Decision #2: Hashing Scheme

$\rightarrow$ How to handle key collisions after hashing.
$\rightarrow$ Trade-off between allocating a large hash table vs. additional instructions to get/put keys.

    HASH FUNCTIONS

For any input key, return an integer representation of that key.
$\rightarrow$ Converts an arbitrary byte array into a fixed-length code.
We want something that is fast and has a low collision rate.
We do not want to use a cryptographic hash function for DBMS hash tables (e.g., SHA-2).
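As an illustration of "arbitrary byte array in, fixed-length code out", here is FNV-1a (64-bit), a well-known fast, non-cryptographic hash. It is not one of the functions listed on the next slide; it is just a compact example of the kind of function a DBMS would pick over SHA-2.

```cpp
#include <cstdint>
#include <string_view>

// FNV-1a (64-bit): a simple, fast, non-cryptographic hash that maps an
// arbitrary byte sequence to a fixed-length 64-bit code.
uint64_t Fnv1a64(std::string_view bytes) {
  uint64_t h = 14695981039346656037ULL;   // FNV offset basis
  for (unsigned char b : bytes) {
    h ^= b;
    h *= 1099511628211ULL;                // FNV prime
  }
  return h;
}

// Mapping the code into a table of N slots, as in the slides:
//   std::size_t slot = Fnv1a64(key) % N;
```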

HASH FUNCTIONS

CRC-64 (1975)
$\rightarrow$ Used in networking for error detection.

MurmurHash (2008)
$\rightarrow$ Designed as a fast, general-purpose hash function.

Google CityHash (2011)
$\rightarrow$ Designed to be faster for short keys (<64 bytes).

Facebook XXHash (2012)
$\rightarrow$ From the creator of zstd compression.
$\rightarrow$ State-of-the-art.

Google FarmHash (2014)
$\rightarrow$ Newer version of CityHash with better collision rates.

HASH FUNCTIONS

smhasher (SMHasher benchmark, excerpt):
Hash function | MiB/sec | cycl./hash | cycl./map | size | Quality problems
donothing32 | 11149460.06 | 4.00 | - | 13 | bad seed 0, test NOP
donothing64 | 11787676.42 | 4.00 | - | 13 | bad seed 0, test NOP
donothing128 | 11745060.76 | 4.06 | - | 13 | bad seed 0, test NOP
NOP_OAAT_read64 | 11372846.37 | 14.00 | - | 47 | test NOP
BadHash | 769.94 | 73.97 | - | 47 | bad seed 0, test FAIL
sumhash | 10699.57 | 29.53 | - | 363 | bad seed 0, test FAIL
sumhash32 | 42877.79 | 23.12 | - | 863 | UB, test FAIL
multiply_shift | 8026.77 | 26.05 | 226.80 (8) | 345 | bad seeds, fails most tests
pair_multiply_shift | 3716.95 | 40.22 | 186.34 (3) | 609 | fails most tests
crc32 | 383.12 | 134.21 | 257.50 (11) | 422 | insecure, 8590x collisions, distrib, PerlinNoise
md5_32 | 350.53 | 644.31 | 894.12 (10) | 4419 |

    Summary

I added some SSE assisted hashes and fast intel/arm CRC32-C, AES and SHA HW variants. See also the old https://github.com/aappleby/smhasher/wiki, the improved, but unmaintained fork https://github.com/demerphq/smhasher, and the new improved version SMHasher3 https://gitlab.com/fwojcik/smhasher3. So the fastest hash functions on x86_64 without quality problems are:
rapidhash (an improved wyhash) - xxh3low - wyhash - umash (even universal!) - ahash64 - t1ha2_atonce - komihash - FarmHash (not portable, too machine specific: 64 vs 32bit, old gcc, ...) - halftime_hash128 - Spooky32 - pengyhash - nmhash32 - mx3 - MUM/mir (different results on 32/64-bit archs, lots of bad seeds to filter out) - fasthash32

    STATIC HASHING SCHEMES

Approach #1: Linear Probe Hashing
Approach #2: Cuckoo Hashing
Both are open-addressing schemes.
There are several other schemes covered in the Advanced DB course:
$\longrightarrow$ Robin Hood Hashing
$\longrightarrow$ Hopscotch Hashing
$\longrightarrow$ Swiss Tables

    LINEAR PROBE HASHING

Single giant table of fixed-length slots.
Resolve collisions by linearly searching for the next free slot in the table.
$\rightarrow$ To determine whether an element is present, hash to a location in the table and scan for it.
$\rightarrow$ Store keys in the table to know when to stop scanning.
$\rightarrow$ Insertions and deletions are generalizations of lookups.
The table's load factor determines when it is becoming too full and should be resized.
$\rightarrow$ Allocate a new table twice as large and rehash the entries.

[Animation: inserting keys A-F with hash(key) % N; on a collision, scan forward to the next free slot.]
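A minimal C++ sketch (not the course's BusTub implementation) of linear probe hashing as described above: a fixed-size slot array, keys stored in the slots, probing forward until the key or an empty slot is found. Names and types are illustrative.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Linear probe hashing sketch: collisions are resolved by scanning forward
// to the next free slot; keys are stored so a lookup knows when to stop.
struct Entry {
  std::string key;
  int value = 0;
  bool occupied = false;
};

class LinearProbeTable {
 public:
  explicit LinearProbeTable(std::size_t n) : slots_(n) {}

  bool Put(const std::string &key, int value) {
    std::size_t i = std::hash<std::string>{}(key) % slots_.size();
    for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
      Entry &e = slots_[(i + probes) % slots_.size()];
      if (!e.occupied || e.key == key) {   // free slot, or overwrite same key
        e = {key, value, true};
        return true;
      }
    }
    return false;  // table is full; a real table would resize and rehash
  }

  std::optional<int> Get(const std::string &key) const {
    std::size_t i = std::hash<std::string>{}(key) % slots_.size();
    for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
      const Entry &e = slots_[(i + probes) % slots_.size()];
      if (!e.occupied) return std::nullopt;   // empty slot ends the scan
      if (e.key == key) return e.value;
    }
    return std::nullopt;
  }

 private:
  std::vector<Entry> slots_;
};
```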

HASH TABLE – KEY/VALUE ENTRIES

Fixed-length Key/Values:

$\rightarrow$ Store inline within the hash table pages.
$\rightarrow$ Optional: Store the key's hash with the key for faster comparisons.

Variable-length Key/Values:

$\rightarrow$ Insert the key/value data in a separate private temporary table.
$\rightarrow$ Store the hash as the key and use the record id pointing to its corresponding entry in the temporary table as the value.

Hash Table Page (fixed-length entries): hash | key | value
Hash Table Page (variable-length entries): hash | RecordId
Temp Table Page: key | value
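A rough sketch of the two entry layouts described above; the struct and field names are invented for illustration.

```cpp
#include <cstdint>

// Fixed-length keys/values: store the entry inline in the hash table page;
// optionally keep the full hash next to the key for cheap comparisons.
struct InlineEntry {
  uint64_t hash;     // optional: compare hashes before comparing keys
  int64_t  key;
  int64_t  value;
};

// Variable-length keys/values: the hash table stores only the hash and a
// record id that points into a separate temporary table holding key | value.
struct RecordId { uint32_t page_id; uint32_t slot_num; };

struct IndirectEntry {
  uint64_t hash;
  RecordId rid;      // location of the full key/value pair in the temp table
};
```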

LINEAR PROBE HASHING - DELETES

[Animation: deleting a key from the table of keys A-F above, using hash(key) % N.]

Approach #1: Movement
$\rightarrow$ Rehash keys until you find the first empty slot.
$\rightarrow$ Expensive! May need to reorganize the entire table.
$\rightarrow$ No DBMS does this.

Approach #2: Tombstone
$\rightarrow$ Maintain a separate bitmap to indicate that the entry in the slot is logically deleted.
$\rightarrow$ Reuse the slot for new keys.
$\rightarrow$ May need periodic garbage collection.
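A hedged sketch of the tombstone approach: a separate bitmap marks logically deleted slots so that lookups keep probing past them, while later inserts may reuse them. The class is standalone and illustrative only; the insert path (which would reuse tombstoned slots) is omitted for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Tombstone deletes for linear probing: a separate bitmap marks slots whose
// entries were logically deleted, so probe chains are not broken.
class ProbeTableWithTombstones {
 public:
  explicit ProbeTableWithTombstones(std::size_t n)
      : keys_(n), occupied_(n, false), tombstone_(n, false) {}

  void Erase(const std::string &key) {
    std::size_t i = std::hash<std::string>{}(key) % keys_.size();
    for (std::size_t p = 0; p < keys_.size(); ++p) {
      std::size_t s = (i + p) % keys_.size();
      if (!occupied_[s] && !tombstone_[s]) return;   // truly empty: not found
      if (occupied_[s] && keys_[s] == key) {
        occupied_[s] = false;
        tombstone_[s] = true;                        // logical delete
        return;
      }
    }
  }

  bool Contains(const std::string &key) const {
    std::size_t i = std::hash<std::string>{}(key) % keys_.size();
    for (std::size_t p = 0; p < keys_.size(); ++p) {
      std::size_t s = (i + p) % keys_.size();
      if (!occupied_[s] && !tombstone_[s]) return false;  // empty slot ends scan
      if (occupied_[s] && keys_[s] == key) return true;   // tombstones are skipped
    }
    return false;
  }

 private:
  std::vector<std::string> keys_;
  std::vector<bool> occupied_;
  std::vector<bool> tombstone_;
};
```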

HASH TABLE – NON-UNIQUE KEYS

Choice #1: Separate Linked List

$\rightarrow$ Store values in a separate storage area for each key.
$\rightarrow$ Value lists can overflow to multiple pages if the number of duplicates is large.

Choice #2: Redundant Keys

$\rightarrow$ Store duplicate key entries together in the hash table.
$\rightarrow$ This is what most systems do.
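Choice #2 is essentially what std::unordered_multimap gives you in C++: duplicate key entries stored together and retrieved with equal_range. A small usage sketch (names are illustrative):

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

// "Redundant keys": one entry per duplicate key, kept together in the table.
int main() {
  std::unordered_multimap<std::string, int> index;
  index.emplace("alice", 101);
  index.emplace("alice", 202);   // duplicate key gets its own entry

  auto [first, last] = index.equal_range("alice");
  for (auto it = first; it != last; ++it) {
    std::cout << it->first << " -> " << it->second << '\n';
  }
}
```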

OPTIMIZATIONS

Specialized hash table implementations based on key type(s) and sizes.
$\rightarrow$ Example: Maintain multiple hash tables for different string sizes for a set of keys.

Store metadata in a separate array.
$\rightarrow$ A packed bitmap tracks whether a slot is empty or a tombstone.

Use table + slot versioning metadata to quickly invalidate all entries in the hash table.
$\rightarrow$ Example: If the table version does not match the slot version, then treat the slot as empty.
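A small sketch of the versioning trick from the last bullet: each slot carries a version, and the whole table is logically emptied by bumping a single counter. The class is illustrative, not taken from any real system.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Versioned slots: a slot is only valid if its version matches the table's
// current version. Bumping table_version_ invalidates every entry at once
// without touching the slots themselves.
class VersionedTable {
 public:
  explicit VersionedTable(std::size_t n) : values_(n), slot_version_(n, 0) {}

  void Put(std::size_t slot, int value) {
    values_[slot] = value;
    slot_version_[slot] = table_version_;
  }

  bool IsOccupied(std::size_t slot) const {
    return slot_version_[slot] == table_version_;  // stale version => empty
  }

  void Clear() { ++table_version_; }               // O(1) "delete everything"

 private:
  std::vector<int> values_;
  std::vector<uint64_t> slot_version_;
  uint64_t table_version_ = 1;   // starts above the initial slot versions
};
```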

    Hash tables in ClickHouse and C++ Zero-cost Abstractions

    Introduction

Hash tables are the diva of data structures. No other data structure offers the same opportunity for bugs to be introduced during optimization. In this blog post, we explore how hash tables are used in ClickHouse. We'll show how low-cost abstractions work in modern C++ and how to get a variety of data structures from a common codebase with a few little tricks.

    CUCKOO HASHING

Use multiple hash functions to find multiple locations in the hash table to insert records.
$\rightarrow$ On insert, check multiple locations and pick the one that is empty.
$\rightarrow$ If no location is available, evict the element from one of them and then re-hash it to find a new location.
Look-ups and deletions are always O(1) because only one location per hash table is checked.
Best open-source implementation is from CMU.

[Animation: Put A checks $hash_1(A)$ and $hash_2(A)$; Put B checks $hash_1(B)$ and $hash_2(B)$; when both candidate slots are occupied, an existing entry is evicted and re-hashed into its other location.]
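A compact C++ sketch of cuckoo hashing with two tables: lookups probe exactly one slot per table, and inserts displace existing keys back and forth until a free slot is found or a kick limit is hit (at which point a real implementation would rehash or grow, re-inserting the carried key). The second hash function here is just a mixed variant of std::hash, chosen for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Cuckoo hashing sketch: two tables, each indexed by its own hash function.
class CuckooTable {
 public:
  explicit CuckooTable(std::size_t n) : tables_{Slots(n), Slots(n)} {}

  bool Put(std::string key, int value) {
    // Update in place if the key already lives in either table.
    for (int t = 0; t < 2; ++t) {
      Slot &s = tables_[t][IndexFor(t, key)];
      if (s.occupied && s.key == key) { s.value = value; return true; }
    }
    // Otherwise insert, displacing occupants between the two tables.
    int t = 0;
    for (int kicks = 0; kicks < kMaxKicks; ++kicks) {
      Slot &s = tables_[t][IndexFor(t, key)];
      if (!s.occupied) { s = Slot{std::move(key), value, true}; return true; }
      std::swap(key, s.key);      // evict the current occupant...
      std::swap(value, s.value);
      t = 1 - t;                  // ...and re-insert it into the other table
    }
    return false;  // too many displacements; a real table would rehash/grow
  }

  std::optional<int> Get(const std::string &key) const {
    for (int t = 0; t < 2; ++t) {
      const Slot &s = tables_[t][IndexFor(t, key)];
      if (s.occupied && s.key == key) return s.value;
    }
    return std::nullopt;
  }

 private:
  struct Slot { std::string key; int value = 0; bool occupied = false; };
  using Slots = std::vector<Slot>;
  static constexpr int kMaxKicks = 32;

  // Two hash functions derived from std::hash; a real implementation would
  // use two independent, separately seeded hash functions.
  std::size_t IndexFor(int t, const std::string &key) const {
    std::size_t h = std::hash<std::string>{}(key);
    if (t == 1) {
      h ^= h >> 33;
      h *= 0x9E3779B97F4A7C15ULL;
    }
    return h % tables_[t].size();
  }

  Slots tables_[2];
};
```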

    OBSERVATION

The previous hash tables require the DBMS to know the number of elements it wants to store.
$\rightarrow$ Otherwise, it must rebuild the table if it needs to grow/shrink in size.
Dynamic hash tables incrementally resize themselves as needed.
$\rightarrow$ Chained Hashing
$\rightarrow$ Extendible Hashing
$\rightarrow$ Linear Hashing

    CHAINED HASHING

Maintain a linked list of buckets for each slot in the hash table.
Resolve collisions by placing all elements with the same hash key into the same bucket.
$\longrightarrow$ To determine whether an element is present, hash to its bucket and scan for it.
$\longrightarrow$ Insertions and deletions are generalizations of lookups.

[Animation: Put A-E with hash(key) % N; each slot's bucket pointer leads to a bucket of entries, e.g., one bucket holding B|value and another holding A|value and C|value.]
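A minimal chained-hashing sketch, illustrative only: a vector of buckets, each bucket a linked list of key/value pairs that share the same slot.

```cpp
#include <cstddef>
#include <forward_list>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Chained hashing sketch: every slot points to a bucket (here a linked list)
// holding all entries whose key hashes to that slot.
class ChainedTable {
 public:
  explicit ChainedTable(std::size_t n) : buckets_(n) {}

  void Put(const std::string &key, int value) {
    auto &bucket = buckets_[std::hash<std::string>{}(key) % buckets_.size()];
    for (auto &entry : bucket) {
      if (entry.first == key) { entry.second = value; return; }  // update
    }
    bucket.emplace_front(key, value);                            // new entry
  }

  std::optional<int> Get(const std::string &key) const {
    const auto &bucket = buckets_[std::hash<std::string>{}(key) % buckets_.size()];
    for (const auto &entry : bucket) {
      if (entry.first == key) return entry.second;
    }
    return std::nullopt;
  }

 private:
  std::vector<std::forward_list<std::pair<std::string, int>>> buckets_;
};
```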

    EXTENDIBLE HASHING

Chained-hashing approach that splits buckets incrementally instead of letting the linked list grow forever.
Multiple slot locations can point to the same bucket chain.
Reshuffle bucket entries on split and increase the number of bits to examine.
$\rightarrow$ Data movement is localized to just the split chain.

[Animation: the directory tracks the max number of bits to examine in the hashes; Get A with hash(A) = 01110..., Put B with hash(B) = 10111..., Put C with hash(C) = 10100...]
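A sketch of just the directory lookup in extendible hashing, under the common convention that the directory has 2^depth entries indexed by the top "depth" bits of the hash (the max number of bits to examine above); several directory entries may share one bucket until it splits. Splitting logic is omitted, names are illustrative, and the depth is assumed to be at least 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Extendible hashing sketch (lookup path only).
struct Bucket {
  int local_depth = 1;   // how many hash bits this bucket actually distinguishes
  // ... key/value entries would live here ...
};

struct Directory {
  int global_depth;                             // max number of bits to examine
  std::vector<std::shared_ptr<Bucket>> slots;   // size == 1 << global_depth

  explicit Directory(int depth)
      : global_depth(depth), slots(std::size_t{1} << depth) {
    // Initially, all directory entries can share a single bucket.
    auto b = std::make_shared<Bucket>();
    for (auto &s : slots) s = b;
  }

  std::shared_ptr<Bucket> Lookup(uint64_t hash) const {
    // Take the top global_depth bits of a 64-bit hash (assumes depth >= 1).
    std::size_t idx = static_cast<std::size_t>(hash >> (64 - global_depth));
    return slots[idx];
  }
};
```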

    LINEAR HASHING

The hash table maintains a pointer that tracks the next bucket to split.
$\rightarrow$ When any bucket overflows, split the bucket at the pointer location.
Use multiple hashes to find the right bucket for a given key.
Can use different overflow criteria:
$\rightarrow$ Space Utilization
$\rightarrow$ Average Length of Overflow Chains

Example:
Get 6: $hash_1(6) = 6 \% 4 = 2$
Put 17: $hash_1(17) = 17 \% 4 = 1$; the bucket overflows, so the bucket at the split pointer is split and its entries are re-hashed: $hash_2(8) = 8 \% 8 = 0$, $hash_2(20) = 20 \% 8 = 4$
Get 20: $hash_1(20) = 20 \% 4 = 0$; that bucket has already been split, so re-hash: $hash_2(20) = 20 \% 8 = 4$
Get 9: $hash_1(9) = 9 \% 4 = 1$
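A sketch of the linear hashing address computation traced above: apply hash1 (mod n), and if the resulting bucket lies below the split pointer it has already been split, so re-hash with hash2 (mod 2n). The struct and function names are illustrative.

```cpp
#include <cstddef>
#include <cstdint>

// Linear hashing lookup sketch: buckets below the split pointer have already
// been split, so keys that land there are re-hashed with the next hash
// function (twice the modulus).
struct LinearHashState {
  std::size_t num_buckets_at_round_start;  // e.g., 4 -> hash1(key) = key % 4
  std::size_t split_pointer;               // next bucket to split
};

std::size_t BucketFor(uint64_t key, const LinearHashState &st) {
  std::size_t n = st.num_buckets_at_round_start;
  std::size_t b = key % n;                 // hash1
  if (b < st.split_pointer) {
    b = key % (2 * n);                     // hash2: this bucket was already split
  }
  return b;
}
```

For example, BucketFor(20, {4, 1}) first computes 20 % 4 = 0, sees that bucket 0 is below the split pointer, and returns 20 % 8 = 4, matching the Get 20 trace above.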

    LINEAR HASHING - RESIZING

Splitting buckets based on the split pointer will eventually get to all overflowed buckets.
$\rightarrow$ When the pointer reaches the last slot, remove the first hash function and move the pointer back to the beginning.
If the "highest" bucket below the split pointer is empty, the hash table can remove it and move the split pointer in the reverse direction.

    LINEAR HASHING - DELETES

Delete 20: $hash_1(20) = 20 \% 4 = 0$; that bucket has already been split, so re-hash: $hash_2(20) = 20 \% 8 = 4$
Put 21: $hash_1(21) = 21 \% 4 = 1$

CONCLUSION

Fast data structures that support O(1) look-ups and are used throughout DBMS internals.
$\rightarrow$ Trade-off between speed and flexibility.

Hash tables are usually not what you want to use for a table index...
CREATE INDEX ON xxx (val);

    Order-Preserving Indexes ft. B+Trees $\rightarrow$ aka "The Greatest Data Structure of All Time"