Data Structures
Quick refresher: 8 Key Data Structures That Power Modern Databases by ByteByteGo
Hash Index
How do Hash Indexes work? by Jordan has no live
A Hash Index setup consists of two main components: segment files and a HashMap.
- Segment Files These files store the actual data values.
- HashMap The HashMap maintains a mapping from each key to the byte offset of its corresponding row in a segment file.
- Updates and Deletions When data is updated or deleted, the system writes a new version of the data to a segment file and updates the offset in the HashMap to reference the new version. Because this process does not remove old versions, the system periodically needs to perform a garbage collection procedure known as compaction.
- Compaction During compaction the system takes all existing segment files (or a subset of them, based on certain criteria) and merges their contents into a single new segment file. It reads the segments from oldest to newest, discarding any obsolete entries and keeping only the latest version for each key. This ensures that the new segment file contains only the most relevant data. After the new segment file is written, the indexing structure (HashMap) is updated to point to the offsets in this new file, and all old segment files are deleted to reclaim storage space.
Sorted String Table
- In-Memory Storage The system stores and updates data in a tree structure in memory, keeping it sorted by key. Reads and writes on this tree take O(log n).
- Flushing to Disk When the in-memory tree grows beyond a predefined size limit, the system writes (flushes) it to a segment file on disk, preserving the key order. After this flush, the system creates a new, empty tree in memory for subsequent writes.
- Read Operations On a read request, the system first checks the in-memory tree. If the data is not found there, it searches the segment files, starting with the most recent one.
- Binary Search Because each segment file maintains sorted keys, the system can perform binary searches in O(log n) to find the requested data.
- Sparse Index To further optimize searches, the system can use a sparse index. This is a structure containing “shortcuts” (key pointers) into the segment file, allowing faster positioning before performing the final binary search.
- Compaction When multiple segment files accumulate, the system performs compaction by reading them all simultaneously and merging their contents into a new segment file, using an algorithm similar to merge-sort. This process discards any outdated entries and preserves only the latest data for each key.
- Comparison with B-Tree LSM tree is usually faster on write but slower on a reads than a B-Tree. Due to the process of compaction LSM trees have effect of write amplification, when one write on average causes several writes on a disk. Sometimes, it is possible that writes are happening with higher rate than compaction could process, which could lead to dramatic decrease in read performance.
B-Tree
A B-tree is a balanced, multi-way search tree designed for systems where data is frequently read from and written to slower storage (e.g., disks). Each node (except possibly the root) has a minimum and maximum range of children, ensuring the tree remains shallow and balanced.
- Search Searches in a B-tree proceed by comparing the target key to the keys stored in a node, then following the appropriate child pointer. Since the height is strictly controlled, lookups typically take O(log n) time.
- Insertion Insertion follows the usual search path to find the appropriate node. If inserting a key causes that node to exceed its capacity, the node is split roughly in half, and a key is “pushed up” to the parent. This splitting process may cascade up if the parent becomes overfull as well.
- Deletion Deletion also locates the key via a standard search path. If removing the key causes a node to drop below its minimum occupancy, it is merged or redistributes keys with a sibling node. If the merge causes the parent to lose too many keys, this deficit can propagate upward in a manner similar to insertion splits.
- Balance and HeightBecause every node except the root must meet strict minimum occupancy rules, the tree remains well balanced. This property keeps the height small and ensures predictable performance for insertion, deletion, and lookup operations.
In practice, B-trees are often adapted with additional optimizations (e.g., buffering, different node layouts) for specific storage engines or file systems, but the fundamental algorithms remain as above.
One of the examples would be Clustered Index, which is a practice of keeping the value right in the index, instead of just pointer to a file.
R-Tree
TBD
Bloom Filters
TBD
OLAP
//TODO: Move to separate file
Column Oriented Storage
- Data is split by columns, and rows of these columns are stored together.
- Columnar segments stored on disk in blocks.
- Columns usually have similar values, which allows for better compressing.
- Can provide fast aggregations such as sum(), avg(), count(), by utilisation of data cubes
- Allows efficient Vectorized Processing – Optimized for modern CPU architectures.