Architecture
A 26-crate layered workspace where every module is independently testable, versioned, and documented. Six architectural layers from foundation to integration.
Architectural Layers
Foundation
Core type definitions, structured error types, and the virtual filesystem abstraction layer. The bedrock that every other layer depends on.
Storage
B-tree storage engine with
and . page cache with . Write-ahead logging (WAL) with per-writer lanes and checkpointing.
Concurrency & Durability
Page-level
with validation via the . version chains for compact storage. for lock-free commit sequencing. regime detection auto-tunes GC heuristics. transaction timelines.
SQL Engine
Hand-written recursive descent parser, full AST, cost-based query planner with join ordering, and
bytecode interpreter. in the AST layer enables safe transaction replay. 150+ built-in scalar, aggregate, and window functions.
Extensions
FTS3/FTS4 and FTS5 full-text search, ICU collation, JSON1 functions and virtual tables, R-tree spatial indexing, session/changeset support, and miscellaneous extensions.
Integration
Public API facade, C API compatibility shim, interactive SQL shell with syntax highlighting.
conformance harness with performance bounds.
MVCC Deep Dive
Snapshot Isolation
Each transaction captures a snapshot of the database at its start time: a consistent, frozen view of all pages. Writers create versions of modified pages and merge them at commit time via . Readers are never blocked and never see uncommitted data.
Concurrent Writers
C SQLite allows exactly one writer at a time. FrankenSQLite supports up to 8 concurrent writers operating on their own
. The MVCC layer manages version chains for each page, and background garbage collection reclaims versions that are no longer visible to any active transaction, keeping memory bounded.
Time-Travel Queries
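Snapshot reads and reads at a past commit point both reduce to the same lookup on a page's version chain: pick the newest version committed at or before a given commit sequence number. The following is a minimal sketch of that lookup, not FrankenSQLite's implementation; `PageVersion`, `VersionChain`, and `visible_at` are hypothetical names, and the chain is assumed to be kept sorted by commit sequence number.

```rust
/// One committed version of a page. All names here are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct PageVersion {
    pub commit_seq: u64,
    pub data: Vec<u8>,
}

/// Versions of a single page, sorted by ascending commit sequence number.
#[derive(Debug, Default)]
pub struct VersionChain {
    versions: Vec<PageVersion>,
}

impl VersionChain {
    /// Append a newly committed version; sequence numbers must increase.
    pub fn push(&mut self, commit_seq: u64, data: Vec<u8>) {
        if let Some(last) = self.versions.last() {
            assert!(commit_seq > last.commit_seq, "commit_seq must increase");
        }
        self.versions.push(PageVersion { commit_seq, data });
    }

    /// Newest version committed at or before `snapshot_seq`, if any.
    pub fn visible_at(&self, snapshot_seq: u64) -> Option<&PageVersion> {
        // Index of the first version that is newer than the snapshot.
        let idx = self
            .versions
            .partition_point(|v| v.commit_seq <= snapshot_seq);
        idx.checked_sub(1).map(|i| &self.versions[i])
    }
}
```

A reader that started at commit 25 would call `visible_at(25)` and keep seeing the version from commit 20 even after a writer commits at 30.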
Time-travel queries let you inspect the database at any past commit point using FOR SYSTEM_TIME AS OF with a commit sequence number or timestamp. Because version chains preserve old page states, the engine can reconstruct any historical on demand. No forks, no replicas, no manual backup rotation required.
Version Cleanup
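The reclamation rule stated earlier (a version can go once no active transaction can still see it) can be approximated with a watermark: find the oldest active snapshot and keep only the version it reads plus everything newer. This is a conservative sketch under that assumption, not the engine's actual collector; `prune_chain` and its parameters are hypothetical, and a chain is reduced to its commit sequence numbers for brevity.

```rust
/// Drop versions that no active snapshot can still read.
///
/// `commit_seqs` is one page's version chain as ascending commit sequence
/// numbers; `active_snapshots` holds the snapshot sequence numbers of open
/// transactions. Returns how many versions were reclaimed.
pub fn prune_chain(commit_seqs: &mut Vec<u64>, active_snapshots: &[u64]) -> usize {
    // With no open transactions, only the newest version can be read again.
    let horizon = active_snapshots.iter().copied().min().unwrap_or(u64::MAX);
    // The oldest snapshot reads the newest version at or before the horizon,
    // so that version and everything after it must be kept.
    let first_newer = commit_seqs.partition_point(|&seq| seq <= horizon);
    let reclaim = first_newer.saturating_sub(1);
    commit_seqs.drain(..reclaim);
    reclaim
}
```

The watermark keeps some versions that sit between two active snapshots and are visible to neither; a more precise collector could reclaim those too.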
A background vacuum reclaims space from versions no longer visible to any active transaction. monitors throughput patterns in real time and auto-tunes garbage collection thresholds when the workload shifts between OLTP bursts, bulk loads, and idle periods. No manual tuning required.
Single-Threaded Write Coordinator
Multi-threaded disk I/O typically requires complex locking protocols and two-phase commit. FrankenSQLite takes a different approach: slow modifications run in parallel across many worker threads, but the actual commit validation and appends are funneled through a single, lock-free pipeline. This maximizes sequential SSD write bandwidth while avoiding the contention that plagues mutex-based commit paths.
The visualization below shows worker threads producing page diffs in parallel, then feeding them through the coordinator's validation, append, and flush stages. Press Run Pipeline to watch the pipeline in action.
The Safe Merge Ladder
When two transactions modify the same page, most databases abort one immediately. FrankenSQLite's safe merge ladder tries four progressively stronger resolution strategies before resorting to abort. Each rung handles a wider class of conflicts than the one above it. The conflict ladder visualization below lets you walk through three scenarios, from non-conflicting to true conflict, and see exactly which rung resolves each case.
Intent Replay
Canonical reordering of independent operations. If two writes don't interfere, they can be merged into a single consistent result.
Byte-level diff combination. If the two deltas don't overlap at the byte level, their XOR produces a valid merged page.
Abort
Only as a last resort. The losing transaction is rolled back and retried. This is the only strategy that matches traditional database behavior.
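The byte-level diff rung above can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: `xor_delta` and `merge_disjoint` are hypothetical names, both writers are assumed to start from the same base page, and the sketch checks only byte-level overlap, leaving any page-format validation to the caller.

```rust
/// XOR delta between a base page and a modified copy (zero where unchanged).
pub fn xor_delta(base: &[u8], modified: &[u8]) -> Vec<u8> {
    assert_eq!(base.len(), modified.len(), "pages must be the same size");
    base.iter().zip(modified).map(|(b, m)| b ^ m).collect()
}

/// Merge two writers' versions of the same base page.
/// Returns `None` when both writers changed the same byte.
pub fn merge_disjoint(base: &[u8], ours: &[u8], theirs: &[u8]) -> Option<Vec<u8>> {
    let d1 = xor_delta(base, ours);
    let d2 = xor_delta(base, theirs);
    // Overlap: some byte position is non-zero in both deltas.
    if d1.iter().zip(&d2).any(|(a, b)| *a != 0 && *b != 0) {
        return None;
    }
    // Disjoint deltas commute, so applying both to the base is well defined.
    Some(
        base.iter()
            .zip(d1.iter().zip(&d2))
            .map(|(b, (a, c))| b ^ a ^ c)
            .collect(),
    )
}
```

When `merge_disjoint` returns `None`, the conflict falls through to the next rung of the ladder.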
RaptorQ Self-Healing
Encode
Every time a page is written, generate redundant over arithmetic. These symbols are stored sequentially alongside the data in the with configurable overhead (typically 20%).
Detect
On every page read, BLAKE3 checksums verify integrity. If corruption is detected, whether from bit rot, disk controller errors, or cosmic rays, the recovery pipeline activates automatically with no operator intervention.
Recover
RaptorQ reconstructs corrupted data from the surviving symbols. The peeling decoder handles most cases; Gaussian elimination finishes the rest. Recovery requires just 2 extra symbols beyond the source block count. Click pages below to simulate corruption and watch the engine rebuild them.
Erasure-Coded Streams
The erasure-coded stream format is FrankenSQLite's native storage mode. It replaces the traditional in-place update model with an append-only sequence of page versions, each protected by .
The
places raw source data first in every block, so normal reads are zero-copy with no decoding overhead. Decoding activates only when corruption is detected. Step through the visualization below to see how a raw 4 KB page is partitioned into source symbols, encoded with , and then recovered after simulated corruption.
Compact Version Storage
Version chains grow with every write. Storing a full 4 KB copy of a page for every single transaction would bloat the database rapidly. XOR deltas compress these chains by storing only the bytes that actually changed between successive page versions.
When less than 25% of a page changes (the common case for point updates), the engine stores a sparse delta instead of a full copy, saving up to 93% of storage per version. When a page changes substantially, a full snapshot is stored and the delta chain resets. Step through below to see the XOR computation, the sparse delta, and the threshold-based cutoff in action.
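The threshold rule can be sketched as below. The on-disk delta layout is not described here, so the `(offset, xor_byte)` pair representation, the `StoredVersion` enum, and the function names are all assumptions made for illustration; only the 25% cutoff comes from the text.

```rust
/// Stored form of one page version (names are illustrative).
#[derive(Debug, PartialEq)]
pub enum StoredVersion {
    /// Only the changed bytes, as (offset, previous XOR next) pairs.
    SparseDelta(Vec<(u16, u8)>),
    /// A full copy of the page; the delta chain restarts here.
    FullSnapshot(Vec<u8>),
}

/// Choose between a sparse delta and a full snapshot for `next`.
pub fn encode_version(prev: &[u8], next: &[u8]) -> StoredVersion {
    assert_eq!(prev.len(), next.len(), "pages must be the same size");
    assert!(prev.len() <= usize::from(u16::MAX) + 1, "offsets must fit in u16");
    let delta: Vec<(u16, u8)> = prev
        .iter()
        .zip(next)
        .enumerate()
        .filter(|(_, (p, n))| p != n)
        .map(|(i, (p, n))| (i as u16, p ^ n))
        .collect();
    // Fewer than 25% of the bytes changed: keep the delta.
    if delta.len() * 4 < prev.len() {
        StoredVersion::SparseDelta(delta)
    } else {
        StoredVersion::FullSnapshot(next.to_vec())
    }
}

/// Rebuild the next version from the previous page and a sparse delta.
pub fn apply_delta(prev: &[u8], delta: &[(u16, u8)]) -> Vec<u8> {
    let mut page = prev.to_vec();
    for &(offset, x) in delta {
        page[usize::from(offset)] ^= x;
    }
    page
}
```

A production format would likely store runs of changed bytes rather than one pair per byte, which is where most of the space saving would come from.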
Adaptive Replacement Cache
A single SELECT * table scan can evict the entire LRU working set, forcing the buffer pool to re-read frequently-accessed pages from disk. FrankenSQLite replaces LRU with an Adaptive Replacement Cache (ARC).
On top of ARC, the
adds a grace period: pages must survive an entire cooling cycle without being re-accessed before they become eviction candidates. Hot interior nodes use to resolve in-memory addresses directly, bypassing the cache lookup entirely. The result: sequential scans no longer destroy your working set. Try accessing pages below to see how the four lists interact and how ghost lists influence future promotion decisions.
Varint Encoding
SQLite compresses row IDs, record header sizes, and serial types using varints: a static Huffman-style, prefix-free code in which small integers (0 through 127, the common case) use just 1 byte and the largest use 9. This saves substantial space across millions of records because most integers in a database are small.
FrankenSQLite replicates this encoding exactly, byte for byte, maintaining full read/write compatibility with existing .sqlite3 files. Drag the slider below to watch how integers of different magnitudes map to different byte widths, and compare the varint representation against a fixed 8-byte layout.
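For reference, the SQLite varint layout is big-endian: each of the first eight bytes carries 7 payload bits with the high bit as a continuation flag, and a ninth byte, when present, carries a full 8 bits. The sketch below follows that layout; `encode_varint` and `decode_varint` are illustrative names, not FrankenSQLite's API.

```rust
/// Encode `value` as a SQLite-style varint (1 to 9 bytes, big-endian).
pub fn encode_varint(value: u64) -> Vec<u8> {
    if value > 0x00FF_FFFF_FFFF_FFFF {
        // 9-byte form: eight 7-bit groups with the continuation bit set,
        // followed by the low 8 bits stored whole.
        let high = value >> 8;
        let mut out = Vec::with_capacity(9);
        for i in (0..8).rev() {
            out.push(((high >> (7 * i)) & 0x7F) as u8 | 0x80);
        }
        out.push(value as u8);
        return out;
    }
    // Build the 7-bit groups from least to most significant, then reverse.
    let mut groups = vec![(value & 0x7F) as u8]; // final byte: no continuation bit
    let mut rest = value >> 7;
    while rest != 0 {
        groups.push((rest & 0x7F) as u8 | 0x80);
        rest >>= 7;
    }
    groups.reverse();
    groups
}

/// Decode a varint from the front of `bytes`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before the varint does.
pub fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &b) in bytes.iter().enumerate().take(9) {
        if i == 8 {
            // The ninth byte contributes all 8 bits.
            return Some(((value << 8) | u64::from(b), 9));
        }
        value = (value << 7) | u64::from(b & 0x7F);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

Values up to 127 encode to one byte, 128 encodes to `[0x81, 0x00]`, and `u64::MAX` takes the full nine bytes.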
Sheaf-Theoretic Consistency
In multi-process and high-concurrency settings, pairwise consistency checks miss a dangerous class of bugs: cases where no two transactions disagree with each other, yet the global state is inconsistent. Standard testing misses these because it only compares pairs.
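A small example of such a case: three transactions each report the order in which they observed two commits. Any two reports can be reconciled, but all three together form a cycle. The sketch below uses commit-order observations as a stand-in for transaction views and a cycle check as the gluing test; it is an illustration of the failure mode, not the harness's actual method, and every name in it is hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One local observation: commit `.0` was seen before commit `.1`.
pub type Observation = (u32, u32);

/// True if the observations admit a single global order (no cycle).
pub fn glues(edges: &[Observation]) -> bool {
    let mut remaining: BTreeSet<Observation> = edges.iter().copied().collect();
    let mut indegree: BTreeMap<u32, usize> = BTreeMap::new();
    for &(a, b) in &remaining {
        indegree.entry(a).or_insert(0);
        *indegree.entry(b).or_insert(0) += 1;
    }
    // Kahn's algorithm: repeatedly remove nodes with no incoming edges.
    let mut ready: Vec<u32> = indegree
        .iter()
        .filter(|&(_, &d)| d == 0)
        .map(|(&n, _)| n)
        .collect();
    let mut ordered = 0;
    while let Some(node) = ready.pop() {
        ordered += 1;
        let outgoing: Vec<Observation> =
            remaining.iter().copied().filter(|&(a, _)| a == node).collect();
        for edge in outgoing {
            remaining.remove(&edge);
            let d = indegree.get_mut(&edge.1).expect("node was registered");
            *d -= 1;
            if *d == 0 {
                ready.push(edge.1);
            }
        }
    }
    // Every node was ordered exactly when there is no cycle.
    ordered == indegree.len()
}

/// Check every pair of views in isolation.
pub fn pairwise_consistent(views: &[Vec<Observation>]) -> bool {
    for i in 0..views.len() {
        for j in i + 1..views.len() {
            let both: Vec<Observation> =
                views[i].iter().chain(&views[j]).copied().collect();
            if !glues(&both) {
                return false;
            }
        }
    }
    true
}

/// Check all views together.
pub fn globally_consistent(views: &[Vec<Observation>]) -> bool {
    let all: Vec<Observation> = views.iter().flatten().copied().collect();
    glues(&all)
}
```

With views `[(1, 2)]`, `[(2, 3)]`, and `[(3, 1)]`, `pairwise_consistent` returns true while `globally_consistent` returns false, which is the class of anomaly a pair-only test would miss.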
FrankenSQLite's conformance harness uses a sheaf-theoretic consistency check to detect exactly these anomalies. Each transaction's local view (its “section” in the sheaf) must be globally compatible; if the sections cannot be glued into a single consistent state, the harness flags the violation. Step through below to see three transaction views tested for global consistency.
Adaptive Workload Regimes
Static thresholds for garbage collection and page compaction are inevitably wrong for at least one workload pattern. A threshold tuned for OLTP bursts wastes resources during idle periods; one tuned for bulk loads stalls under point-query traffic.
FrankenSQLite uses Bayesian online changepoint detection to detect workload regime shifts in real time. The algorithm maintains a running posterior over “run length” (how long the current regime has lasted) and triggers re-tuning when it detects a statistically significant shift. Start the live telemetry below, then switch between workload regimes to watch the detector identify transitions between OLTP, bulk-load, and idle throughput patterns.
Exhaustive Concurrency Verification
Testing concurrent code with random fuzzing leaves you hoping you hit the right thread schedule. With N threads and M operations each, the number of possible interleavings grows factorially. Random sampling covers a vanishing fraction of the space.
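To put a number on that growth: if each thread's own operations must stay in program order, N threads with M operations each have (N·M)! / (M!)^N distinct interleavings. The helper below computes that count with overflow checking; `interleavings` and `binomial` are illustrative names introduced here.

```rust
/// Binomial coefficient C(n, k), or `None` on u128 overflow.
fn binomial(n: u64, k: u64) -> Option<u128> {
    if k > n {
        return Some(0);
    }
    let k = k.min(n - k);
    let mut acc: u128 = 1;
    for i in 1..=k {
        // acc holds C(n - k + i - 1, i - 1); this step is an exact division.
        acc = acc.checked_mul(u128::from(n - k + i))? / u128::from(i);
    }
    Some(acc)
}

/// Number of interleavings of `threads` threads with `ops_per_thread`
/// operations each, preserving every thread's program order.
/// Returns `None` if the count does not fit in a u128.
pub fn interleavings(threads: u64, ops_per_thread: u64) -> Option<u128> {
    // (N*M)! / (M!)^N as a product of binomials: choose the slots for
    // thread t among the t*M slots used so far.
    let mut total: u128 = 1;
    for t in 1..=threads {
        let slots = t.checked_mul(ops_per_thread)?;
        total = total.checked_mul(binomial(slots, ops_per_thread)?)?;
    }
    Some(total)
}
```

Three threads with two operations each already give 90 schedules, and eight threads with ten operations each overflow a 128-bit counter.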
FrankenSQLite uses
to group thread schedules that differ only in the ordering of independent (non-conflicting) operations into equivalence classes. Then tests exactly one schedule per class. This reduces a factorially large schedule space to a far smaller set that can be checked exhaustively. Step through below to see how three raw interleavings collapse into two equivalence classes, and why testing one representative from each class is sufficient.
Anytime-Valid Invariant Monitoring
Traditional unit tests run once and stop. If a concurrency bug only manifests after 10 billion operations, a fixed test suite will never find it. Running more tests increases false-positive rates unless you apply Bonferroni correction, which reduces statistical power.
FrankenSQLite continuously monitors runtime invariants (like strict TxnId monotonicity and
Conformal Performance Bounds
Benchmark latency distributions have heavy tails, bimodal modes, and regime-dependent shapes. Reporting mean ± standard deviation assumes normality, which is almost never true. The result: regressions hide inside wide error bars, and improvements look significant when they are just noise.
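One distribution-free alternative, in the spirit of this section's title, is a split-conformal style upper bound: take n past measurements and report the ⌈(n + 1)(1 − α)⌉-th smallest as the bound. If the next measurement is exchangeable with the past ones, it falls at or below that bound with probability at least 1 − α, whatever the distribution's shape. The sketch below assumes raw latencies are used as the scores; `conformal_upper_bound` is a hypothetical name, and the harness's actual procedure may differ.

```rust
/// Distribution-free upper bound on the next latency sample.
///
/// Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration value,
/// or `None` when there are too few samples to support that level.
pub fn conformal_upper_bound(calibration: &[f64], alpha: f64) -> Option<f64> {
    assert!(alpha > 0.0 && alpha < 1.0, "alpha must lie in (0, 1)");
    let n = calibration.len();
    let rank = ((n as f64 + 1.0) * (1.0 - alpha)).ceil() as usize;
    if rank > n {
        return None; // the finite-sample bound would be infinite
    }
    let mut sorted = calibration.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    Some(sorted[rank - 1])
}
```

A regression check can then flag a run whose latency exceeds the bound computed from the baseline runs, without assuming normality.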
FrankenSQLite uses conformal prediction to establish rigorous, distribution-free confidence intervals around performance metrics. These bounds hold regardless of the shape of the underlying distribution, provided the samples are exchangeable, catching regressions that parametric methods miss. The visualization below shows how conformal intervals adapt to the actual data shape, tightening in stable regimes and widening during transitions.
Lock-Free WAL Index
When writers continuously append new page versions to the WAL, readers need a fast way to find the most recent version of any page without blocking writers. A sequential scan of the WAL would be O(N); a tree-based index would require locks on every update.
FrankenSQLite solves this with a memory-mapped WAL index stored in the -shm shared memory file. It is a flat hash table using open addressing and linear probing with a load factor strictly capped at 0.5. Lookups resolve in O(1) expected time without a single lock acquisition or system call. Type a page number below and watch the hash function probe the table to find the WAL frame offset.
Storage Modes
FrankenSQLite supports two storage modes. Compatibility mode reads and writes standard .sqlite3 files, so you can drop FrankenSQLite into an existing application with zero migration effort. Your data stays in the format every SQLite tool already understands.
Native mode trades additional disk space for built-in self-healing, page versions, and append-only crash safety. Step through the comparison below to see how the same write operation flows through each mode and where they diverge.
All 26 Crates
Public API facade
SQL abstract syntax tree node types
B-tree storage engine handling the fundamental
layout
SQLite C API compatibility shim for drop-in replacement
Interactive SQL shell
Core engine: connection, prepare, schema, DDL/DML codegen
End-to-end differential testing and benchmark harness
Structured error types
FTS3/FTS4 full-text search extension
FTS5 full-text search extension
ICU collation extension
JSON1 functions and virtual tables
Miscellaneous extensions: generate_series, carray, dbstat, dbpage
R-tree and geopoly spatial index extension
Session, changeset, and patchset extension
Built-in scalar, aggregate, and window functions
Conformance test runner and golden file comparison
Conflict analytics and observability infrastructure
Page cache and journal management
Hand-written recursive descent SQL parser
Query planner: name resolution, WHERE analysis, join ordering
Core type definitions
Virtual filesystem abstraction layer