← Back to Design & Development
Databases · Relational DBMS Compared

MySQL vs Postgres vs Oracle — And Everyone Else

Storage engines · MVCC · Replication · Isolation · JSON · Full-Text · Sharding · NewSQL — every relational database, side-by-side, with examples

01

The Family Tree — Why "All SQL Databases Are the Same" Is a Lie

Every relational database speaks SQL, supports ACID, and stores tables of rows and columns. From the outside they look interchangeable. Then you actually use two of them, hit your first UPSERT, your first JSON column, your first replication setup — and discover they disagree about almost everything underneath.

Priya joined a team that runs MySQL in production. Her last job ran Postgres. Her first task — "add a queue table" — is straightforward. She writes SELECT ... FOR UPDATE SKIP LOCKED, the way she always did. It crashes on MySQL 5.7. She checks the version: SKIP LOCKED arrived in 8.0. She upgrades, retries, and now the query runs — but inserts use AUTO_INCREMENT instead of SERIAL, JSON queries need a different operator, and her favorite RETURNING clause doesn't exist. Same SQL standard, two different worlds.
What's the actual difference between MySQL, Postgres, Oracle, and SQL Server?

They all started from the same relational model (Codd, 1970s) but evolved under wildly different incentives:

  • Oracle (1979) — built for enterprises that pay six-figure license fees. Optimized for the hardest workloads (mixed OLTP/OLAP), every feature in the book, infamous price tag.
  • MySQL (1995) — built to be the database of the LAMP web. Fast for simple reads, pluggable storage engines, "good enough" semantics. Now owned by Oracle.
  • PostgreSQL (1996) — built by academics who took correctness seriously. Extensible types, MVCC done right, modern features arrive here first. The "Linux of databases".
  • SQL Server (1989) — Microsoft's enterprise database, tightly integrated with .NET, Active Directory, and the Windows ecosystem. Now runs on Linux too.
  • SQLite (2000) — not a server. A C library that puts the whole database in one file. Ships in every phone, browser, and IoT device on Earth.
  • MariaDB (2009) — a fork of MySQL by the original author after Oracle bought MySQL. Mostly compatible, faster moving on features.
DBFirst releaseMaintainerLicenseDefault port
Oracle1979Oracle Corp.Proprietary1521
SQL Server1989MicrosoftProprietary (free Express)1433
MySQL1995Oracle Corp.GPL / Commercial dual3306
PostgreSQL1996PG Global Dev GroupPostgreSQL License (BSD-like)5432
SQLite2000D. Richard HippPublic domainN/A (file)
MariaDB2009MariaDB FoundationGPL3306
If relational databases were vehicles: Oracle is a fully-loaded enterprise truck — capable, expensive, needs a trained driver. SQL Server is a luxury sedan that only works well on Microsoft roads. MySQL is a reliable hatchback — fast, popular, gets you everywhere. Postgres is an electric SUV — quietly does everything the truck does, costs less, and arrives at the future first. SQLite is a folding bicycle — fits in your bag, takes you the last mile. MariaDB is the hatchback after a custom tuning shop got hold of it.
"All SQL is SQL" is a lie that breaks the moment you write a non-trivial query. The differences are real, and they're in the engine, not the syntax.
02

Storage Engines — How Each One Actually Writes to Disk

The biggest mental model gap between MySQL and Postgres is at the storage layer. Postgres has one storage model. MySQL has a marketplace of them. Oracle has its own block format that's been refined for 40+ years. Once you understand this, half the "why does X behave differently?" questions answer themselves.

Rahul is benchmarking a write-heavy workload — 10K inserts/sec across 50 tables. Postgres holds steady at 8K/sec, MySQL InnoDB does 12K/sec, MySQL MyISAM (his colleague suggested) does 28K/sec — and then loses three rows when the box reboots. "Oh, MyISAM isn't crash-safe." This is the moment storage engines stop being trivia.
What is a storage engine, and why does it matter?

A storage engine is the part of the database that decides how rows actually live on disk: the file format, the index layout, how writes are made durable, how concurrent readers see consistent data. The SQL parser and query planner sit above it. The engine is what gives you (or denies you) crash safety, MVCC, foreign keys, and transactions.

MySQL — Pluggable engines

MySQL uniquely lets you pick an engine per table. This was revolutionary in 2003 and is the source of a lot of MySQL's quirks.

  • InnoDB — the default since MySQL 5.5. Row-level locking, MVCC, foreign keys, crash recovery via redo log. Use this. Always.
  • MyISAM — older, table-level locks, no transactions, no FKs, but very fast for read-only workloads. Used in legacy MediaWiki/WordPress setups.
  • MEMORY — RAM-only, lost on restart. Useful for temp tables.
  • Archive — write-only compressed log table.
  • NDB — MySQL Cluster's distributed engine. Different beast entirely.
  • MyRocks — RocksDB (LSM-tree) packaged as a MySQL engine, originally by Facebook. Better for write-heavy SSD workloads.

PostgreSQL — One engine, no choice

Postgres has exactly one storage format: heap tables with MVCC tuples. Every row gets xmin (transaction that inserted it) and xmax (transaction that deleted/updated it) hidden columns. Updates are never in-place — they write a new tuple and mark the old one dead. Background VACUUM reclaims the dead ones.

Oracle — Blocks, segments, tablespaces

Oracle organizes data into blocks (typically 8KB) grouped into extents, which form segments stored in datafiles belonging to tablespaces. Heap-organized tables are default; Index-Organized Tables (IOT) cluster data inside the primary key index. Undo segments power read-consistency (Oracle's MVCC).

SQL Server — Pages and extents too

SQL Server stores data in 8KB pages, grouped into 64KB extents. Clustered indexes are the table — like InnoDB. Heap tables exist but are uncommon; almost everyone has a clustered index, usually on the primary key.

EngineOn-disk shapeUpdatesReclamation
InnoDB (MySQL)Clustered B-tree on PK; rows live in the leafIn-place when possible; otherwise pointer to new locationPurge thread cleans old MVCC versions from rollback segments
Postgres heapUnordered heap + separate index pagesAlways insert new tuple, mark old deadVACUUM (manual or auto) reclaims dead tuples
OracleBlocks within tablespaces; heap or IOTIn-place when row fits; chains/migrates otherwiseUndo segments age out; no global vacuum
SQL Server8KB pages, clustered B-tree on PK by defaultIn-place; row-overflow pages if neededTempdb + ghost cleanup task
The Postgres "no in-place update" choice means a heavily-updated table accumulates bloat. UPDATE users SET last_seen = now() for every user, every minute, creates millions of dead tuples. Autovacuum eventually catches up — until it doesn't, and you wake up to a 200GB table that should be 20GB.
MySQL has many engines, InnoDB is the answer. Postgres has one engine and pays for elegance with VACUUM. Oracle and SQL Server live in the middle with block-organized in-place updates. None of this is visible in your SQL — all of it shapes your operations.
03

MVCC — Three Different Implementations of "Readers Don't Block Writers"

Every modern relational database promises Multi-Version Concurrency Control: readers see a consistent snapshot, writers don't have to wait for them, writers don't have to lock the world. How each implements it is wildly different — and that difference shows up in your monitoring.

What is MVCC, in plain language?

Imagine you're reading page 50 of a book. Someone else wants to rewrite page 50. Without MVCC, you'd either block them ("don't touch the book while I'm reading") or get a mangled page. With MVCC, the writer creates a fresh copy of page 50, and you keep reading the original until you're done. Old copies are cleaned up later, once no one needs them.

That metaphor maps almost perfectly to the database: every row exists as multiple versions, each tagged with the transaction ID that created it. Your SELECT sees only versions visible to your transaction's snapshot. Writers create new versions. Old versions are reclaimed when no live transaction needs them.

Postgres — Version chains inside the heap

Every tuple carries (xmin, xmax). A new UPDATE writes a fresh tuple with xmin = current_txn and updates the old one's xmax. The old tuple lives on until VACUUM. Visibility: "I can see this row if my snapshot includes xmin and excludes xmax."

Postgres MVCC — what's actually in the heap
-- Initial insert (txn 100):
ROW: { id=1, name='Alice', xmin=100, xmax=0 }

-- UPDATE users SET name='Alicia' WHERE id=1; (txn 105)
-- Now the heap contains TWO physical rows:
ROW1: { id=1, name='Alice',  xmin=100, xmax=105 }   -- dead to txns > 105
ROW2: { id=1, name='Alicia', xmin=105, xmax=0   }   -- the live version

-- VACUUM eventually deletes ROW1 once no snapshot needs it.

InnoDB (MySQL) — Versions live in the undo log

InnoDB stores the current row in-place, and pushes the old version into the undo log (also used for rollback). Each row carries a 6-byte transaction ID and a 7-byte rollback pointer. When you SELECT under REPEATABLE READ, InnoDB walks the undo chain backward until it finds a version your snapshot can see.

Oracle — Undo segments, same idea

Oracle invented this approach in the 1980s. Old row versions live in undo segments. Long-running reports get the famous ORA-01555: snapshot too old error when the undo they need has been overwritten.

SQL Server — Optional MVCC

By default, SQL Server uses lock-based concurrency — readers block writers and vice versa. You opt into MVCC by enabling READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION, after which old versions go to tempdb. Tempdb sizing is a whole career.

DBWhere old versions liveHow they're reclaimedFailure mode
PostgresInline in the heap as dead tuplesVACUUM (autovacuum process)Table bloat, transaction-ID wraparound
InnoDBUndo log (system tablespace or per-table)Purge thread once trx no longer neededUndo grows unboundedly under long transactions
OracleUNDO tablespaceAged out by undo retention policyORA-01555 snapshot too old
SQL Servertempdb (when snapshot isolation enabled)Version-store cleanup tasktempdb fills, queries spill, server stalls
Postgres = writing your edits in the margins of the same book — you can see all the history, but you need a librarian to come prune it. InnoDB/Oracle = keeping the current page clean and storing the old draft in a side folder — neat in the book, but lose the folder and old readers crash. SQL Server (default) = everyone passes the book around one at a time; snapshot mode = "okay fine, photocopies in the back room."
If asked "why is my Postgres table 5× the size it should be?" — the answer is almost always dead tuples. SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'users' tells you. VACUUM FULL rewrites it, but locks the table; pg_repack does it online.
Every relational DB solves "readers don't block writers" with MVCC. Postgres stores old versions inline (and pays via VACUUM). InnoDB and Oracle store them in undo (and pay via undo growth). SQL Server reluctantly uses tempdb. All four can melt down if you keep a transaction open for hours.
04

Indexes — B-Tree Everywhere, But the Side Dishes Differ

Every relational DB uses a B-Tree (or B+Tree) as the workhorse index. What separates them is the specialized index types — and on Postgres, the variety is staggering. If you've only ever used B-Trees, you've been leaving free performance on the table.

What everyone has

  • B-Tree — ordered, supports equality + range + ORDER BY. The default.
  • Unique — B-Tree with a uniqueness constraint.
  • Composite — multi-column B-Tree. Order of columns matters (the "leftmost prefix" rule).
  • Covering / Included columns — extra columns stored in the index leaf so the table is never read.

What Postgres uniquely has

TypeWhat it's forExample
HashEquality-only lookups (rarely used — B-Tree usually wins)CREATE INDEX ... USING HASH (col)
GiSTGeneralized search tree — geospatial, ranges, full-text, kNNPostGIS uses it; range types use it
GINInverted index — JSONB, arrays, full-text, trigramsCREATE INDEX ... USING GIN (jsonb_col)
BRINBlock-range index — tiny, perfect for huge sorted tables (logs, time-series)CREATE INDEX ... USING BRIN (created_at)
SP-GiSTSpace-partitioned GiST — quad-trees, kd-trees, IP rangesRarely used directly
BloomMulti-column equality with low false-positive rate (extension)CREATE EXTENSION bloom
pgvector / HNSWVector similarity for AI/embeddingsCREATE INDEX ... USING hnsw (embedding vector_cosine_ops)
A real-world Postgres index for a logs table — 100GB / 500M rows
-- BRIN over the timestamp column: ~5 MB instead of ~15 GB B-tree
CREATE INDEX idx_logs_ts ON logs USING BRIN (created_at);

-- GIN over JSONB metadata for "find rows where meta contains tag X"
CREATE INDEX idx_logs_meta ON logs USING GIN (meta jsonb_path_ops);

-- Partial index: only "error" rows, which are 0.5% of the table
CREATE INDEX idx_logs_errors ON logs (created_at)
  WHERE level = 'ERROR';

MySQL InnoDB — Clustered index = the table

The PRIMARY KEY index is the table. Rows live in the leaf nodes of the PK B-tree. Secondary indexes store the PK value (not a row pointer), so every secondary lookup is two B-tree traversals. This is why a sequential BIGINT PK ("autoincrement") is so important in MySQL — it keeps inserts at the right edge of the tree.

Oracle — Index-Organized Tables (IOT)

Standard tables are heap-organized (rows scattered, row IDs reference physical location). IOTs cluster the table inside the PK index (like InnoDB). Oracle also has bitmap indexes — brilliant for low-cardinality columns in warehouses, terrible under concurrent writes.

SQL Server — Columnstore for OLAP

For analytical workloads, SQL Server's columnstore indexes store data column-by-column with compression, giving 10× compression and 100× query speed for star-schema queries. Postgres has cstore_fdw and the citus extension for similar; MySQL has no native columnstore.

"Add an index" is not a free lunch in any DB. Every write touches every relevant index. A table with 12 indexes does 12× the work on each INSERT. The cheapest index is the one you didn't add.
B-Tree everywhere. Postgres adds GIN/GiST/BRIN for JSON, full-text, and time-series. MySQL InnoDB clusters by PK. Oracle has bitmap + IOT. SQL Server has columnstore for analytics. Match the index type to the access pattern, not the DB's default.
05

Isolation Levels — Same Words, Different Behavior

The SQL standard defines four isolation levels — READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE. Every relational database claims to support them. None of them implement them the same way. Knowing the gaps is the difference between code that works in dev and code that mysteriously double-charges your customers in production.

The standard's four levels and what they forbid

LevelDirty readNon-repeatable readPhantom readLost update
READ UNCOMMITTEDAllowedAllowedAllowedAllowed
READ COMMITTEDPreventedAllowedAllowedAllowed
REPEATABLE READPreventedPreventedAllowedPrevented*
SERIALIZABLEPreventedPreventedPreventedPrevented

What actually happens, by DB

DBDefaultREAD COMMITTEDREPEATABLE READSERIALIZABLE
PostgresREAD COMMITTEDStatement-level snapshotTransaction-level snapshot; phantoms blocked by side effectTrue serializable via Serializable Snapshot Isolation (SSI)
MySQL InnoDBREPEATABLE READStatement snapshotTransaction snapshot + next-key locks block phantomsFull table-level S-locks; rarely used
OracleREAD COMMITTEDStatement snapshotNot supported (skipped to Serializable)True snapshot isolation
SQL ServerREAD COMMITTED (lock-based)Locks unless snapshot enabledRange locks held to commitStrict 2PL with key-range locks
"REPEATABLE READ" in MySQL is not the same as "REPEATABLE READ" in Postgres. MySQL's blocks phantoms (via next-key locks); Postgres's allows them. Postgres's SERIALIZABLE is "true" serializable; MySQL's serializes via table locks and is rarely used. Always benchmark the actual behavior, not the label.
Lost update — same code, different outcome
-- Session A:
BEGIN;
SELECT balance FROM accounts WHERE id = 1;  -- reads 100

-- Session B (concurrently):
BEGIN;
UPDATE accounts SET balance = balance - 30 WHERE id = 1;
COMMIT;  -- balance now 70

-- Session A (still in transaction, naive update):
UPDATE accounts SET balance = 100 - 50 WHERE id = 1;  -- writes 50
COMMIT;
-- Final balance: 50 — but should be 20! Session B's deduction is LOST.

The fix in every DB is the same: use SELECT ... FOR UPDATE, or use a write that doesn't depend on the read (UPDATE ... SET balance = balance - 50), or use SERIALIZABLE. Knowing which one your isolation level forces is the trick.

Default isolation differs! Postgres = READ COMMITTED. MySQL = REPEATABLE READ. Code that "works" on MySQL because of next-key locks may have silent races when ported to Postgres. Always test the same isolation level you'll run in prod.
Same level names, different guarantees. Postgres SERIALIZABLE uses SSI and is fast enough for real use. MySQL REPEATABLE READ blocks phantoms. Oracle skips REPEATABLE READ entirely. SQL Server is lock-based unless you opt in to snapshot. Treat isolation labels as starting points, not promises.
06

Replication — Statement, Row, WAL, Redo — Pick Your Poison

Every production database has a replica. How that replica is kept in sync defines what you can do (read scaling, failover, point-in-time recovery, logical CDC) and what can break (replication lag, conflicts, data loss).

The two families

  • Physical replication — ship the literal bytes of the change log. Replicas are byte-for-byte identical. Fast, simple, but no cross-version or cross-platform replication and you can only replicate the entire cluster.
  • Logical replication — ship a logical description of the change ("row X changed column Y from A to B"). Replicas can be different versions, can subscribe to a subset of tables, can transform data. More flexible, more failure modes.

By database

DBNative physicalNative logicalCommon tooling
PostgresStreaming replication (WAL shipping)Logical replication (since 10), pglogicalPatroni, Stolon, repmgr, pg_basebackup
MySQLBinary log replication (statement, row, or mixed format)Same binlog in ROW format is effectively logicalMHA, Orchestrator, Group Replication, ProxySQL
OracleData Guard (redo apply)GoldenGateRAC + Data Guard
SQL ServerAlways On Availability GroupsTransactional / merge replication, CDCAlways On listener, log shipping

MySQL binlog formats — the source of subtle bugs

FormatWhat's shippedProsCons
STATEMENTThe SQL textCompact, readableNon-deterministic functions (NOW(), UUID()) drift between primary and replica
ROWBefore/after row imagesAlways correct, replication-safeBigger logs; harder to debug
MIXEDStatement, fall back to row when unsafeBest of bothSubtle when you check what was logged
Postgres replica setup — one line
# On the primary's postgresql.conf:
wal_level = replica          # or 'logical' for logical replication
max_wal_senders = 10
hot_standby = on             # replicas accept read queries

# On the replica, after pg_basebackup:
primary_conninfo = 'host=primary.db port=5432 user=replicator'
# Streaming replication is now live. Replica catches up by reading WAL.
Replication lag is the silent killer of "read from the replica" architectures. If you write and then immediately read from a replica, you may not see your own write. Solutions: read-your-writes routing (write-then-read uses primary), session pinning, or causal consistency tokens (PostgreSQL 14+ has pg_current_wal_lsn() handoff).
Physical replication = identical byte copies, simple, all-or-nothing. Logical = flexible, table-level, cross-version, but more failure modes. MySQL binlog ROW format is effectively logical. Postgres ≥10 has logical replication baked in. Oracle's GoldenGate is the gold standard but costs accordingly.
07

Data Types — Beyond INT and VARCHAR

Every database has the same basic types. The interesting differences live in the extras — and Postgres is the clear winner here, with built-in support for things that need third-party libraries everywhere else.

TypePostgresMySQLOracleSQL Server
Auto-increment PKBIGSERIAL, GENERATED AS IDENTITYAUTO_INCREMENTSEQUENCE + trigger or IDENTITY (12c+)IDENTITY(1,1)
BooleanNative BOOLEANTINYINT(1) (no real bool)No native bool; NUMBER(1) or CHAR(1)BIT
UUIDNative UUID typeBINARY(16) or CHAR(36)RAW(16) or VARCHAR2(36)Native UNIQUEIDENTIFIER
JSONJSONB (binary, indexed)JSON (binary)JSON (21c) or CLOBNVARCHAR(MAX) with JSON functions
ArrayNative (any type)JSON-encoded onlyVARRAY / nested tablesTable types, JSON, XML
Range typesNative (int4range, tstzrange, ...)
Network addressNative INET, CIDR
GeospatialPostGIS (de facto standard)Spatial types (basic)Oracle Spatialgeometry / geography
MoneyNUMERIC (recommended) / MONEYDECIMALNUMBERMONEY / DECIMAL
EnumNative CREATE TYPE ... AS ENUMENUM('a','b') column
Vector (embeddings)pgvector extension23ai native
Postgres-only types that earn their keep
-- Range type: a booking that occupies a time interval
CREATE TABLE bookings (
  id    BIGSERIAL PRIMARY KEY,
  room  INT,
  span  TSTZRANGE,
  -- exclusion constraint: no two bookings for the same room can overlap
  EXCLUDE USING GIST (room WITH =, span WITH &&)
);

-- Array column: store tags inline without a join table
CREATE TABLE posts (
  id   BIGSERIAL PRIMARY KEY,
  tags TEXT[]
);
SELECT * FROM posts WHERE tags @> ARRAY['java', 'kafka'];

-- INET: store IPs natively, query by subnet
SELECT * FROM requests WHERE client_ip <<= '10.0.0.0/8'::INET;
MySQL has no real boolean. TINYINT(1) is just an integer that conventionally stores 0/1. WHERE active = TRUE compares against the integer 1; WHERE active = 'TRUE' coerces 'TRUE' to 0 and silently returns rows where active = 0. Use explicit integers and you'll thank yourself.
Postgres has built-in types for half the things you'd otherwise model with extra tables or string-coding. MySQL's type system is plainer. Oracle is rich but its NUMBER type and lack of native BOOLEAN make porting painful. SQL Server lands in the middle.
08

JSON Support — Same Word, Very Different Capabilities

Every modern relational DB stores JSON. None of them are equivalent. Knowing the differences before you bet a project on "we'll just use JSON columns" is worth its weight in incident postmortems.

Postgres — JSONB is the gold standard

Two types: JSON (stored as text, preserves whitespace and key order) and JSONB (binary, parsed once, indexed, fast). Almost everyone uses JSONB. You get path queries, containment operators, indexing via GIN, and full JSONPath since version 12.

Postgres JSONB — queryable, indexable, fast
CREATE TABLE events (id BIGSERIAL, payload JSONB);
CREATE INDEX idx_payload ON events USING GIN (payload jsonb_path_ops);

-- Query: events where payload.type = 'login' AND payload.region = 'IN'
SELECT * FROM events
WHERE payload @> '{"type":"login","region":"IN"}';

-- Extract a field
SELECT payload->>'user_id' AS user_id FROM events;

-- Update a nested value (returns a new JSONB; you write it back)
UPDATE events SET payload = jsonb_set(payload, '{status}', '"ok"') WHERE id = 42;

MySQL — Decent JSON, fewer index tricks

MySQL 5.7+ has a native JSON type stored as binary. You query with JSON_EXTRACT() or the ->/->> operators. Indexing requires generated columns — you create a virtual column from a JSON expression and index that.

MySQL JSON — works, but indexing is a dance
CREATE TABLE events (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  payload JSON,
  user_id BIGINT GENERATED ALWAYS AS (payload->>'$.user_id') STORED,
  INDEX idx_user (user_id)
);
SELECT * FROM events WHERE JSON_EXTRACT(payload, '$.type') = 'login';

Oracle — JSON inside CLOB / BLOB, or native in 21c+

Pre-21c, JSON was a constraint on a CLOB column. From 21c, JSON is a first-class native type with binary storage and indexing. Query syntax uses JSON path ($.field) similar to MySQL.

SQL Server — JSON as text

SQL Server has JSON functions (JSON_VALUE, OPENJSON, FOR JSON PATH) but no native JSON type. You store it in NVARCHAR(MAX) and index via computed columns. Functional, not luxurious.

CapabilityPostgres JSONBMySQL JSONOracle 21c JSONSQL Server
Native binaryYesYesYes (21c+)No (text)
Generic indexGINJSON search index
Indexable via computed colYesYesYesYes
Containment @>YesJSON_CONTAINS()JSON_EXISTSOPENJSON
Partial in-place updateHot updates via JSONB opsYes (5.7+)YesNo
"Just put it in JSON" is a debt that compounds. Querying nested JSON without an index = full table scan. Aggregating JSON fields = parse on every row. If you query a JSON field often, lift it to a column. JSON is great for occasionally-read or schemaless-by-nature data — not as a substitute for proper modeling.
Postgres JSONB is the only relational JSON story you'd build a product on. MySQL JSON works for moderate use. Oracle joined the party properly in 21c. SQL Server treats JSON as text with helper functions. If JSON is your primary access pattern, consider a document DB — but if it's a 10–20% slice, JSONB is plenty.
09

Stored Procedures & SQL Dialects — Where Vendor Lock-in Lives

SQL itself is mostly portable. The procedural language built around it isn't. PL/SQL (Oracle), T-SQL (SQL Server), PL/pgSQL (Postgres), and MySQL's stored procedure dialect are all different. Code that lives in the database is the hardest code to migrate.

DBProcedural languageExternal languages allowed
OraclePL/SQL — the most mature, can do almost anythingJava (stored Java procedures)
SQL ServerT-SQL — close cousin to PL/SQLCLR (C#, VB) procedures
PostgresPL/pgSQL (default) + PL/Python, PL/Perl, PL/V8 (JS), PL/Java, PL/RAnything — extensible
MySQLSQL stored procedures (limited, deprecated by Oracle's stewardship)No external langs

Same logic, four dialects — auto-incremented sequence

Oracle PL/SQL
CREATE SEQUENCE order_seq START WITH 1;

CREATE OR REPLACE TRIGGER orders_bi
BEFORE INSERT ON orders
FOR EACH ROW
BEGIN
  :NEW.id := order_seq.NEXTVAL;
END;
Postgres
CREATE TABLE orders (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  ...
);
-- Or simpler: BIGSERIAL is shorthand for "INT + sequence + default"
MySQL
CREATE TABLE orders (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  ...
);

Returning the inserted row

This is a perfect example of "looks portable, isn't":

Postgres — RETURNING is built into INSERT
INSERT INTO users (email) VALUES ('p@x.com')
RETURNING id, created_at;
MySQL — two-step
INSERT INTO users (email) VALUES ('p@x.com');
SELECT LAST_INSERT_ID();
Oracle — RETURNING ... INTO
INSERT INTO users (email) VALUES ('p@x.com')
RETURNING id INTO :v_id;

UPSERT — every DB's favorite party trick

Postgres — ON CONFLICT
INSERT INTO counters (key, n) VALUES ('a', 1)
ON CONFLICT (key) DO UPDATE SET n = counters.n + 1;
MySQL — ON DUPLICATE KEY UPDATE
INSERT INTO counters (`key`, n) VALUES ('a', 1)
ON DUPLICATE KEY UPDATE n = n + 1;
Oracle — MERGE statement
MERGE INTO counters c
USING (SELECT 'a' AS k, 1 AS v FROM dual) src
ON (c.key = src.k)
WHEN MATCHED THEN UPDATE SET c.n = c.n + 1
WHEN NOT MATCHED THEN INSERT (key, n) VALUES (src.k, src.v);
Stored-procedure code is the deepest form of vendor lock-in. A 50K-line PL/SQL codebase is essentially un-migratable; companies pay Oracle license fees for decades because of this. If portability matters, keep business logic in your app, not in the DB.
SQL is 80% portable. Procedural code is 0% portable. Auto-increment, UPSERT, RETURNING, sequences — every common pattern looks different in every DB. Standardize on one and learn its dialect; or stay in plain SQL and accept some duplication in your app.
10

Full-Text Search & Vector Search — The Modern Asks

"Search" used to mean Elasticsearch. Now relational DBs have caught up enough that for many products, you can skip the extra service.

Full-text search

DBNative FTSNotes
Postgrestsvector + tsquery, ranking via ts_rankGIN-indexed. Multilingual dictionaries. Production-grade.
MySQLFULLTEXT indexes (InnoDB and MyISAM)Boolean + natural-language modes; weaker stemming than PG
OracleOracle Text — extremely powerfulEnterprise feature, full linguistic suite
SQL ServerFull-text indexes with stemming & thesaurusDecent but quirky setup
SQLiteFTS5 module — built-inSurprisingly good for what it is
Postgres FTS — production-ready in 5 lines
ALTER TABLE articles ADD COLUMN tsv TSVECTOR
  GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || body)) STORED;
CREATE INDEX idx_articles_tsv ON articles USING GIN(tsv);

SELECT id, title, ts_rank(tsv, query) AS rank
FROM articles, plainto_tsquery('english', 'replication lag') query
WHERE tsv @@ query
ORDER BY rank DESC
LIMIT 10;

Vector search — the AI/embeddings era

DBVector supportIndex type
Postgrespgvector extension (now installed by AWS RDS, GCP CloudSQL, Azure)HNSW, IVFFlat
Oracle 23aiNative VECTOR type and similarity functionsNative HNSW
SQL Server 2025+Native vector type and similarity search (preview)Limited
MySQL HeatWaveVector support in HeatWave (cloud only)Limited
SQLitesqlite-vec extensionBrute-force + IVF
pgvector — search by semantic similarity
CREATE EXTENSION vector;

CREATE TABLE docs (
  id BIGSERIAL,
  content TEXT,
  embedding VECTOR(1536)            -- e.g. OpenAI ada
);
CREATE INDEX idx_docs_embedding ON docs
  USING hnsw (embedding vector_cosine_ops);

-- Find the 5 docs closest to a query embedding
SELECT id, content
FROM docs
ORDER BY embedding <=> '[0.012, -0.004, ...]'::VECTOR
LIMIT 5;
For most apps in 2026, "Postgres + pgvector + tsvector" replaces both Elasticsearch and a vector DB. Oracle 23ai is closing the gap aggressively. MySQL and SQL Server lag for now. SQLite even has both — perfect for on-device RAG apps.
11

Concurrency & Locking — Where Deadlocks Are Born

Two transactions modifying the same data is the source of every interesting database story. The locking strategies differ in granularity (row, page, table), in scope (data + index keys?), and in whether reads block writes (lock-based) or coexist (MVCC).

Lock granularity

DBDefault lock granularityEscalation
PostgresRow-level for writes (no page locks). Table-level for DDL.None — Postgres tracks locks per-row in shared memory; you can run out of max_locks_per_transaction
MySQL InnoDBRow-level + next-key (gap) locks under RRNone — but gap locks can lock more than you expect
OracleRow-level, no lock escalation, no lock manager (locks stored in data blocks)None — Oracle's "locks are free" boast
SQL ServerRow → page → table lock escalation (default ~5000 row locks)Lock escalation can surprise; ALTER TABLE ... SET (LOCK_ESCALATION = DISABLE)

Lock modes you'll encounter

  • Shared (S) — multiple readers can hold it.
  • Exclusive (X) — only one holder, blocks all others.
  • Update (U) (SQL Server, MySQL) — intent-to-write, prevents lock upgrade deadlocks.
  • Intent (IS, IX) — at the parent level (table) declaring that some child (row) is locked.
  • Gap / next-key (InnoDB) — locks the range between index keys to block phantom inserts under RR.

Explicit row locking — almost portable

Worker queue, the cross-DB way
-- Postgres / MySQL 8.0 / Oracle / SQL Server (with READPAST)
SELECT id, payload FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- The row is yours; other workers skip past it.
MySQL added SKIP LOCKED in 8.0. Postgres has had it since 9.5. SQL Server uses WITH (UPDLOCK, READPAST) as the hint syntax. Worker-queue code written on one is rarely portable to another without rewrites.

Deadlocks

Every DB detects deadlocks (cycles in the lock graph) and aborts one transaction. The differences:

  • Postgres & MySQL InnoDB — detect after a short timeout, pick the cheapest victim (least work done) and roll it back. Your client gets a deadlock error and must retry.
  • Oracle — detects via a cycle-finder thread; rolls back the transaction that closed the cycle.
  • SQL Server — fastest detector; you can set DEADLOCK_PRIORITY to influence the victim choice.
Every relational DB locks rows for writes and detects deadlocks. The differences live in lock escalation (SQL Server), gap locks (InnoDB), and lock storage (Oracle puts locks in data blocks; everyone else uses a lock manager). Always handle deadlock errors with retry logic.
12

Partitioning & Sharding — Living Past a Single Box

Every database eventually outgrows one machine. The relational world splits into two answers: partition the table (still one DB, split internally) and shard the database (multiple DBs, one logical table). Built-in support for each varies wildly.

Partitioning — built-in

DBPartition typesNotes
PostgresRange, list, hash (since 10)Native declarative since 10; mature since 12. Foreign keys to partitioned tables since 12.
MySQLRange, list, hash, key, columns variantsInnoDB only (since 5.7). Limit ~8K partitions per table.
OracleRange, list, hash, composite, interval, reference, systemMost flexible by far; partition advisor and online operations
SQL ServerRange only (via partition functions)Up to 15K partitions; integrates with partition switching for fast loads
Postgres declarative partitioning
CREATE TABLE events (
  id BIGINT,
  ts TIMESTAMPTZ NOT NULL,
  body JSONB
) PARTITION BY RANGE (ts);

CREATE TABLE events_2026_01 PARTITION OF events
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Inserts route automatically; queries with ts predicate prune partitions
SELECT * FROM events WHERE ts >= '2026-02-01';

Sharding — bring your own

Built-in horizontal scale across machines is where the open-source relational DBs don't shine. The common answers:

DBSharding optionWhat it gives you
PostgresCitus extension (now part of Microsoft Azure)Distributed Postgres — transparent sharding, distributed JOINs
MySQLVitess (used by YouTube, Slack, Pinterest)Sharding middleware with online resharding
OracleOracle Sharding (12.2+)Native — system-managed or user-defined
SQL ServerElastic Database / sharded pools on AzureCloud-managed sharding
NewSQLCockroachDB, Spanner, TiDB, YugabyteDBSharding built into the storage layer from day one
Partitioning = splitting one filing cabinet into labeled drawers. The cabinet (DB) stays in one room. Sharding = putting drawers in different buildings. Now you need a routing service to find anything, and cross-building queries get expensive.
Sharding kills cross-shard transactions. If your invariant spans shards (e.g., transfer money between two users on different shards), you need two-phase commit, sagas, or to redesign so the invariant fits on one shard. This is the most common reason teams abandon sharding mid-project.
Partitioning is straightforward in all four major DBs. True sharding is an extension/middleware story (Citus, Vitess) or requires moving to NewSQL. Avoid sharding as long as one big machine handles your load — modern hardware pushes that ceiling much higher than people assume.
13

High Availability & Failover — Who Becomes Primary When?

Replication gives you copies. HA decides what happens when the primary dies — who promotes a replica, when, and how do you stop two primaries from existing at once (split-brain).

DBNative HAHow it elects a new primary
PostgresNone native — Patroni, repmgr, StolonPatroni uses a DCS (etcd/Consul/Zookeeper) for leader election; promotes the most caught-up replica
MySQLGroup Replication, InnoDB Cluster (with MySQL Router)Group Replication uses Paxos-style consensus; usually quorum of 3 or 5
OracleData Guard (replicas) + RAC (active-active on shared storage)Fast Start Failover via Data Guard observer; RAC has no failover — all nodes are active
SQL ServerAlways On Availability GroupsWindows Server Failover Cluster + AG listener

Synchronous vs asynchronous replication

The core trade-off is durability vs latency:

  • Async — primary commits, ships WAL to replicas later. Fast. If primary dies after commit but before ship, those commits are lost.
  • Sync — primary waits for at least one replica to acknowledge before returning success. Slower. No data loss on primary failure.
  • Semi-sync (MySQL) — wait for at least one replica's receipt, not full apply. Middle ground.
  • Quorum (Group Replication, Spanner, Postgres with quorum committee) — wait for majority. Tolerates one failure with no data loss.
Postgres synchronous_commit options
# In postgresql.conf:
synchronous_commit = on              # default — fsync to local WAL
synchronous_commit = remote_write    # wait for replica receipt
synchronous_commit = remote_apply    # wait for replica replay
synchronous_commit = off             # fastest, may lose ~3 commits on crash

synchronous_standby_names = 'ANY 2 (replica1, replica2, replica3)'
# Commit succeeds when any 2 of these 3 replicas ack.
A common pattern: enable sync replication to a single specific replica. The day that replica goes down, every write blocks indefinitely waiting for an ack that never comes. Always specify ANY N or have explicit failover for the sync replica too.
Async = fast, can lose commits. Sync to one = blocks on replica failure. Quorum (ANY N of M) is the modern answer — Postgres, MySQL Group Replication, NewSQL all support it. Oracle RAC sidesteps the question by being active-active on shared storage (expensive but no failover at all).
14

Performance Profile — Who's Faster, and at What?

"X is faster than Y" is meaningless without the workload. Here's the workload-by-workload truth — each row of the table below is a deep story about what the database is actually doing under the hood. Skim the table; read the subsection that matches your real workload.

WorkloadWinnerOne-line why
Single-row PK lookup, very high QPSMySQL InnoDBClustered PK + sequential autoinc + adaptive hash index
Complex analytical queries (joins, window funcs)Postgres, SQL Server, OracleMature cost-based planners + parallel query + JIT
Write-heavy with random keysMyRocks (LSM), Postgres tunedLSM trees absorb random writes; B-trees thrash
Mixed OLTP + reporting (HTAP)Oracle, SQL ServerResource governors + columnstore prevent noisy-neighbor
JSON-heavy with queriesPostgres JSONBBinary format + GIN indexes = indexed path queries
Time-seriesTimescaleDB / BRINHypertables + tiny BRIN indexes on naturally-sorted data
Embedded / on-deviceSQLiteNo server, no IPC; function calls into a C library
GeospatialPostgres + PostGIS20+ years of mature R-tree / GiST spatial indexing

① Single-row PK lookup at very high QPS — MySQL InnoDB

This is the workload InnoDB was built for, and the design decisions compound. Each one is small; together they explain the 30–40% edge MySQL holds on point reads.

The clustered index IS the table. In InnoDB, the primary key B+Tree's leaf nodes contain the actual row data — not pointers to it. A lookup like SELECT * FROM users WHERE id = 42 is one B+Tree traversal and the row is right there. Postgres splits this into two structures: a separate heap (unordered rows) and a B-Tree index that points into it. Index lookup → heap fetch = two cache lines touched instead of one.

Sequential AUTO_INCREMENT keeps the hot region tiny. Monotonic keys (1, 2, 3, …) mean every insert lands at the rightmost leaf of the B+Tree. The next ~1000 rows live in one 16KB page, that page stays pinned in the buffer pool, and new rows are written sequentially to disk — no page splits, no fragmentation. Switch the PK to a UUID and every insert lands at a random spot in the tree; throughput drops 3–5×. This is why "use BIGINT AUTO_INCREMENT, never UUID v4 as your PK" is the loudest community rule in MySQL.

The Adaptive Hash Index — the silent weapon. InnoDB watches access patterns. When certain PK values get hit repeatedly, it builds an in-memory hash index on top of the B+Tree, transparently. Subsequent lookups become O(1) hash hits, skipping the B-tree entirely. You don't configure it. You don't see it in EXPLAIN. It just sits there making hot reads faster. Postgres has no equivalent.

SHOW ENGINE INNODB STATUS — the AHI in action
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Hash table size 4425293, node heap has 1234 buffer(s)
139.93 hash searches/s, 12.41 non-hash searches/s

That ratio (~92% hash hits) is the AHI catching most PK lookups before they ever touch the B-tree.

Tiny per-row overhead. InnoDB rows carry 6 bytes of transaction ID + 7 bytes of rollback pointer = 13 bytes of MVCC header. Postgres tuples carry a 24-byte header (xmin, xmax, cmin, cmax, ctid, infomask). For a small row, that's 30% more on-disk size in Postgres — fewer rows per page, more I/O per lookup.

No version-chain walk for the live row. At READ COMMITTED, InnoDB reads the current row in-place; only if your snapshot needs an older version does it walk the undo log. Postgres always reads the heap tuple, then checks visibility, then potentially walks HOT chains — extra bookkeeping per read.

InnoDB's clustering elegance has a cost: secondary indexes store the PK, not a row pointer. So WHERE email = ? is two B+Tree traversals (email index → PK → row). If your PK is a 36-byte UUID string, every secondary index is bloated by carrying it in every leaf. Yet another reason narrow integer PKs matter in MySQL even more than in Postgres.

Numbers, single 32-core / NVMe / all-in-RAM box: InnoDB ~250K point-read QPS, Postgres ~180K. Sequential inserts: InnoDB ~150K/s, Postgres ~110K/s. The edge is real and consistent — and disappears the moment your query is more complex than WHERE id = ?.

② Complex analytical queries — Postgres, SQL Server, Oracle

The moment a query has 5+ joins, window functions, CTEs, or aggregations across millions of rows, the bottleneck stops being storage and becomes the query planner. This is where Postgres, SQL Server, and Oracle pull ahead — sometimes by 10× or more — over MySQL.

Cost-based optimization with deep statistics. All four DBs are cost-based, but the depth of statistics differs. Postgres tracks column histograms, n-distinct estimates, most-common-value lists, and (since 10) extended statistics over correlated columns. Oracle's optimizer has 40 years of accumulated hints, hashes, and heuristics. SQL Server has cardinality-estimation versions you can choose between. MySQL 8's optimizer is improving — hash joins arrived in 8.0.18, after Postgres had them for two decades — but the planner still chokes on queries that the other three handle without breaking a sweat.

Join algorithms. Postgres, SQL Server, and Oracle implement nested-loop, hash-join, and merge-join, and the planner picks dynamically based on row estimates. MySQL was nested-loop-only until 8.0.18 (yes, you read that right). A query like "join 50M orders to 100K customers on a non-indexed column" was effectively O(N×M) in MySQL — minutes — while Postgres did a hash join in seconds.

Parallel query. Postgres added parallel sequential scan in 9.6 and now parallelizes joins, aggregates, and index scans. Oracle and SQL Server have parallelized everything for decades. MySQL's parallel execution arrived in 8.0 but only for specific operations (CHECK TABLE, parallel index reads in 8.0.14). For a 1B-row aggregate, Postgres with max_parallel_workers_per_gather = 8 finishes in 1/6th the wall time of single-threaded execution.

JIT compilation. Postgres 11+ compiles query expressions to native code via LLVM for queries above a cost threshold — a 30M-row aggregation that interprets one tuple at a time can become a tight machine-code loop. SQL Server and Oracle have their own equivalent (batch-mode execution + adaptive memory grants). MySQL has none.

Window functions, recursive CTEs, lateral joins. All standard since SQL:2003. Postgres has had window functions since 8.4 (2009); SQL Server and Oracle have had them longer. MySQL added them in 8.0 (2018). Code written against PG/SQL Server/Oracle often won't even parse on older MySQL.

A query that runs in seconds on Postgres, minutes on MySQL 5.7
-- "Find each customer's top 3 orders by value, plus a running total"
SELECT customer_id, order_id, amount,
       RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rk,
       SUM(amount) OVER (PARTITION BY customer_id ORDER BY placed_at) AS running_total
FROM orders
WHERE placed_at >= NOW() - INTERVAL '90 days';

Numbers: on a 100M-row analytical workload (TPC-H Q5-style), Postgres 16 with parallel query routinely runs 5–10× faster than MySQL 8 on the same hardware. Oracle and SQL Server typically beat both — at the cost of license fees.

"My MySQL query is slow" almost always means the planner picked a bad plan. FORCE INDEX hints are a common smell. Same query on Postgres rarely needs hints because the planner has more options to consider. If your roadmap has growing analytical complexity and you're on MySQL, plan the Postgres migration before the query latency becomes a crisis.

③ Write-heavy with random keys — MyRocks, Postgres (tuned)

The default storage in every relational DB is a B-Tree. B-Trees are wonderful for ordered access and range scans — and a disaster for sustained random writes at scale. This is the workload that LSM trees (Log-Structured Merge) were invented to handle.

Why B-Trees suffer. Every insert finds its position in the tree (an O(log n) lookup), reads the page, modifies it, and writes it back. If your keys are random (UUIDs, hashed IDs), every insert lands on a different page. The buffer pool churns — pages get pulled in, dirtied, evicted, pulled in again. Page splits cascade. SSDs feel some of this pain less than spinning disks but they still pay the write amplification cost (10–20× more bytes written than logical data).

How LSM trees fix it. An LSM tree accepts writes into an in-memory sorted structure (memtable). When that fills, it's flushed to disk as a sorted file (SSTable). New SSTables stack on top of older ones. A background process compacts them — merging and discarding overwrites. Writes are always sequential. Random updates become append-only.

LSM write path — sequential, always
WRITE arrives:
  1. Append to WAL on disk    // sequential — fast
  2. Insert into memtable      // in-memory sorted

Memtable fills (~64MB):
  3. Flush to SSTable on disk  // sequential write of sorted data

Background:
  4. Compact older SSTables    // merge + dedupe + drop tombstones

MyRocks packages RocksDB as a MySQL engine. Originally built by Facebook to shrink their UserDB MySQL footprint; they reported ~50% space savings and ~50% write reduction vs InnoDB. For SSD-heavy write workloads (1M+ writes/sec), MyRocks does 3–5× the sustained throughput of InnoDB with random keys.

Postgres's middle ground. Postgres is B-Tree-based, but it has tricks that soften random-write pain: HOT (Heap-Only Tuples) updates avoid index updates when no indexed column changed; fillfactor < 100 leaves slack in pages so updates stay in-place; partitioning lets you drop entire chunks instead of deleting row-by-row. With these dials tuned, Postgres can handle very heavy writes — just not at the scale where MyRocks shines.

EngineTypeRandom-write strengthRead penalty
InnoDBB+TreeModerate (page churn under random keys)None
Postgres heapHeap + B-Tree indexModerate (HOT updates help)None
MyRocksLSM (RocksDB)Excellent — sequential alwaysSlight — reads may check multiple SSTables
Cassandra / ScyllaDBLSMExcellent (write-optimized DB)Higher for ranges

The catch: LSM reads are slower because a key may live in the memtable, in any of several SSTables, or have been deleted (tombstoned). Bloom filters mitigate this but don't eliminate it. If your workload is 80% writes / 20% reads, LSM wins. If it's 80% reads / 20% writes, B-Tree wins. For 50/50, benchmark both — there's no shortcut.

④ Mixed OLTP + reporting (HTAP) — Oracle, SQL Server

"Real-time dashboards on the production database" is the dream that haunts every architect. The problem is the noisy-neighbor effect: a reporting query that scans 50M rows starves the buffer pool of OLTP-relevant pages, holds shared locks under lock-based isolation, and saturates I/O bandwidth. OLTP latency spikes from 2ms to 200ms while the report is running.

Oracle's answer: Resource Manager + Materialized Views with Query Rewrite. Resource Manager lets you guarantee CPU, I/O, and parallelism per consumer group — analytics gets capped at 20% CPU, OLTP gets the rest. Materialized views with automatic refresh pre-compute reporting answers; the optimizer can transparently rewrite SELECT SUM(...) queries to hit the MV instead of the base table. Oracle's In-Memory Column Store (12c+) caches the same data in both row format (for OLTP) and columnar format (for analytics) — same buffer pool, two representations.

SQL Server's answer: Resource Governor + Columnstore indexes. Resource Governor is similar to Oracle's. Columnstore indexes are the killer feature — same table, both rowstore (for OLTP) and columnstore (for analytics) indexes. Analytical queries hit the columnstore and get 10–100× speedup with 5–10× compression; OLTP keeps using the clustered rowstore. SQL Server tracks which rows have been recently modified and merges deltas into the columnstore lazily.

SQL Server — clustered rowstore + nonclustered columnstore on one table
-- Clustered rowstore for OLTP point reads
CREATE TABLE orders (
  id BIGINT IDENTITY PRIMARY KEY,
  customer_id BIGINT,
  amount DECIMAL(19,4),
  placed_at DATETIME2
);

-- Nonclustered columnstore for analytics on same table
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_orders_cs
  ON orders (customer_id, amount, placed_at);

-- Point read uses the clustered rowstore (~1ms)
SELECT * FROM orders WHERE id = 42;

-- Aggregate uses the columnstore (~100ms instead of 30s)
SELECT customer_id, SUM(amount)
FROM orders WHERE placed_at >= '2026-01-01'
GROUP BY customer_id;

Why Postgres and MySQL struggle here. Postgres has parallel query (great) but no resource governor (a runaway analytical query can exhaust connections and slow OLTP). Native columnstore is missing — extensions like citus columnar and hydra are still maturing. MySQL has none of these, plus a weaker planner — running heavy analytics on the OLTP master is genuinely dangerous.

The modern alternatives. If HTAP is the requirement, the realistic options in 2026 are: SQL Server with columnstore, Oracle with In-Memory, SingleStore (purpose-built HTAP), or TiDB with its TiKV (rowstore) + TiFlash (columnstore) split. Postgres shops usually solve it by streaming CDC to a separate analytics warehouse (ClickHouse, BigQuery, Snowflake) — works well but adds operational complexity.

"We'll just run analytics on a read replica" is the common fallback, and it works — until the analytics query holds an open transaction for 20 minutes and replication lag spirals because the replica can't apply WAL until the transaction releases. The same problem in different clothing.

⑤ JSON-heavy with queries — Postgres JSONB

Storing JSON is easy in every DB. Querying it with index speed is where Postgres pulls ahead by a margin that's actually hard to overstate.

JSONB is a parsed binary tree. When you INSERT '{"user":"alice","tags":["a","b"]}' as JSONB, Postgres parses it once, validates it, and stores it as a binary structure where keys are interned and values are typed. Every subsequent read avoids re-parsing. MySQL JSON does the same. SQL Server's NVARCHAR(MAX) + JSON functions does not — every query re-parses the text.

GIN indexes — the real superpower. A GIN (Generalized Inverted iNdex) on a JSONB column indexes every key, every value, and every path inside the document. The containment operator @> can then satisfy queries like "find all docs whose type is 'login' AND region is 'IN'" in milliseconds against 100M-row tables.

Postgres JSONB + GIN — production speed
CREATE INDEX idx_events_payload ON events
  USING GIN (payload jsonb_path_ops);

-- Indexed lookup on nested JSON keys — milliseconds even at 100M rows
SELECT id, payload->>'user_id' AS uid
FROM events
WHERE payload @> '{"type":"login","region":"IN"}'
  AND created_at >= NOW() - INTERVAL '1 day';

What the others can do. MySQL JSON requires generated virtual columns to be indexable: extract the field into a column, index that column. Works, but you need to know up front which fields you'll query. SQL Server is in the same boat. Oracle 21c added a proper JSON search index (similar to GIN); pre-21c it was CLOB + function-based indexes.

CapabilityPostgres JSONBMySQL JSONOracle 21c+SQL Server
Re-parse on each queryNo (parsed once)NoNoYes (text-based)
Indexed without knowing fieldYes (GIN over everything)NoYes (search index)No
Indexed for specific fieldYes (expression index)Yes (generated column)YesYes (computed column)
Partial in-place updateLimited (rewrites whole value)Yes (5.7.22+)YesNo

Numbers: a containment query (WHERE payload @> '{...}') against a 50M-row JSONB table with a GIN index typically returns in 5–30ms. Without the index it's a 30-second sequential scan. Same query against MySQL JSON without a generated column = 30+ seconds; with the right generated column + index = ~10ms (but only for that specific extracted field).

JSONB's catch: updates rewrite the whole document. If you have a views counter inside a 50KB JSONB blob and you increment it 1000×/sec, you're writing 50MB/sec of WAL. Lift hot counters into their own columns or Redis. JSONB is for occasionally-mutated, frequently-read shapes — not for hot path partial updates.

⑥ Time-series — TimescaleDB & BRIN indexes

Time-series workloads have a very specific shape: data arrives in time order, queries are almost always range scans by time, you aggregate over windows, and old data is rarely updated. Every part of a general-purpose DB can be specialized around that shape.

BRIN indexes — tiny indexes for sorted data. A B-Tree index on a 100GB time-series table is typically 15–20GB. A BRIN (Block Range INdex) on the same column is ~5MB. BRIN works by storing, for each contiguous block range, just the min/max value of the indexed column. When the column is naturally sorted (like created_at in an append-only logs table), BRIN can skip 99% of pages with a tiny memory footprint. The trade-off: if data isn't sorted, BRIN is useless.

BRIN over a timestamp — 5 MB instead of 20 GB
CREATE INDEX idx_logs_ts ON logs USING BRIN (created_at) WITH (pages_per_range = 128);
-- Query: "last 1 hour of logs" scans < 1% of the table
SELECT * FROM logs WHERE created_at >= NOW() - INTERVAL '1 hour';

TimescaleDB — Postgres turned into a time-series DB. A Postgres extension that introduces hypertables: virtual tables that automatically partition by time into many child tables (chunks). The optimizer routes queries to relevant chunks only, parallelizes across them, and supports continuous aggregates (materialized rollups refreshed incrementally). Add native columnar compression (8–15× compression ratio) for old chunks and you have a serious time-series engine wearing Postgres clothing.

Why purpose-built time-series DBs still win at extreme scale. ClickHouse, InfluxDB, and QuestDB are built ground-up for this workload. ClickHouse routinely ingests millions of rows/sec on a single node and aggregates billions of rows in seconds via vectorized execution + columnar storage. For pure metrics at extreme scale, they outperform TimescaleDB. The trade-off: SQL dialect quirks, weaker transactional semantics, separate operational stack.

OptionStrengthWeakness
Postgres + BRINZero new dependencies, "just an index"No automatic partitioning, no columnar compression
TimescaleDBSQL-compatible, transactional, hypertables, compressionAdds an extension to operate
ClickHouseInsane aggregation speed, columnar, vectorizedSQL dialect, weak updates/deletes
InfluxDBPurpose-built API, retention policies, downsamplingStorage engine has churned across versions

Heuristic: if you're under ~1B rows / 1TB / 100K writes-per-sec, BRIN or TimescaleDB inside Postgres is the right answer — one database to operate. Beyond that, a dedicated TS engine pays for itself in ingestion throughput and query latency.

⑦ Embedded / on-device — SQLite

SQLite isn't competing with the others — it's a different category of database. Every characteristic that "limits" it (no server, no network, one writer at a time) is actually an asset for the workloads it owns.

No IPC means microsecond latency. A query in Postgres or MySQL is a network round-trip (even on localhost: serialize → kernel socket → deserialize → parse → execute → serialize back). A SQLite query is a function call into a C library — nanoseconds of overhead. For "open file, read 1000 rows, close" patterns (browser history, app preferences, IoT sensor log), SQLite is 10–100× faster than client/server DBs on the same hardware.

WAL mode + memory-mapped I/O. Default journal mode (rollback journal) serializes everything. WAL mode lets multiple readers proceed concurrently with one writer — and writers no longer block readers. Add PRAGMA mmap_size = 30GB and SQLite memory-maps the database file directly, bypassing kernel buffers. Reads become pointer dereferences. Insert throughput on a local NVMe with WAL + synchronous = NORMAL routinely hits 100K rows/sec.

SQLite production-tuning pragmas
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;     -- still crash-safe, fsync less often
PRAGMA cache_size = -100000;     -- 100 MB of page cache
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 30000000000;   -- 30 GB mmap window
PRAGMA foreign_keys = ON;
PRAGMA wal_autocheckpoint = 1000;

The one-writer ceiling. SQLite serializes writes globally — even in WAL mode. For a single-user app or a low-write web service, this never matters. For a multi-tenant SaaS doing 10K writes/sec, it's a hard wall. Litestream and LiteFS work around it for some patterns (single-writer + replicated readers), and "one SQLite database per tenant" is a legitimate architecture used by Notion-style products — but at that point you're sharding manually.

Numbers: on a MacBook Pro NVMe with WAL: ~100K simple inserts/sec, ~500K simple reads/sec, query latency consistently under 10µs. No setup, no daemon, no port, no auth. Just a file.

"SQLite is for toy projects" is a myth that won't die. The Linux kernel, every iOS app, every Android app, every Chrome and Firefox install, Tailscale's control plane, Expensify's transaction system — all SQLite. It's the most deployed database in human history by orders of magnitude. Know its constraints (one writer; no network) and respect them, and it scales further than people expect.

⑧ Geospatial — Postgres + PostGIS

If you're doing anything with maps, locations, polygons, routing, or "find everything within 5km of here", PostGIS is the answer. It's the only choice with the depth of features and ecosystem maturity for serious geospatial work — and it's been the answer for 20+ years.

R-tree indexing via GiST. A B-Tree can't index 2D points or polygons (there's no total ordering of shapes in 2D space). PostGIS uses Postgres's GiST (Generalized Search Tree) infrastructure to build R-tree indexes that bound geometries in rectangles, then recursively bound those rectangles. A "find everything within this polygon" query becomes a tree pruning operation that visits a tiny fraction of the table.

The function library is exhaustive. PostGIS implements the full OGC Simple Features SQL standard (300+ ST_* functions): distance, intersection, union, buffer, convex hull, nearest-neighbor, raster operations, network analysis (via the pgRouting extension). Anything you can express on a 2D plane or a sphere, there's a function for it.

PostGIS — restaurants within 1km of a user, ordered by distance
CREATE EXTENSION postgis;

CREATE TABLE restaurants (
  id   BIGSERIAL PRIMARY KEY,
  name TEXT,
  geom GEOGRAPHY(POINT, 4326)   -- WGS84 lat/lng
);
CREATE INDEX idx_rest_geom ON restaurants USING GIST (geom);

-- Find restaurants within 1km of Bangalore, sorted by distance
SELECT id, name, ST_Distance(geom, user_loc) AS meters
FROM restaurants,
     (SELECT ST_MakePoint(77.5946, 12.9716)::GEOGRAPHY AS user_loc) u
WHERE ST_DWithin(geom, user_loc, 1000)
ORDER BY geom <-> user_loc          -- kNN operator — uses GIST index
LIMIT 20;

Alternatives, briefly:

  • Oracle Spatial — comparable depth, proprietary, expensive. Used in defense/government.
  • SQL Server geometry/geography — solid but smaller function library; well integrated with Power BI.
  • MySQL spatial types — basic; lacks the depth of PostGIS function library; geographic (vs Cartesian) computations are limited.
  • SQLite + Spatialite — bring PostGIS-like spatial functions to SQLite; great for offline maps in apps.
  • MongoDB 2dsphere — adequate for point-in-radius queries; not a real geospatial DB beyond that.

Numbers: PostGIS routinely handles 100M+ polygon tables (parcel data, country boundaries, road networks) on commodity hardware. Spatial joins between two such tables complete in seconds with proper GiST indexes. Uber, Airbnb, Strava, and every mapping startup of the last 15 years has had PostGIS somewhere in the stack.

Numbers people quote (rough, single-node, well-tuned)

  • Postgres 16 — 50K–100K TPS for OLTP; pgbench routinely shows 100K+ TPS on modern hardware. Complex analytics 5–10× MySQL.
  • MySQL 8 — 200K+ QPS for simple PK reads; clustered index + AHI give it the edge for that pattern. Loses ground as queries get complex.
  • SQLite — 100K+ inserts/sec on local NVMe with WAL mode; single-writer ceiling.
  • MyRocks — 3–5× InnoDB write throughput on random-key workloads with SSDs; ~50% space saving.
  • Oracle / SQL Server — depend entirely on hardware and edition. They scale further vertically (NUMA-aware, parallel everything) than open-source counterparts; In-Memory / Columnstore tilt HTAP workloads decisively in their favor.
  • TimescaleDB — 1M+ time-series rows/sec ingest on a single node; queries on compressed chunks 10–100× the equivalent on vanilla Postgres tables.
If a benchmark says "X is 3× faster than Y", check the workload. Single-row lookups, OLAP queries, write-heavy, JSON-heavy, time-series, geospatial — each has a completely different winner. A blog-post benchmark rarely matches your workload. Always benchmark your top 5 queries on representative data sizes.
There's no universal "fastest" relational database. MySQL InnoDB wins simple PK reads via clustered index + AHI. Postgres wins complex queries, JSON, full-text, geospatial, and modern features. Oracle and SQL Server win HTAP (mixed OLTP + analytics) via resource governors and columnstore. SQLite wins embedded and on-device. MyRocks wins random-write workloads. TimescaleDB wins moderate-scale time-series. Match the engine to the workload — never the other way round.
15

SQLite — The Relational DB You Use Every Day Without Noticing

SQLite is the most widely deployed database engine in the world. It runs in every iPhone, every Android, every Chrome and Firefox install, on every airplane (literally — it's in Airbus avionics). Not a server. Not a "lightweight version" of anything. A different category of relational DB entirely.

What makes it different

  • No server process. A C library you link in. The "database" is a single file. To connect, you fopen() that file.
  • Serverless (the original meaning). No network protocol, no auth, no port. Your process talks to the file directly via the library.
  • Whole-file locking — only one writer at a time. Multiple readers under WAL mode.
  • Dynamic typing — columns have "type affinity" not strict types. You can put a string into an INTEGER column (it'll store as text). This is unique and divisive.
  • Public domain. Use anywhere, commercial or not, no attribution. Hipp does not believe in licenses.

When SQLite is the right answer

  • Embedded apps — phones, IoT, browser local state, desktop apps.
  • Low-concurrency websites — Litestream/LiteFS now make SQLite viable for production web apps (Tailscale, Pieter Levels' projects).
  • Test fixtures — spin up an in-memory DB in 5ms, run tests, throw it away.
  • File format — Apple, Adobe, and dozens of others use SQLite as a structured file format. Better than inventing your own.
  • Analytics on local files — modern SQLite + DuckDB-style analytics is great for desktop data crunching.

When it's not

  • High write concurrency — only one writer at a time.
  • Network access — you can't connect across machines (technically you can via NFS, but you shouldn't).
  • Fine-grained auth — there's no user system.
SQLite — production-grade in WAL mode
PRAGMA journal_mode = WAL;       -- writers don't block readers
PRAGMA synchronous = NORMAL;     -- fsync less often, still durable on crash
PRAGMA foreign_keys = ON;        -- off by default, surprisingly
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 30000000000;   -- memory-mapped I/O
SQLite's "type affinity" means CREATE TABLE t (n INTEGER) happily accepts INSERT INTO t VALUES ('hello'). The 'hello' is stored as TEXT in that column. Use the STRICT table flag (added 3.37) to enforce real types.
SQLite isn't a "small Postgres" — it's a different kind of tool. Embedded, file-format, serverless. With WAL mode and Litestream, it now powers real production websites. Don't dismiss it.
16

MariaDB vs MySQL — The Fork That Stuck

Oracle bought Sun in 2010, inheriting MySQL. Monty Widenius (MySQL's original author) didn't trust Oracle and forked MySQL into MariaDB. Fifteen years later, MariaDB is everywhere — but the projects have diverged enough that "they're the same" is no longer true.

FeatureMySQL 8MariaDB 11
Default storage engineInnoDBInnoDB (their own fork) + Aria for system tables
JSONBinary native JSON typeJSON is an alias for LONGTEXT with helper functions
UUID typeFunction only (8.0.13+)Native UUID column type (10.7+)
SequencesYes (Oracle-compatible CREATE SEQUENCE)
System-versioned tablesYes — full SQL:2011 temporal tables
Oracle compatibility modeYes (PL/SQL subset)
Group replication / clusteringGroup Replication, InnoDB ClusterGalera Cluster (sync multi-master)
LicenseGPL / commercial dualGPL only
Roadmap driverOracle CorporationMariaDB Foundation (non-profit)
"MariaDB is a drop-in replacement for MySQL" was true around MariaDB 5.5 / 10.0. It is no longer true at MariaDB 11 / MySQL 8. JSON behavior differs, replication protocol changes have appeared, and authentication plugins differ. Always test a migration; never assume.
If you started on MySQL pre-2015 and want to escape Oracle's ownership, MariaDB is a natural path — Galera clustering and sequences are real wins. If you're starting fresh in 2026, MySQL 8 is more widely deployed and its tooling ecosystem is bigger. Most cloud providers offer both.
17

SQL Server & Oracle — Where Enterprise Workloads Live

Open-source DBs dominate web. Proprietary enterprise DBs dominate banks, telcos, ERPs, governments. The features that justify their license fees are less about raw speed and more about completeness — every analytics tool, every compliance feature, every migration path is bundled.

SQL Server — Microsoft's Swiss-army knife

  • T-SQL — the procedural language is genuinely powerful (window functions, recursive CTEs, table-valued parameters).
  • SSAS / SSIS / SSRS — bundled OLAP cubes, ETL, and reporting.
  • Columnstore indexes — game-changing for data-warehouse workloads.
  • Always On AGs — HA story is one of the smoothest.
  • Linux support since 2017 — used to be Windows-only, now happily on Linux containers.
  • Editions — Express (free, 10GB), Developer (free, full features for non-prod), Standard, Enterprise (very expensive).

Oracle — The big iron

  • RAC (Real Application Clusters) — multiple instances share one storage layer. Active-active. The only way to "scale up" a single Oracle DB across machines without rewriting your app.
  • PL/SQL — most powerful procedural language of any RDBMS.
  • Partitioning — best-in-class.
  • Data Guard + GoldenGate — physical and logical replication respectively, both mature.
  • Exadata — Oracle's purpose-built hardware/software stack. Pricey, fast, all-in-one.
  • 23ai — adds native vector search, property graph, JSON-relational duality. Oracle catching up on modern features.
  • License complexity — per-core, per-named-user, per-processor with multipliers; LMS audits are legendary.

Why anyone still pays

  • 30-year-old codebases written in PL/SQL or T-SQL. Re-platforming is a $10M+ project.
  • Compliance certifications (HIPAA, FedRAMP at the highest levels) come pre-built.
  • Vendor support contracts — when production is down, having a paid escalation path matters.
  • Specific features no open-source DB matches yet (RAC for true shared-storage multi-master, Exadata Smart Scan, Oracle Text linguistic features).
SQL Server is for shops standardized on Microsoft. Oracle is for legacy enterprise stacks where migration costs outweigh license fees. Both have legitimate niches; neither is the right choice for a new startup in 2026 unless you have a very specific reason.
18

NewSQL — CockroachDB, Spanner, TiDB, YugabyteDB

Postgres scales vertically. Sharding scales horizontally but breaks transactions. NewSQL is the answer to "can we have SQL + ACID + horizontal scale, please?" — and as of 2026, the answer is "yes, with some caveats."

The contenders

DBWire protocolConsensusNotes
Google SpannerCustom SQL dialectPaxos + TrueTime atomic clocksMulti-region external consistency. Cloud-only.
CockroachDBPostgres wire protocolRaft per range"Survive anything." Open source until BSL change in 2024.
TiDBMySQL wire protocolRaft + PercolatorHTAP — separate TiKV (OLTP) + TiFlash (columnar OLAP).
YugabyteDBPostgres-compatible (YSQL) + Cassandra-compatible (YCQL)RaftOpen source. Postgres-derived query layer.
VitessMySQL wire protocolCustom (uses MySQL replicas)Sharding layer, not a new storage engine. Powers YouTube.
Citus (PG ext)PostgresStandard PG replicationDistributed Postgres tables via co-location. Microsoft Azure.

What you give up

  • Single-node latency. A simple insert that's 0.1ms in single-node Postgres becomes 2–5ms in CockroachDB (consensus round-trip).
  • Full feature parity. Stored procedures, exotic data types, some indexes — usually a subset of what their wire-compatible parent supports.
  • Operational simplicity. Now you're running a distributed system. Backups, restores, schema changes are different beasts.

What you gain

  • Geographic distribution with strong consistency. A row in Mumbai stays consistent with reads in San Francisco.
  • Linear scale-out. Add a node, get more capacity. No sharding decisions.
  • Survive failures. Lose a node (or a whole region) — no downtime.
  • One database, many regions. Compliance scenarios where data residency matters.
Postgres = one excellent restaurant. Add chefs (CPU), bigger kitchen (RAM) — but only one location. NewSQL = a restaurant chain with synchronized menus. Every branch agrees on what's served. More overhead per dish (consensus), but you can open a branch in any city, and any branch closing doesn't shut the chain.
If you fit on one Postgres, use one Postgres. If you need multi-region strong consistency or true horizontal scale without sharding pain, NewSQL is genuinely the answer in 2026. CockroachDB and YugabyteDB are the realistic open-source-ish options; Spanner is the cloud-managed gold standard.
19

Migration Pitfalls — Moving From One to Another

"Just port the SQL" is the optimist's plan. Real migrations break on the dialect differences, the data-type coercions, the procedural code, and the operational habits.

The classic landmines

  • Quoting and case sensitivity. Postgres folds unquoted identifiers to lowercase; MySQL is case-sensitive on Linux, case-insensitive on Mac/Windows. Code that "works locally" breaks in prod.
  • Auto-increment semantics. MySQL's AUTO_INCREMENT never reuses values; Postgres sequences can have gaps. Code that depends on contiguous IDs breaks.
  • Empty string vs NULL. Oracle treats '' as NULL; Postgres and MySQL treat them as distinct. Migrations Oracle → anything else surface this immediately.
  • Date/time semantics. MySQL stores TIMESTAMP in UTC and converts on read; Postgres's TIMESTAMPTZ is similar but the conversion rules differ. DATETIME in MySQL has no timezone at all.
  • Boolean type. Postgres has it; MySQL fakes it with TINYINT; Oracle has none. Cross-DB ORMs handle this differently.
  • NULL ordering. Postgres: NULLs LAST in ASC by default. MySQL: NULLs FIRST. ORDER BY x ASC NULLS LAST works on Postgres only.
  • Concurrency hot-spots. A pattern that survives in MySQL InnoDB (auto-increment hot-spot) may underperform in Postgres (sequence cache contention) — and vice versa.
  • Stored procedures. See section 9 — basically rewrite from scratch.

What a real migration looks like

  1. Schema port — types, constraints, indexes (1–2 days for small, weeks for large).
  2. SQL port — replace dialect-specific syntax. Tools: pgloader, Ora2Pg, AWS SCT.
  3. Procedural code rewrite — the long pole. Often easier to move to application code.
  4. Data move — bulk load + ongoing CDC until cutover.
  5. Cutover with rollback plan — usually a freeze window.
  6. Operational re-learning — monitoring, backups, alerting all need re-instrumentation.
Migrations that fail rarely fail at the database level. They fail because the team underestimates how much business logic lives in stored procedures, triggers, and unspoken behaviors of the old DB. Audit those first, not last.
SQL is 80% portable, the DB itself is 20% portable. Plan for the 20%: dialect, procedures, ops. Use tooling (pgloader, Ora2Pg) for mechanical work. Budget more time for the cultural shift than the technical one.
20

Choosing Cheat Sheet & Rapid-Fire Q&A

Distilled. If you remember nothing else from this page, remember this.

Picking the database — by what you're building

What you're buildingDefault pickWhy
Startup MVP, anything in 2026PostgresFeatures, JSON, vectors, FTS, modern SQL — out of the box
High-volume read-heavy web appMySQL 8 or PostgresBoth handle it; pick the one your team knows
Mobile / desktop / embeddedSQLiteNo server, single file, ubiquitous
Enterprise with existing PL/SQLOracleMigrating off is a years-long project
Shop standardized on .NET / MicrosoftSQL ServerTooling and integration win it
Need multi-region strong consistencyCockroachDB / SpannerNewSQL solves this; nothing else does
Time-series at scaleTimescaleDB or InfluxDBTS-specific compression and partitioning
Geospatial-heavyPostgres + PostGISThe reference implementation
Existing MySQL but want better governanceMariaDB or stay on MySQL 8Both are fine; differences are subtle
In one sentence, when should I pick Postgres over MySQL in 2026?

Default to Postgres unless you have a specific reason not to — it has caught up on read performance, surpassed on features (JSON, FTS, vectors, types, transactional DDL), and the operational tooling is now first-class on every cloud.

Why does anyone still use Oracle?

Existing code and existing contracts. PL/SQL codebases, certified compliance bundles, mission-critical apps with seven-figure replacement costs. New greenfield projects rarely pick Oracle unless they're inside an Oracle-shop already.

Is MySQL still relevant in 2026?

Yes. It powers a huge slice of the internet (Booking, Uber, Slack via Vitess, GitHub, WordPress sites). It's faster than Postgres at the simplest read patterns and the ecosystem is enormous. The "Postgres is universally better" narrative is overstated — both are excellent.

What's the single biggest behavioral difference I'd notice porting from MySQL to Postgres?

Default isolation. MySQL = REPEATABLE READ (with phantom prevention). Postgres = READ COMMITTED (looser). Code that quietly relied on RR semantics will race on Postgres unless you also set SET TRANSACTION ISOLATION LEVEL REPEATABLE READ.

When would I pick Cockroach over Postgres?

Multi-region, strong-consistency requirement; or you genuinely outgrew a single beefy Postgres (rare — modern hardware pushes that ceiling very high). For most apps, vertical-scaled Postgres + read replicas beats CockroachDB.

Are stored procedures still a good idea?

Rarely. They lock you to one DB, are hard to test and version, and modern app servers are fast enough that the round-trip cost is negligible. Use them only when truly latency-critical (set-based operations on millions of rows) or for security boundaries.

SQLite for production websites — is that a real thing?

Yes, with caveats. Combined with WAL mode + Litestream/LiteFS for replication, SQLite powers production sites at moderate scale (Tailscale's coordination plane, Pieter Levels' projects, Expensify's internal services). One-writer-at-a-time is the ceiling.

What's the most overrated database feature?

"Schemaless" / no-migration-needed. Every long-lived schema converges on a contract — the only question is whether the DB enforces it or your app does. Migrations are not the problem; pretending you don't need them is.

What's the most underrated relational feature?

CHECK constraints and partial indexes. A CHECK (status IN ('open','closed','cancelled')) or a CREATE INDEX ... WHERE status = 'open' can replace whole classes of application bugs and queries. Free correctness/performance. Use them.

If I had to learn one DB deeply in 2026 — which?

Postgres. It's the database with the most feature breadth, the most active development, the best modern ecosystem (pgvector, TimescaleDB, Citus, PostGIS, pgRouting, FDW), and the lowest cost-of-knowledge — your skills transfer to managed cloud services on AWS, GCP, Azure, and dozens of hosted providers.

There is no single "best relational database". There's the one whose access patterns, scale ceiling, operational model, and ecosystem match your project. In 2026, Postgres is the safest default for greenfield work; MySQL is fine if you're already there; SQLite is undersold for embedded and low-concurrency web; Oracle and SQL Server are enterprise legacies; NewSQL solves problems most teams don't have yet but might tomorrow.