
Design & Development

A curated set of deep dives on system design, low-level design, and backend engineering, written for interview prep and built around case studies and battle-tested production patterns.

5 Stack Sections

01 High-Level Design (HLD)

Architecture-level system design — the 4-step interview framework, a reference catalog of patterns & technologies, and 24 worked case studies covering every famous "design X" prompt with capacity math, trade-offs, and production-grade detail.

Case Studies

HLD · Security

OTP Validation System — HLD Design

Production-ready OTP architecture: Redis + Postgres split, bcrypt hashing, 5-layer brute-force defense, per-use-case TTLs, replay protection, and every mistake to avoid.

HLD · Case Study

WhatsApp — HLD Design

End-to-end encrypted messaging at 2B-user scale — Signal Protocol's X3DH + Double Ratchet, multi-device fan-out (server can't read your messages), group sender keys, WebRTC for voice/video with TURN fallback, and the connection-server cluster holding 50K WebSockets each.

HLD · Case Study

Top-K Service — HLD Design

Trending Now from 100K events/sec — Count-Min Sketch + Min-Heap for approximate top-K in fixed memory regardless of cardinality, sliding windows via per-minute sketch buckets, Flink-driven aggregation, and the trade-off vs exact counting.
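
The Count-Min Sketch idea named above can be illustrated in a few lines of Python. This is a minimal illustration under stated assumptions, not the case study's implementation: the names `CountMinSketch` and `top_k`, the width and depth defaults, and the salted BLAKE2b row hashes are all hypothetical, and a small dict scanned for its weakest entry stands in for the min-heap to keep the example short.

```python
import hashlib
import heapq


class CountMinSketch:
    """Fixed-size frequency estimator; estimates can overcount but never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        # One hash per row, derived by salting BLAKE2b with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._columns(item):
            self.rows[row][col] += 1

    def estimate(self, item):
        # The least-collided row gives the tightest upper bound.
        return min(self.rows[row][col] for row, col in self._columns(item))


def top_k(events, k):
    """Stream events through the sketch while tracking at most k candidates."""
    sketch = CountMinSketch()
    best = {}  # candidate item -> latest estimate; never more than k entries
    for item in events:
        sketch.add(item)
        est = sketch.estimate(item)
        if item in best or len(best) < k:
            best[item] = est
        else:
            weakest = min(best, key=best.get)
            if est > best[weakest]:
                del best[weakest]
                best[item] = est
    return heapq.nlargest(k, best.items(), key=lambda kv: kv[1])
```

Memory stays at `width * depth` counters plus `k` candidates no matter how many distinct items arrive, which is the fixed-memory property the blurb refers to; the price is that counts are upper-bound estimates rather than exact values.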

HLD · Case Study

Ad Click Aggregator — HLD Design

1M clicks/sec where every click is billed — exactly-once aggregation via Kafka + Flink with idempotency keys, hot-ad sub-sharding, real-time Redis dashboards alongside daily Spark reconciliation against payment networks, fraud detection, and PCI-style audit.

HLD · Case Study

Google Docs — HLD Design

Real-time collaborative editing with sub-200ms keystroke latency — Operational Transformation for conflict resolution, Doc Session Server holding the canonical op log per document, offline merge on reconnect, op-log + snapshot persistence, and revision history reconstruction.

HLD · Case Study

Dream11 — Fantasy Sports at IPL Scale

10M concurrent users at toss, 500K team-saves/sec, 12M-entry Mega Contest leaderboards refreshed every 5s — the three-plane split (transaction / stream / real-time), and the reasoning behind every tech pick: Postgres vs MySQL vs Oracle, Aerospike vs Redis for wallet, Cassandra vs DynamoDB, Redis ZSET vs SQL ORDER BY, ALB vs NLB vs HAProxy, Flink vs Spark Streaming.

HLD · Case Study

Payment System — HLD Design

Move money without losing or duplicating a paisa: idempotency-key + DB UNIQUE for safe retries, double-entry ledger with sum-to-zero invariants, Temporal saga with compensating actions for multi-step rollback, fraud scoring, RBI-compliant tokenization to keep raw card numbers out of your servers, and a Razorpay-style India context (UPI, 3DS, NEFT/IMPS settlement).
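
To make the idempotency-key and sum-to-zero ideas concrete, here is a small in-memory Python illustration. It is a sketch under assumptions, not the case study's code: the `Ledger` class, its method names, and the account labels are hypothetical, and a Python set stands in for the database UNIQUE constraint on the idempotency key.

```python
class Ledger:
    """In-memory double-entry ledger; amounts are integer paise to avoid float rounding."""

    def __init__(self):
        self.entries = []       # append-only (account, signed amount) rows
        self._applied = set()   # stand-in for a UNIQUE index on idempotency keys

    def post(self, idempotency_key, legs):
        """Apply a balanced set of legs once; a retried key is a no-op returning False."""
        if idempotency_key in self._applied:
            return False
        if sum(amount for _, amount in legs) != 0:
            raise ValueError("legs must sum to zero")
        self.entries.extend(legs)
        self._applied.add(idempotency_key)
        return True

    def balance(self, account):
        return sum(amount for acct, amount in self.entries if acct == account)


ledger = Ledger()
legs = [("customer", -49900), ("merchant", 49900)]  # a Rs 499.00 payment
ledger.post("pay-001", legs)   # applied
ledger.post("pay-001", legs)   # client retry: ignored, no double debit
```

Because every posting must net to zero, the sum over all entries is always zero, so any imbalance found later points to a bug or tampering rather than a normal state.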

HLD + LLD · Payments & Security

Card Tokenization at 50K TPS — HLD + LLD

Build a PCI-DSS-grade vault that swaps 16-digit PANs for opaque tokens 50,000 times a second — envelope encryption (DEK per record, KEK in HSM), random vs HMAC vs FPE tokens, vault/metadata DB split to shrink PCI scope, idempotent tokenize with Redis SET NX, audit firehose via Kafka + ClickHouse, GDPR crypto-shredding, and the full Java class diagram + Builder/Strategy/Facade code for the interview whiteboard.

HLD · Case Study

Distributed Cache — HLD Design

Build Redis Cluster from first principles — consistent-hashing ring with virtual nodes, sync vs async replication, LRU/LFU/TTL eviction trade-offs, write-through vs write-around vs write-back, hot-key splitting, cache-stampede protection via request coalescing, and Sentinel-driven failover.
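
The consistent-hashing ring with virtual nodes can be illustrated briefly in Python. This is a hedged sketch: the `HashRing` class, the `node#i` virtual-node naming, the default of 100 virtual nodes, and the SHA-256-based position function are assumptions chosen for illustration, not details taken from the case study.

```python
import bisect
import hashlib


def _position(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> physical node
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node is placed on the ring many times to even out load.
        for i in range(self.vnodes):
            point = _position(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # First virtual node clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

Removing a node only reassigns the keys that node owned; every other key keeps its owner, which is the property that makes resizing a cache cluster cheap compared with `hash(key) % N`.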

HLD · Case Study

Job Scheduler — HLD Design

"Send reminder in 3 days" at billions-of-jobs scale — time-bucketed storage so finding due jobs is O(1), two-tier cold-DynamoDB + hot-Redis-ZSET split, leased execution for crash safety, ZooKeeper-driven partition assignment, and exactly-once-effective via idempotency keys.

HLD · Case Study

Pastebin — HLD Design

The metadata-vs-blob plane split that makes a paste service scale — KGS-backed unique keys, S3 for content, MySQL/Cassandra for metadata, multi-tier cache, and the 5:1 read:write ratio that drives every choice.

HLD · Case Study

Instagram — HLD Design

Three planes — Upload, Serve, Feed — with photo sharding by photo_id, fan-out vs fan-in for news feed, and the celebrity-user hybrid that makes 100M followers tractable. 1425TB of blobs over 10 years, plus the exact ER diagram + capacity math.

HLD · Case Study

Facebook Messenger — HLD Design

From polling to WebSockets — chat servers each holding 50K open connections, HBase for time-sorted message storage, Kafka for cross-server fan-out, presence service that scales to 500M users without broadcast storms.

HLD · Case Study

Twitter — HLD Design

325K reads/sec timeline assembly via the push/pull hybrid — fan-out-on-write for normal users, pull for celebrities like Elon, merge on read. 64-bit time-sortable tweet IDs, Cassandra sharded by tweet_id, plus trending topics & who-to-follow.
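
One way to picture a 64-bit time-sortable ID is to pack a timestamp into the high bits so that numeric order follows creation order. The Python below is only an illustration: the 41/10/12 bit split is a Snowflake-style assumption, and `CUSTOM_EPOCH_MS`, `make_id`, and `timestamp_of` are hypothetical names; the case study may use a different layout.

```python
CUSTOM_EPOCH_MS = 1_600_000_000_000  # arbitrary illustrative epoch, an assumption

TIMESTAMP_BITS = 41  # milliseconds since the custom epoch
SHARD_BITS = 10      # which generator produced the ID
SEQ_BITS = 12        # per-millisecond counter on that generator


def make_id(timestamp_ms, shard_id, sequence):
    """Pack (time, shard, sequence) into one integer that fits in 63 bits."""
    elapsed = timestamp_ms - CUSTOM_EPOCH_MS
    if not 0 <= elapsed < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    if not 0 <= sequence < (1 << SEQ_BITS):
        raise ValueError("sequence out of range")
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | sequence


def timestamp_of(tweet_id):
    """Recover the creation time in milliseconds from an ID."""
    return (tweet_id >> (SHARD_BITS + SEQ_BITS)) + CUSTOM_EPOCH_MS
```

Since the timestamp occupies the most significant bits, sorting IDs numerically sorts tweets by creation time without a separate timestamp column or secondary index.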

HLD · Case Study

YouTube / Netflix — HLD Design

Why CDN is non-negotiable for video — three-plane architecture (Upload, Transcode, Serve) with adaptive bitrate streaming, perceptual-hash dedup, and the 1TB/sec egress problem solved by edge POPs. 25GB/sec ingest math + transcode pipeline.

HLD · Case Study

Typeahead Suggestion — HLD Design

In-memory trie with offline frequency updates — why SQL LIKE can't do 60K QPS in under 200ms, how partition-by-hash + aggregator merge handles "give me top 10 for prefix sy", and the EMA-based ranking that makes trends bubble up.

HLD · Case Study

API Rate Limiter — HLD Design

Five algorithms compared (fixed window, sliding window, sliding-with-counters, token bucket, leaky bucket), the check-then-increment race condition solved atomically with Redis Lua, and the IP-vs-user-vs-API-key hybrid that protects /login without locking real users out.
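
Of the five algorithms listed, the token bucket is compact enough to show here. The Python below is a single-process illustration with hypothetical names (`TokenBucket`, `allow`, the injectable `clock` parameter); it does not cover the distributed Redis Lua version the case study discusses.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `refill_per_sec` rate."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock          # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1):
        """Return True and spend tokens if the request fits, else False."""
        now = self._clock()
        # Refill lazily based on elapsed time, capped at capacity.
        self._tokens = min(
            self.capacity,
            self._tokens + (now - self._last) * self.refill_per_sec,
        )
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Refilling lazily on each call avoids a background timer per key, so the per-key state is only two numbers: the token count and the last-seen time.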

HLD · Case Study

Twitter Search — HLD Design

Inverted index over 730 billion tweets in under 200ms — shard-by-tweet_id with aggregator fan-in, per-shard local indexes, the reverse-index trick that makes crash recovery fast, and the ranking pipeline that scores recency + popularity + engagement.

HLD · Case Study

Web Crawler — HLD Design

15B pages in 4 weeks at 6200 pages/sec — sharded URL frontier with per-host politeness queues, robots.txt cache, document & URL dedupe via SHA checksums (no bloom filters!), checkpointing for week-long crawl resilience, and crawler-trap defenses.
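
The headline rate follows directly from the two figures in the blurb, and the arithmetic can be checked in a couple of lines of Python. The function name `required_pages_per_sec` is hypothetical; only the 15B-page and 4-week inputs come from the text.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800


def required_pages_per_sec(total_pages, weeks):
    """Average fetch rate needed to finish `total_pages` within `weeks`."""
    return total_pages / (weeks * SECONDS_PER_WEEK)


rate = required_pages_per_sec(15_000_000_000, 4)
print(f"{rate:.0f} pages/sec")  # 15e9 pages / 2,419,200 s, prints "6200 pages/sec"
```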

HLD · Case Study

Facebook Newsfeed — HLD Design

Pre-computed personalized feeds for 300M DAU — the push/pull/hybrid trade-off, ML ranking by relevance + recency + engagement, multi-tier cache (in-process → Redis → DB), and how new posts hit followers' feeds within 5 seconds.

HLD · Case Study

Yelp / Nearby Friends — HLD Design

QuadTree spatial index for "find me ramen within 1 mile" across 500M places — why fixed grids fail in Manhattan, how dynamic 4-way splits keep leaf density uniform, doubly-linked leaves for fast neighbor traversal, and the QuadTree-Index reverse map for crash recovery.

HLD · Case Study

Uber Backend — HLD Design

167K driver location updates/sec without melting the QuadTree — the DriverLocationHT in-memory hash table that absorbs the firehose, lazy 15-second QuadTree refresh, and the Notification Service pub/sub that pushes live driver positions to subscribed riders.

HLD · Case Study

Ticketmaster / BookMyShow — HLD Design

50K fans hitting "buy" on the same 200 seats at 09:00:00.001 — SERIALIZABLE isolation + SELECT FOR UPDATE to prevent double-bookings, ActiveReservationsService with 5-min holds, and the WaitingUsersService FIFO queue that wakes the next buyer when a hold expires.

HLD · Case Study

TinyURL / Short URL Service — HLD Design

Three-pass story arc from "one MySQL box" to a sharded NoSQL store fronted by a CDN, Memcached, and a Key Generation Service that pre-generates 6-char keys offline so the write path never collides — write/read/key-gen split, 20K redirects/sec, full capacity math.

HLD · Case Study

Dropbox / Google Drive — HLD Design

File sync at planet scale: 4 MB chunking, in-line dedup, presigned-URL uploads, sharded metadata, long-poll notification fabric, conflict resolution, and the data-vs-control-plane split that makes it all work.

HLD · Case Study

LeetCode / Online Judge — HLD Design

Running strangers' code without setting the host on fire: gVisor/Firecracker sandboxes, async submission queue, WebSocket verdict push, Redis ZSET leaderboards, and the web-tier vs. judge-tier split that makes 5K hostile submissions/sec routine.

02 Databases

Storage choices decide every other box in the diagram. Family-by-family comparisons, deep dives into specific engines, and the honest "stay on Postgres" decision tree — everything you need to defend a DB pick in design review or an interview.

03 Low-Level Design (LLD)

Object-modeling and class-design problems — first the framework that fits any 45-minute LLD round, then eight worked case studies with full Java code, state machines, and the gotchas Grokking glosses over.

Case Studies

LLD · Case Study

Vending Machine — LLD Design

End-to-end object model, state machine, and class design for the classic LLD question — from requirements to code, with trade-off discussions at each step.

LLD · Case Study

Parking Lot — LLD Design

A multi-floor smart parking lot built from a paper-ticket booth — Singleton + Strategy + State + Observer, full Java code, lost-ticket flow, peak-hour pricing decorator, and every gap in Grokking's classic answer plugged.

LLD · Case Study

ATM System — LLD Design

A real ATM — 8-state machine for the device, Strategy for transactions, Chain of Responsibility for cash dispensing in mixed denominations, 2-phase debit so a jam never costs the customer, and full Java code with every Grokking gap plugged.

LLD · Case Study

Library Management — LLD Design

A real lending library — Repository search (not HashMap), FIFO ReservationQueue per book, BookItem state machine, Observer-based notifications, FineStrategy per member-type, and Grokking's classic single-slot reservation bug fixed.

LLD · Case Study

Movie Ticket Booking — LLD Design

A real BookMyShow-style booking — atomic seat-locking with TTL, Saga payment flow with compensations, decorator pricing for peak-hour + premium-seat surcharges, and the concurrency story Grokking glosses over (three users clicking F-12 at the same instant).

LLD · Case Study

Notification Service — LLD Design

Multi-channel sender (Email/SMS/Push/WhatsApp) with the API-vs-worker split that keeps Order Service unblocked when SendGrid is slow — Strategy + Factory + Decorator + retries with backoff, idempotency at two layers, and rate limits per user & provider.

LLD · Case Study

MakeMyTrip — LLD Interview Design

Full interview-grade design: requirements, entities, 10 design patterns mapped to real variability, booking state machine, concurrency, and 15 follow-up cross-questions with model answers.

LLD · Case Study

Hotel Management — LLD Design

End-to-end hotel system design: actors & use cases, class/ER diagrams, full MySQL DDL with composite indexes, booking concurrency with pessimistic locks, sequence flows, and interview Q&A.

05 Backend & Tech Stack

The day-to-day toolkit of a backend engineer — frameworks, message brokers, databases, containers, browser storage. Each guide is interview-grade with the production gotchas baked in.

Backend Fundamentals

Backend · Quick Reference

Backend Fundamentals — DNS, HTTP, Docker, Circuit Breaker

How DNS resolves & caches, the 9 HTTP methods, Dockerfile anatomy, VM vs container deployment, API Gateway vs Load Balancer, and circuit breaker interview Q&A.

Distributed Systems · Deep Dive

Caching Strategies in Distributed Systems

Every cache pattern in production — Cache-Aside, Read/Write-Through, Write-Behind, Refresh-Ahead — plus eviction policies (LRU, LFU, W-TinyLFU), consistent hashing, the five famous failure modes (stampede, avalanche, penetration, hot key, big key), and a real 10-component cache architecture walked through end-to-end.

Java vs Go · Interview Perspective

Java vs Go — Why Java Still Wins in 2026

A short, interview-friendly take on why Java is still the better backend choice in 2026. Virtual threads (Java 21+) match goroutines, the ecosystem is unmatched, the JVM beats AOT at steady state — plus a clean template answer for "Java or Go?" in interviews.

High Availability · Deep Dive

Stop the Load Balancer Being a Single Point of Failure

A load balancer exists to kill single points of failure — so what happens when it becomes one? Built up failure-by-failure: redundant active-passive pairs with VRRP floating IPs, active-active via ECMP & BGP anycast, health-checked GSLB above, the split-brain trap, a failover traced second-by-second, and how AWS/GCP managed LBs hide (most of) it.

Backend Interview · Q&A

Backend & System Design — Q&A Walkthrough

Story-driven answers to the questions that show up in senior backend rounds — architecture walkthroughs, monolith vs microservices, event-driven flow, sagas & idempotency, observability, SQL/NoSQL, Node.js event loop, resilience patterns (circuit breaker, DLQ, bulkhead), and the "draw your project" follow-ups.

intervue.io Question Bank · Q&A

intervue.io Java Backend — Every Question, Answered

Every question from the intervue.io question bank put to a candidate interviewing for a Java application role, deduplicated and grouped by skill: write-through vs write-behind & cache stampede, Spring transactions & AOP self-invocation, circuit breakers/saga/CQRS, the transactional outbox & Kafka ordering, LRU/idempotency/back-pressure coding, query tuning & sharding-vs-replicas, plus stakeholder communication. Each is answered the way you'd say it in the room.

Tech Lead Interview · Solutions

Tech Lead Interview — Solutions & Talking Points

The full tech-lead loop answered from the lead's chair — system design under judgment (clarify, trade-offs, what breaks first), architecture calls (monolith vs micro, tech debt, build/buy, definition of done), and the people half (mentoring, conflict & disagree-and-commit, delivery, incidents, culture) — every answer STAR-structured with a "what it probes" note.