Kafka Interview Questions — A Storytelling Guide

What Kafka Actually Is — Distributed Append-Only Log, Not a Message Queue

Almost every Kafka interview opens with the misconception that Kafka is "just another queue like RabbitMQ". It isn't. Kafka is a distributed, partitioned, replicated commit log with consumers that track their own position. That single architectural choice changes everything downstream.

Riya joins a fintech that runs on RabbitMQ. Each consumer "takes" a message off the queue, the broker deletes it after ack. She switches to Kafka and asks: "where do I delete the message after I process it?" Her tech-lead smiles — "you don't. The broker keeps it for 7 days, and you just remember where you stopped reading."

Is Kafka a message queue? How is it different from RabbitMQ or SQS?

Kafka is a distributed append-only log. Producers append records to a topic; the broker keeps every record on disk for a configured retention period (default 7 days). Multiple consumer groups can read the same topic independently, each tracking its own offset. Nothing is "consumed" or deleted on read.

RabbitMQ / SQS are queues: a message is routed to a consumer, acked, and removed. Once it's gone, no other consumer sees it.

Property	Kafka	RabbitMQ / SQS
Data model	Append-only log	Queue / exchange
Storage	On disk, retained for days	In memory until acked
Multi-consumer	Each group reads independently	Each message goes to one consumer
Ordering	Per partition	FIFO per queue (varies)
Replay	Reset offset, re-read	Not natively supported
Throughput	Millions msg/sec	Tens of thousands

RabbitMQ is a letter being delivered — once Aman opens it, it's gone. Kafka is a newspaper printed daily — Aman, Riya, and Karthik can each read the same Tuesday issue, in their own time, at their own pace.

Lead with "append-only log" and "consumers track their own offset". That single sentence shows you understand the architecture, not the API surface.

Kafka isn't a queue — it's a durable, replayable log. Multiple readers, retained on disk, replay-friendly. Every other Kafka concept (consumer groups, offsets, log compaction, replay) flows from this design.

Topics, Partitions & Offsets — The Three-Level Address

Every Kafka record has a three-part address: topic, partition, offset. If you can explain why all three are needed and what each one buys you, you've already cleared the first 5 minutes of any Kafka interview.

What are topics and partitions in Kafka? Why partition at all?

Topic — a named stream of records. Logical grouping (e.g., orders, payments).
Partition — an ordered, immutable sequence of records inside a topic. A topic has N partitions; each partition lives on a broker. Partitioning is what makes Kafka horizontally scalable.
Offset — a monotonically increasing 64-bit ID assigned to each record within a partition. Once written, the offset never changes.

Records are ordered within a partition, never across partitions. So if order matters (a user's events, a single order's lifecycle), they must all land on the same partition — usually by using the same key.

Mental model

Topic: orders   (4 partitions, replication factor 3)

P0:  [m0][m1][m2][m3]...           offsets 0..3
P1:  [m0][m1][m2]...               offsets 0..2
P2:  [m0][m1][m2][m3][m4][m5]...   offsets 0..5
P3:  [m0][m1]...                   offsets 0..1

producer.send(new ProducerRecord("orders", "order-42", payload))
   // hash("order-42") % 4  →  partition 2
   // → broker that owns P2 → appended at next offset

"More partitions = more parallelism" is true — but only up to a point. Each partition is an open file handle on the broker, leader election state on the controller, and a fetch loop on every consumer. 100K partitions per cluster is a tuning problem; 1M is a horror story.

Topic = stream. Partition = the unit of parallelism & ordering. Offset = the cursor. Kafka scales by adding partitions; ordering only holds within one.

The Producer — What Actually Happens When You Call `send()`

Most candidates think producer.send(record) sends one message over the network. It doesn't. There's a serializer, a partitioner, an in-memory buffer, a batching loop, an I/O thread, and a retry handler — all between your call and the broker. Knowing the path is what separates a junior from a senior.

Aman complains: "I send 10,000 events a second but my Kafka throughput is only 2K." Tech-lead checks the producer config: linger.ms=0, batch.size=16384. "You're sending one message per request — there's no batching. Set linger to 5ms and watch throughput jump 10x."

Walk me through what happens between producer.send() and the message hitting the broker.

Serialize the key and value (StringSerializer, AvroSerializer, etc.).
Partition — if you set a key, partition = murmur2(key) % numPartitions. No key → sticky partitioner picks a partition and sticks with it for the current batch.
Append to in-memory record accumulator — one queue per partition. The send() call returns a Future immediately.
I/O thread (Sender) drains accumulator batches when either: batch.size bytes filled, or linger.ms elapsed since the first record arrived.
Send to leader broker. The broker writes to the partition log, replicates per acks, returns ack.
If the request fails (timeout, NotLeaderForPartition), the producer retries per retries + retry.backoff.ms.

Configs that matter

linger.ms=5            # wait up to 5ms to batch — single biggest throughput knob
batch.size=65536      # max batch bytes per partition (64KB is a sane default)
compression.type=lz4  # compress the batch — 3-5x throughput gain over none
acks=all              # durability — see section 4
enable.idempotence=true   # de-dup retries — see section 10
max.in.flight.requests.per.connection=5

If asked "how would you increase producer throughput?", the answers in order are: bigger batches (linger/batch.size), compression (lz4 or zstd), more partitions, async send (don't block on the Future).

send() is non-blocking, returns a Future, and joins a batch. The I/O thread sends batches when full or after linger.ms. Throughput is mostly tuned by linger.ms, batch.size, and compression.type.

acks=0, 1, all — The Durability Knob Most Devs Get Wrong

Three settings, three very different durability stories. The interviewer wants to know not just what each does, but the failure scenario each one allows.

What does acks control? Which would you use for a payment service?

acks	What gets confirmed	You can lose data when...	Latency
`0`	Nothing — fire and forget	Producer's network blip, broker crash, leader rebalance — anything	Lowest
`1`	Leader wrote it to its local log	Leader crashes before a follower replicates → the message is gone	Low
`all` (-1)	Leader + every in-sync replica (ISR) wrote it	Only if every ISR fails simultaneously — practically never	Higher

acks=all alone is not enough. You also need min.insync.replicas >= 2. With min.insync=1, "all ISR" can mean just the leader if every follower has fallen behind — and you're back to acks=1 guarantees without realizing it. See section 9.

For a payment service: acks=all + min.insync.replicas=2 + enable.idempotence=true. You'll pay 1–2ms more latency, but you'll never lose a transaction record on a single broker failure.

For metrics / clickstream where one in a million dropped events is fine, acks=1 is reasonable. acks=0 is for "we'd rather drop the data than slow the producer" — IoT telemetry at the edge, that kind of thing.

acks is durability vs latency. Default to all for anything financial; 1 for analytics; 0 only when you'd genuinely rather lose the data than block.

Consumer Groups & Partition Assignment — How Kafka Scales Reads

Consumer groups are how Kafka turns a fan-out log into parallel processing. The rule is simple: each partition is read by exactly one consumer in a group. That single rule gives you both scale and ordering.

Maya runs an order-processing app. Topic orders has 6 partitions. She deploys 3 consumer instances, all in group order-processor. Each instance gets 2 partitions. Black Friday hits — she scales to 6 instances; each gets 1 partition. She tries 12 instances — 6 get 1 partition each, the other 6 sit idle. Why?

How does Kafka distribute partitions among consumers in a group?

A consumer group is a set of consumers that share a group.id.
For each topic the group subscribes to, every partition is assigned to exactly one consumer in the group.
If numConsumers < numPartitions: some consumers get multiple partitions.
If numConsumers == numPartitions: 1 partition each — maximum parallelism.
If numConsumers > numPartitions: extras sit idle. Partitions are the upper bound on consumer parallelism.

If asked "how do I scale my consumer?", the answer is: add partitions first, then add consumers. Adding consumers without enough partitions is wasted infra.

Two consumer groups can read the same topic independently — each group has its own offset. This is how you fan-out: deploy one group for the order service, another for analytics, another for fraud detection. None of them step on each other.

Group = team. Each partition is read by one teammate. Add partitions to add parallelism. Add groups to add use-cases.

Offset Commits — Auto vs Manual, and the Duplicate-Processing Bug

Offset management is where most consumer bugs live. The default of enable.auto.commit=true sounds convenient. It is also responsible for 9 out of 10 "we processed the order twice" production incidents.

Aman's order processor receives an order, charges the card, and crashes before the auto-commit timer fires. On restart, the consumer re-reads the same record — and charges the card a second time. The customer is now billed twice for one purchase. Aman learns about enable.auto.commit the hard way.

What's the difference between auto-commit and manual commit? Which gives you at-least-once vs at-most-once?

Auto-commit (enable.auto.commit=true): every auto.commit.interval.ms (default 5s), the consumer commits the latest polled offset — regardless of whether you finished processing those records. If you crash mid-batch, the offset has already moved past records you didn't actually process. You get at-most-once with potential data loss, or duplicates on the previous batch.

Manual commit with commitSync() or commitAsync() after processing: you commit only after the work is done. If you crash, you re-read and re-process. You get at-least-once with possible duplicates — the standard pattern.

At-least-once consumer pattern

props.put("enable.auto.commit", "false");
try (KafkaConsumer<String, String> c = new KafkaConsumer<>(props)) {
  c.subscribe(List.of("orders"));
  while (running) {
    ConsumerRecords<String, String> recs = c.poll(Duration.ofMillis(200));
    for (var r : recs) processOrder(r);  // idempotent!
    c.commitSync();                    // commit AFTER processing
  }
}

Always pair manual commit with idempotent processing. Duplicates are inevitable in at-least-once; idempotency is what makes them harmless. (See the Kafka & Payment Idempotency deep dive.)

Auto-commit = data loss waiting to happen. Manual commit + idempotent processing = the production-grade pattern. There is no "exactly-once consumer" without transactions (section 11).

Rebalancing — Why Your Consumer Pauses Every Deploy

Rebalancing is the protocol that redistributes partitions when group membership changes — a consumer joins, leaves, or fails. It's also the source of the dreaded "stop-the-world" pause every team eventually hits.

What triggers a rebalance, and why is it expensive? What's cooperative-sticky?

Triggers: a consumer joins (deploy), leaves (shutdown), times out (session.timeout.ms with no heartbeat), or topic metadata changes (partition count grew).

Eager (default before 2.4): all consumers stop polling, give up all partitions, group coordinator reassigns, all consumers resume. Total stop-the-world. A 200-consumer group can pause for seconds.

Cooperative-sticky (default 3.0+): only the partitions that need to move are revoked; everyone else keeps polling. Massively reduces pause duration. Set partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor.

A long processOrder() that exceeds max.poll.interval.ms (default 5min) will get the consumer kicked out of the group — triggering a rebalance even though nothing else changed. Fix: shrink your batch (max.poll.records) or do work asynchronously and call poll() on schedule.

Rebalancing is correctness, not a bug — but eager rebalancing is painful. Switch to cooperative-sticky on Kafka 2.4+ and watch deploy pauses drop from seconds to milliseconds.

Brokers, Replication & ISR — How Kafka Doesn't Lose Data

Replication is what makes Kafka durable, and ISR (in-sync replicas) is the concept that makes replication useful. Mess this up in an interview and the rest of the conversation gets harder.

What's the difference between replication factor and ISR? What happens when the leader dies?

Replication factor = how many copies of each partition exist across brokers. RF=3 means 3 copies on 3 different brokers.

Leader / Followers — for each partition, exactly one replica is leader. All reads and writes go to the leader. Followers pull from the leader to stay in sync.

ISR (In-Sync Replicas) — the subset of replicas that are caught up with the leader within replica.lag.time.max.ms (default 30s). A follower that falls behind is removed from ISR. The leader itself is always in ISR.

Leader failure: controller picks a new leader from the current ISR. If unclean.leader.election.enable=false (default since 2.5), and ISR is empty, the partition goes unavailable rather than electing a non-ISR replica that may have stale data.

RF=3 is "the document is on 3 photocopiers". ISR is "the photocopiers that have today's edition, not last week's". The leader can only be picked from the up-to-date copies.

Replication = "how many copies". ISR = "which copies are fresh". Leader election picks from ISR. Unclean leader election trades availability for durability — the modern default is "no, fail closed".

min.insync.replicas — The Durability Triangle

This one config is the hidden third leg of the durability story. Without it, acks=all is a lie.

If I set acks=all on the producer, do I have full durability?

Not quite. acks=all means "wait for every replica currently in ISR". If two of your three replicas have fallen behind and dropped out of ISR, then "all ISR" = just the leader. You acked from one box. If that box dies before any follower catches up, you lose data.

min.insync.replicas=2 on the topic forces the producer to fail with NotEnoughReplicasException rather than silently downgrading durability. The producer can retry; meanwhile no record is acked unless at least 2 replicas have it.

The durability triangle

# Topic config
replication.factor=3
min.insync.replicas=2

# Producer config
acks=all
enable.idempotence=true   # de-dup retries (section 10)

RF=3 + min.insync=3 means a single broker outage stops writes to that topic — you've sacrificed availability for durability. RF=3 + min.insync=2 is the standard balance: tolerate 1 broker down, still write safely.

acks=all alone is necessary but not sufficient. The full incantation is RF=3 + min.insync.replicas=2 + acks=all + enable.idempotence=true. That's the production setting for "do not lose this data".

Idempotent Producer — What `enable.idempotence=true` Actually Does

Without idempotence, a network blip can turn one logical send into multiple identical records on the broker. Idempotence is what makes producer retries safe.

Karthik's producer sends "order-42 charged ₹500". The broker writes it, the ack reply gets dropped by a flaky switch. The producer times out, retries. Broker writes "order-42 charged ₹500" again. Now Karthik's downstream consumer charges twice. enable.idempotence=true is the one-line fix.

What problem does the idempotent producer solve? How does it work under the hood?

Without idempotence, retries can produce duplicates: the broker received the original send, replied with an ack that got lost, the producer retried, the broker wrote the message a second time.

Idempotent producer adds three pieces:

Producer ID (PID) — assigned by the broker on connect. Stable for the producer's lifetime.
Sequence number per (PID, partition) — monotonically increasing.
Broker-side de-dup window — broker tracks the highest sequence per PID per partition; rejects out-of-order or duplicate sequences.

Result: even if the producer retries the same batch 5 times, the broker writes it once.

Idempotence is FREE in the sense it's per-producer-session: enable it. But it does NOT survive a producer restart — a new PID is issued, sequence resets. For end-to-end exactly-once across producer restarts, you need transactions (section 11).

enable.idempotence=true → producer retries are safe within a session. The broker tracks (PID, seq) and dedupes. This is the floor of "exactly-once produce".

Transactions & Exactly-Once Semantics — The Whole Truth

"Exactly-once" is the most-asked, most-misunderstood Kafka feature. It exists, it works, but it works in a specific way — and it requires the consumer to participate.

Does Kafka actually support exactly-once? What's a transactional producer?

Yes — for the read-process-write loop within Kafka. End-to-end EOS to a non-Kafka sink (a database, an HTTP API) requires your application to make the sink-side write idempotent or transactional.

The transactional producer adds:

transactional.id — a stable identifier across producer restarts. Kafka uses it to fence zombie producers.
initTransactions(), beginTransaction(), send(), sendOffsetsToTransaction(), commitTransaction() — APIs that make a batch atomic across multiple topic-partitions.
Read-process-write atomicity: in one transaction, you read input from topic A, write output to topic B, and commit your input offsets — all atomically. Either everything happens or nothing does.

Consumers must set isolation.level=read_committed to skip records belonging to aborted transactions.

Read-process-write transaction

producer.initTransactions();
while (running) {
  ConsumerRecords<String,String> recs = consumer.poll(...);
  producer.beginTransaction();
  for (var r : recs) {
    producer.send(new ProducerRecord("output", transform(r.value())));
  }
  producer.sendOffsetsToTransaction(offsetsOf(recs), consumer.groupMetadata());
  producer.commitTransaction();   // atomic: writes + offset commit
}

Transactions cost ~30% throughput vs idempotent-only. They are not the default. Use them when end-to-end EOS within Kafka is a hard requirement (e.g., a stream-processing job that must not double-count). Otherwise, idempotent producer + at-least-once consumer + idempotent sink is simpler and faster.

EOS in Kafka = idempotent producer + transactional commit + read_committed consumer. Within Kafka, it's atomic. Outside Kafka (DB, API), the application makes the sink idempotent. There is no magic.

Retention vs Log Compaction — Two Different Cleanup Strategies

Topics get cleaned up in one of two ways. Mixing them up will trip you up in any interview that goes past basics.

What's the difference between time-based retention and log compaction? When would you use each?

Time / Size Retention (default)

Records older than retention.ms (default 7 days) or beyond retention.bytes get deleted in segment-sized chunks.

Use for: events (orders, page views, audit log). Care about every record for a window, then it's garbage.

Log Compaction (`cleanup.policy=compact`)

For every key, keep only the latest record. Older records with the same key get tombstoned and eventually deleted.

Use for: state. "What is user X's profile right now?" The topic is a changelog — last write per key wins.

Compacted topic mental model

user-profiles topic, before compaction:
[user-1, name=Aman]
[user-2, name=Riya]
[user-1, name=Aman Patel]      // update
[user-3, name=Karthik]
[user-1, null]                 // tombstone — delete user-1

After compaction:
[user-2, name=Riya]
[user-3, name=Karthik]
// user-1 fully removed after delete.retention.ms

Kafka Streams' KTable is backed by a compacted topic. Internal offset/group state in __consumer_offsets is also a compacted topic. If you see "compaction" in a Kafka design discussion, think state, not events.

Retention deletes old data. Compaction keeps the latest per key. Events use retention; state uses compaction. You can have both: cleanup.policy=compact,delete.

Partition Key Strategy — How Hot Partitions Are Born

The choice of partition key is the single biggest determinant of whether your Kafka cluster scales gracefully or melts down at peak.

Maya picks customer_country as the partition key for her order topic. 80% of customers are in India. So 80% of records land on one partition. That partition's broker is at 95% CPU; the others are idle. Her consumer for that partition is 20 minutes behind. Diagnosis: hot partition.

What's a hot partition? How do you avoid it?

Hot partition = one partition receiving disproportionate traffic because the partition key has a skewed distribution. Symptoms: one consumer falls behind, one broker is hot, others are cool.

Pick a high-cardinality key. order_id over country. user_id over region.
Salt the key if cardinality is bounded. Append a random 0–N suffix to artificially split a hot key — at the cost of losing per-key ordering for that hot key.
Custom partitioner. If your domain has known hot keys, route them to a sub-partition pool.
Match key to ordering needs. If order matters per order_id, use order_id; don't pick country for "scale" and then complain about ordering.

Adding partitions to an existing topic does not rebalance existing data. Records that landed on partition 5 stay there. Newly-keyed records get hashed across the new partition count. So your hot partition stays hot for the retention window even after you scale.

Hash distribution is only as good as your key's entropy. Pick high-cardinality keys, monitor partition-level lag, and salt when you must.

Message Ordering Guarantees — Per Partition, Never Across

"I need messages in order" is one of the most common requirements — and one of the most commonly mishandled in Kafka.

Does Kafka guarantee ordering? What if my topic has 4 partitions?

Kafka guarantees ordering within a single partition. Records appended to partition P0 are read in append order. Period.

Across partitions there is no global order. Two records in P0 and P1 may be read in either order by a consumer reading both.

So: if you need "events for order-42 in order", make sure they all go to the same partition. The producer does this automatically when you set key=order-42 — every record with that key is hashed to the same partition.

"Per-entity ordering" (per user, per order, per device) maps cleanly to Kafka. "Global ordering" doesn't — and almost no real system actually needs it. If you find yourself wanting it, you probably have one partition and have given up scale, or you need to model a "single-writer per entity" instead.

Order is per partition. Use the entity ID as the key to keep that entity's stream ordered. Across partitions, all bets are off.

`max.in.flight.requests.per.connection` — The Subtle Ordering Trap

A producer config most people set to "5 (default)" and never think about — until it silently breaks per-key ordering on a retry.

If I have retries=3 and max.in.flight.requests.per.connection=5, can my messages get out of order?

Without idempotence: yes. With 5 in-flight batches and retries enabled, batch 1 can fail and be retried, while batches 2–5 already succeeded. Now batch 1's record arrives after batch 2's — out of order.

With enable.idempotence=true: no. The broker uses sequence numbers per (PID, partition) and rejects out-of-order writes; the producer waits and reorders. As of Kafka 3.0, idempotence allows up to 5 in-flight requests safely (older versions required ≤ 5 only without idempotence; 1 with).

If you turn off idempotence and need ordering, you must set max.in.flight.requests.per.connection=1 — at a 5–10x throughput cost. The right answer is "leave idempotence on".

Modern Kafka producer should run with enable.idempotence=true always. It gives you de-dup AND in-order retries with high in-flight concurrency. There's no reason to disable it.

Consumer Lag — What It Is & What Causes Spikes

Consumer lag is THE health metric of any Kafka consumer. The interviewer wants to see if you understand both the math and the operational picture.

Define consumer lag. What are common causes of a sudden lag spike?

Lag = log_end_offset − consumer_offset, per partition. It's how many records the consumer is behind the producer's latest write.

Common spike causes:

Producer burst — Black Friday, viral spike, retried fan-out from upstream.
Slow downstream — DB call inside the consumer suddenly takes 200ms instead of 5ms.
Rebalance — paused all consumers for a few seconds; backlog built up.
Hot partition — one partition's lag spikes while others are fine; key skew (section 13).
Consumer crash / GC pause — JVM full GC stalled poll() for 30s.
Network throttling / quota hit — the broker is shaping your consumer.

Always alert on lag per partition, not total. Total lag of 100K spread evenly across 10 partitions is fine; total lag of 100K all on one partition is a hot-partition incident.

Lag is the universal Kafka SLO. Watch it per partition. Spikes mean either "too much in" or "too slow out" — diagnose the rate of change of lag, not the absolute number.

Backpressure & Tuning — fetch.min.bytes, max.poll.records, Quotas

A handful of consumer configs control how Kafka shapes traffic. Knowing the right knobs differentiates ops experience from textbook knowledge.

My consumer is CPU-bound. What configs would you change?

Config	Default	What it does
`fetch.min.bytes`	1	Wait for at least this many bytes before returning a fetch. Bump to ~50KB to amortize round trips.
`fetch.max.wait.ms`	500	...but don't wait longer than this. Tune with fetch.min.bytes.
`max.poll.records`	500	Max records returned per poll. Drop to 100 if processing is heavy and rebalance pauses bite.
`max.partition.fetch.bytes`	1MB	Per-partition response cap. Raise for large messages.
`max.poll.interval.ms`	5min	Max time between polls before group coordinator marks consumer dead. Bump for slow batches.
`session.timeout.ms`	10s	Heartbeat-based death detection; pair with `heartbeat.interval.ms`.

Broker-level quotas (producer_byte_rate, consumer_byte_rate) cap a client by user/clientID. Use them to prevent one bad team from saturating your shared cluster.

Bigger fetches + fewer poll loops = more throughput. Smaller poll batches + longer poll intervals = saner operations. Tune both with the actual workload, not defaults.

Schema Registry & Avro — Schema Evolution Without Breaking Consumers

Once you have more than two services on a topic, schema becomes a contract. Schema Registry + Avro / Protobuf is how that contract is versioned and enforced.

Why do you need a schema registry? What does "backward compatible" mean?

Without schemas, every consumer parses raw bytes and prays the producer didn't change anything. Add a field, drop a field, rename — every downstream consumer breaks silently. Schema Registry stores versioned schemas keyed by topic-subject; producers serialize with the registered schema ID embedded in the message; consumers fetch the schema by ID to deserialize.

Compatibility modes (set per subject):

BACKWARD (default) — new schema can read data written by the previous schema. Producers can upgrade; consumers stay on old schema.
FORWARD — old schema can read data written by the new schema. Consumers can upgrade; producers stay on old.
FULL — both directions. Adding optional fields with defaults is the safe lane.
NONE — anything goes. Don't.

For most "I added an optional field with a default" changes, BACKWARD compatibility is fine. The hard cases — required field added, field removed, type changed — should be done as a new topic + dual-write migration, not an in-place breaking change.

Schema Registry turns "best of luck to your consumers" into a contract-driven system. Avro/Protobuf give compact binary encoding; the registry adds versioning & compatibility checks at registration time.

Kafka Streams vs Plain Consumer — When to Reach for Streams

Both read records from topics. Streams adds stateful operations, windowed joins, and exactly-once semantics — at the cost of state-store complexity.

When would you use Kafka Streams instead of a plain KafkaConsumer?

Use a plain consumer when you're doing stateless work: read a record, transform it, write to a DB or call an API. No joins, no aggregations across records.

Use Kafka Streams when:

You need to join two streams (orders ⨝ payments).
You need windowed aggregations (5-minute rolling sum of clicks).
You need state and want it backed by Kafka (changelog topic + RocksDB local store).
You want exactly-once processing across multi-step pipelines for free.

Streams primitives: KStream (event stream), KTable (changelog / current value per key), GlobalKTable (broadcast lookup table). State stores back KTables and aggregations; they're persisted as compacted topics so a restarted node can rebuild state.

Plain consumer = ETL. Streams = stateful, windowed, joined, exactly-once. Don't use Streams for what a plain consumer does — the operational overhead isn't free.

Kafka Connect & Dead Letter Queues — Moving Data Without Writing Code

Connect is the official "ingest from / egest to external systems" framework. DLQs are the operational must-have nobody mentions in the docs.

What's Kafka Connect? How do you handle a poison-pill message?

Kafka Connect runs as a separate cluster of workers that host plug-in connectors. Source connectors pull from MySQL/Postgres/MongoDB/S3 into Kafka; sink connectors push from Kafka to Snowflake/Elasticsearch/JDBC. You configure them with JSON over a REST API; no code.

A poison pill is a record the connector cannot process — bad schema, parse error, FK violation. Without DLQ config, the connector retries forever and stalls the entire partition.

Configure errors.tolerance=all + errors.deadletterqueue.topic.name=my-connector-dlq and bad records get routed to a separate topic for manual triage; the main pipeline keeps moving.

For non-Connect consumers, build your own DLQ pattern: try-catch the processing, on failure publish the record (plus error metadata) to a DLQ topic, commit the offset. Periodically replay the DLQ once you've fixed the bug.

Connect = no-code ingest/egest. DLQ = the safety valve that stops one bad record from halting the entire pipeline. Always configure both.

ZooKeeper vs KRaft — Why Kafka Dropped ZK

A dated interview cliché is "Kafka uses ZooKeeper". As of Kafka 3.3+, that's no longer the recommended setup. KRaft (Kafka Raft) is in.

Why did Kafka move away from ZooKeeper? What's KRaft?

ZooKeeper held controller election state and topic metadata. Operating two distributed systems (Kafka + ZK) was painful: separate config, separate JVM tuning, separate failure modes, slow metadata propagation.

KRaft bakes the same metadata-replication and controller-election protocol into Kafka itself, using the Raft consensus algorithm. A small set of broker nodes act as controllers (analogous to ZK ensemble); they replicate the metadata log among themselves; clients no longer need any external coordinator.

Benefits: faster controller failover, faster cluster start-up, simpler operations, supports far more partitions per cluster (millions vs ~200K).

As of 4.0 (released 2025), ZooKeeper mode is removed entirely. New clusters are KRaft-only. If asked "do you still run ZK?", the senior answer is "no — and we migrated using the official 3.x rolling-upgrade path".

ZooKeeper was the historical metadata layer. KRaft replaces it natively. Modern Kafka is one binary, one log, one cluster.

Security — SASL, SSL, ACLs

For any production deployment, "who can do what" is mandatory. Three pieces fit together.

How do you secure a Kafka cluster?

SSL/TLS — encrypts data in flight between clients and brokers, and between brokers themselves. Set security.protocol=SSL or SASL_SSL.
SASL — authenticates clients. Common mechanisms: PLAIN (username/password over TLS), SCRAM-SHA-256/512 (challenge-response), GSSAPI (Kerberos), OAUTHBEARER.
ACLs — authorize. Per principal (user) per resource (topic, group, cluster) per operation (Read, Write, Describe, Create). Stored in Kafka itself (KRaft) or ZK (legacy).

ACL example

# Allow user "order-svc" to produce to topic "orders"
kafka-acls --bootstrap-server kafka:9092 \
  --add --allow-principal User:order-svc \
  --operation Write --topic orders

# Allow user "analytics" to consume any topic in group "analytics-group"
kafka-acls --add --allow-principal User:analytics \
  --operation Read --topic '*' --group analytics-group

SSL encrypts. SASL authenticates. ACLs authorize. All three together — anything less is a leak waiting to happen.

Kafka vs RabbitMQ vs SQS — When to Pick What

A classic "compare and contrast" question. Don't just list features — pick a scenario and reason about it.

When would you pick RabbitMQ or SQS over Kafka?

Scenario	Pick	Why
High-throughput event streaming, multiple consumer groups, replay needed	Kafka	Append-only log, retained on disk, multi-consumer-group, millions msg/sec
Task queue with priority & complex routing	RabbitMQ	Exchange types (direct/topic/fanout/headers), priority queues, per-message TTL
Simple managed queue, AWS-native, no ops overhead	SQS	Zero ops, pay-per-message, FIFO mode for ordering
Event sourcing / CDC / stream processing	Kafka	Log compaction, Streams, Connect, schema registry
Low-volume, reliable per-message acks, dead-lettering by default	RabbitMQ / SQS	Per-message ack/nack with built-in DLQ. Kafka can do it but with offset gymnastics
Cross-region replay, audit log, compliance "show me yesterday's traffic"	Kafka	Long retention + replay is its native model

Kafka = append-only log + replay. RabbitMQ = task queue + flexible routing. SQS = managed simple queue. Pick by use case, not by familiarity.

Production Pitfalls — What Real Operators Trip Over

A grab-bag of issues that turn up in incident reports across every Kafka shop.

auto.offset.reset surprises — set to latest by default. New consumer groups skip history. Set to earliest if you want backfill.
Slow consumer triggers rebalance — work that exceeds max.poll.interval.ms kicks the consumer out. Either shrink the work, or do it async and call poll on schedule.
Topic with too many partitions — every partition is a file handle on every broker. 200K+ partitions/cluster gets painful (KRaft is better than ZK here).
Compacted topic + tombstone never cleaned — delete.retention.ms default 24h. Set min.cleanable.dirty.ratio properly or compaction never runs.
JVM heap too big — Kafka brokers want 6–8GB heap, NOT 32GB. The OS page cache does the heavy lifting; a huge heap just lengthens GC pauses.
Mixed message sizes — one 10MB message in a stream of 1KB messages will block the partition consumer for seconds. Use a separate topic for large blobs, or store in S3 and pass the URL.
Insufficient disk on broker — Kafka grows linearly with retention × throughput. log.retention.bytes caps it; without it, the broker fills the disk and goes read-only.
Cross-region producers without compression — every byte crosses an expensive WAN. compression.type=zstd or lz4 is mandatory.

Most Kafka outages aren't "Kafka is broken". They're config defaults applied to workloads they don't fit. Audit the eight items above before going to prod.

Tricky Gotchas & Rapid-Fire Q&A

A grab-bag of the questions that come at the end of a Kafka round, usually paired with a smile and "let's see if you really know this stuff".

Why is Kafka so fast despite writing to disk?

Sequential writes (no random seeks), zero-copy via sendfile() from page cache straight to socket buffer (no userspace copy), batching at every layer, and reliance on OS page cache rather than a custom buffer pool. Disk is only "slow" for random I/O — Kafka does pure append.

Can a consumer read from a specific offset?

Yes. consumer.seek(partition, offset) after assignment. Or seekToBeginning() / seekToEnd(). This is how you implement "replay last hour" without changing groups.

What happens if I delete a topic that's still being written to?

With delete.topic.enable=true (default), the topic is marked for deletion and removed asynchronously by the controller. In-flight producer sends fail with UnknownTopicOrPartitionException. Always drain producers first.

How does a consumer know which partitions it owns?

After joining the group, the group coordinator broadcasts the assignment via SyncGroupResponse. The consumer then calls onPartitionsAssigned and starts polling those partitions. On rebalance, it gets onPartitionsRevoked first, then a new assignment.

Can two producers write to the same partition simultaneously?

Yes — the leader broker serializes them. Each producer's request is appended in arrival order. Within a single producer, ordering is preserved by sequence numbers (idempotence). Across producers, "first write wins" is at-the-broker, not coordinated.

What's the difference between poll() returning empty and a consumer being assigned no partitions?

Empty poll = no new records on assigned partitions in poll(timeout). No partitions = the group has more consumers than partitions; coordinator gave you nothing. Check consumer.assignment() to distinguish.

If I have a 6-partition topic and a 12-instance Kafka Streams app, what happens?

6 instances each get 1 partition. The other 6 are standby replicas for state stores (if num.standby.replicas>0) or fully idle. They take over on failure but contribute no processing capacity. Same partition cap as plain consumers.

Is it safe to have transactions and idempotence enabled together?

Transactions require idempotence — setting transactional.id implicitly enables enable.idempotence=true. The two work together; the question is whether you also need transactional atomicity beyond just retry-safety.

What's a "zombie" producer and how does Kafka fence it?

A zombie is a producer instance you thought was dead but is actually still running (e.g., GC'd long enough to be considered dead, then woke up). With transactional.id, the broker tracks an epoch per ID — when a new instance starts with the same transactional.id, it bumps the epoch and any send from the old (lower-epoch) producer is rejected with ProducerFenced. That's how exactly-once survives instance failures.

Can offsets go backward?

No within a partition — offsets are monotonically increasing and immutable once written. But a consumer's committed offset for a group can be moved backward via seek + commitSync, which is how "replay" works.

The senior-level Kafka answer is always: explain the underlying mechanism (sequence numbers, ISR, page cache, sendfile), then map it to the operational behavior. Memorizing configs alone gets you to mid-level; tying configs to mechanics gets you the offer.

What Kafka Actually Is — Distributed Append-Only Log, Not a Message Queue

Topics, Partitions & Offsets — The Three-Level Address

The Producer — What Actually Happens When You Call send()

acks=0, 1, all — The Durability Knob Most Devs Get Wrong

Consumer Groups & Partition Assignment — How Kafka Scales Reads

Offset Commits — Auto vs Manual, and the Duplicate-Processing Bug

Rebalancing — Why Your Consumer Pauses Every Deploy

Brokers, Replication & ISR — How Kafka Doesn't Lose Data

min.insync.replicas — The Durability Triangle

Idempotent Producer — What enable.idempotence=true Actually Does

Transactions & Exactly-Once Semantics — The Whole Truth

Retention vs Log Compaction — Two Different Cleanup Strategies

Time / Size Retention (default)

Log Compaction (cleanup.policy=compact)

Partition Key Strategy — How Hot Partitions Are Born

Message Ordering Guarantees — Per Partition, Never Across

max.in.flight.requests.per.connection — The Subtle Ordering Trap

Consumer Lag — What It Is & What Causes Spikes

Backpressure & Tuning — fetch.min.bytes, max.poll.records, Quotas

Schema Registry & Avro — Schema Evolution Without Breaking Consumers

Kafka Streams vs Plain Consumer — When to Reach for Streams

Kafka Connect & Dead Letter Queues — Moving Data Without Writing Code

ZooKeeper vs KRaft — Why Kafka Dropped ZK

Security — SASL, SSL, ACLs

Kafka vs RabbitMQ vs SQS — When to Pick What

Production Pitfalls — What Real Operators Trip Over

Tricky Gotchas & Rapid-Fire Q&A

The Producer — What Actually Happens When You Call `send()`

Idempotent Producer — What `enable.idempotence=true` Actually Does

Log Compaction (`cleanup.policy=compact`)

`max.in.flight.requests.per.connection` — The Subtle Ordering Trap