Apache Kafka Interview Questions & Answers
1. What is Kafka?
Kafka is a distributed event streaming platform used for high-throughput, fault-tolerant real-time data
streaming. It acts as a publish-subscribe messaging system.
2. Kafka Core Components
Producer (writes records), Consumer (reads records), Topic (named stream of records), Partition (unit of
parallelism and ordering), Broker (Kafka server), ZooKeeper (cluster metadata and coordination; replaced by
KRaft in newer versions).
3. What is a Topic?
A topic is a category or feed name to which records are sent by producers.
4. What is a Partition?
A partition allows a topic to be split for parallel processing and scalability. Each partition is an ordered log.
5. What is an Offset?
An offset is the sequential position of a record within a partition; it is unique only within that partition.
Consumers use offsets to track their read progress.
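To make partitions and offsets concrete, here is a minimal sketch that models one partition as an append-only list. The PartitionLog class and its method names are hypothetical and are not part of any Kafka client; a real partition lives on a broker's disk.

```python
class PartitionLog:
    """Toy in-memory stand-in for one partition: an append-only list."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Append a record and return the offset assigned to it."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = PartitionLog()
for event in ["signup", "login", "purchase"]:
    log.append(event)

committed_offset = 0  # position of the next record this consumer will read
batch = log.read(committed_offset, max_records=2)
committed_offset += len(batch)  # the consumer records its progress
```

After this runs, the consumer has read the first two records and its next read starts at offset 2. The log itself is unchanged by reading, which is why several consumers can read the same partition at different positions.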
6. How does a Producer send data?
Producers send records to a topic. They may attach a key; records with the same key go to the same partition.
7. What is a Consumer Group?
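A group of consumers that share the work of reading a topic. Each partition is assigned to exactly one consumer
in the group, while a single consumer may own several partitions. Consumers beyond the partition count stay idle.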
8. What happens if a consumer fails?
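The group coordinator detects the failure through missed heartbeats, triggers a rebalance, and reassigns the
failed consumer's partitions to the remaining active consumers in the group.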
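The assignment and rebalance behavior from questions 7 and 8 can be sketched as a small simulation. The assign function below is hypothetical and only spreads partitions round-robin; real Kafka clients choose among several assignor strategies with more elaborate rules.

```python
def assign(partitions, consumers):
    """Spread partitions round-robin over the sorted group members."""
    members = sorted(consumers)
    if not members:
        return {}
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]
before = assign(partitions, ["c1", "c2", "c3"])
after = assign(partitions, ["c1", "c3"])  # c2 has failed and left the group
```

In both results every partition has exactly one owner; after c2 leaves, its partition is picked up by the survivors.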
9. Where is data stored if not consumed?
Records are stored on the brokers' disks in the partition logs and kept for the retention period (7 days by
default) even if no consumer reads them.
10. Message Ordering in Kafka
Kafka maintains order only within a partition, not across partitions.
11. Kafka Retention Policy
Records are deleted once they exceed a configured age (retention.ms) or the partition exceeds a configured size
(retention.bytes), regardless of whether they were consumed.
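Kafka applies retention to whole log segments rather than to individual records. The sketch below is a simplified model of that idea; the Segment class and apply_retention function are hypothetical, and the broker's exact deletion rules differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int        # offset of the first record in the segment
    size_bytes: int
    last_timestamp_ms: int  # timestamp of the newest record in the segment


def apply_retention(segments, now_ms, retention_ms=None, retention_bytes=None):
    """Return the segments that survive; the oldest are dropped first.

    Simplification: the newest segment is always kept.
    """
    kept = list(segments)
    if retention_ms is not None:
        # Time limit: drop segments whose newest record is too old.
        while len(kept) > 1 and now_ms - kept[0].last_timestamp_ms > retention_ms:
            kept.pop(0)
    if retention_bytes is not None:
        # Size limit: drop oldest segments until the total fits.
        while len(kept) > 1 and sum(s.size_bytes for s in kept) > retention_bytes:
            kept.pop(0)
    return kept


segments = [
    Segment(base_offset=0, size_bytes=100, last_timestamp_ms=1_000),
    Segment(base_offset=50, size_bytes=100, last_timestamp_ms=5_000),
    Segment(base_offset=100, size_bytes=100, last_timestamp_ms=9_000),
]
by_time = apply_retention(segments, now_ms=10_000, retention_ms=6_000)
by_size = apply_retention(segments, now_ms=10_000, retention_bytes=150)
```

Neither rule looks at consumer offsets, which matches the answer above: retention is independent of consumption.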
12. What is ISR?
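In-Sync Replicas (ISR) are the replicas of a partition, including the leader, that have kept up with the leader
within replica.lag.time.max.ms. By default only ISR members are eligible for promotion to leader.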
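A minimal sketch of the ISR membership rule, assuming each follower reports the last time it was fully caught up. The function name, the broker ids, and the 30-second lag limit are illustrative choices, not values taken from a real cluster.

```python
def in_sync_replicas(leader, last_caught_up_ms, now_ms, max_lag_ms):
    """Return the set of broker ids considered in sync.

    last_caught_up_ms maps follower id -> last time it had fully caught up.
    max_lag_ms plays the role of replica.lag.time.max.ms.
    """
    isr = {leader}  # the leader is always in its own ISR
    for broker, caught_up_ms in last_caught_up_ms.items():
        if now_ms - caught_up_ms <= max_lag_ms:
            isr.add(broker)
    return isr


# Broker 1 leads; follower 2 caught up 1 s ago, follower 3 caught up 40 s ago.
isr = in_sync_replicas(
    leader=1,
    last_caught_up_ms={2: 99_000, 3: 60_000},
    now_ms=100_000,
    max_lag_ms=30_000,
)
```

Here follower 3 falls out of the ISR because it has lagged longer than the limit; it rejoins once it catches up again.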
13. Exactly-once Delivery
Achieved by combining idempotent producers (retries do not create duplicates) with transactions (atomic writes
across partitions, read by consumers that set isolation.level=read_committed).
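The idempotent half of exactly-once can be illustrated with a toy partition that drops retried writes. This is a sketch under simplifying assumptions: the class is hypothetical, and a real broker also tracks producer epochs and assigns sequence numbers per batch.

```python
class DedupPartition:
    """Toy partition that ignores duplicate sends from the same producer."""

    def __init__(self):
        self.records = []
        self._last_sequence = {}  # producer id -> last sequence appended

    def append(self, producer_id, sequence, value):
        """Append unless this sequence was already written. Returns True if appended."""
        last = self._last_sequence.get(producer_id, -1)
        if sequence <= last:
            return False  # a retry of something already stored
        if sequence != last + 1:
            raise ValueError("sequence gap: a record is missing")
        self.records.append(value)
        self._last_sequence[producer_id] = sequence
        return True


partition = DedupPartition()
first = partition.append(producer_id=7, sequence=0, value="order-created")
retry = partition.append(producer_id=7, sequence=0, value="order-created")
second = partition.append(producer_id=7, sequence=1, value="order-paid")
```

The retried send is acknowledged but not stored twice, so the log holds each record once even though the producer sent one of them twice.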
14. ZooKeeper Role
Older Kafka versions rely on ZooKeeper for cluster metadata and controller election. KRaft mode, production-ready
since 3.3, removes that dependency, and Kafka 4.0 dropped ZooKeeper support entirely.
15. High Availability
Use a replication factor above 1 (commonly 3), spread brokers across racks or availability zones, and run
consumers in groups so that partitions are reassigned when a consumer fails.
16. Message Delivery Guarantees
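acks=0 (producer does not wait for any ack), acks=1 (leader ack only), acks=all (leader waits for all in-sync
replicas to ack; combine with min.insync.replicas to set a floor).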
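The three acks settings can be compared with a small model. Both functions are hypothetical helpers written for illustration; they describe what the producer waits for and when a write is refused, not how the broker is implemented.

```python
def replicas_awaited(acks, isr_size):
    """How many replica writes the producer waits for before success."""
    if acks == "0":
        return 0          # fire and forget
    if acks == "1":
        return 1          # the leader only
    if acks == "all":
        return isr_size   # every current in-sync replica, leader included
    raise ValueError(f"unknown acks value: {acks!r}")


def accepts_write(acks, isr_size, min_insync_replicas):
    """min.insync.replicas is enforced only for acks=all."""
    return acks != "all" or isr_size >= min_insync_replicas


healthy = replicas_awaited("all", isr_size=3)
degraded_ok = accepts_write("all", isr_size=1, min_insync_replicas=2)
leader_only_ok = accepts_write("1", isr_size=1, min_insync_replicas=2)
```

With the ISR shrunk to one replica and min.insync.replicas=2, an acks=all write is rejected rather than stored on a single copy, while an acks=1 write still succeeds with weaker durability.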
17. Kafka Streams
A Java library for real-time stream processing using Kafka topics.
18. Writing to a Partition
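Provide a key; the default partitioner hashes the serialized key (murmur2) modulo the partition count. A producer
can also set the partition number explicitly on the record.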
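A short sketch of key-based partitioning. It assumes a stand-in hash (CRC32 from the standard library) purely for illustration; Kafka's Java client uses murmur2, so the partition numbers produced here will not match a real cluster. The choose_partition name is hypothetical.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition: hash the key, then take it modulo the count."""
    # Stand-in hash only: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key) % num_partitions


events = [(b"user-1", "login"), (b"user-2", "login"), (b"user-1", "logout")]
placement = [(choose_partition(key, 6), key, value) for key, value in events]
```

Both user-1 events land in the same partition, so their relative order is preserved. Because the result depends on the partition count, adding partitions to a topic changes where existing keys are routed.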
19. Kafka Monitoring Tools
CMAK (formerly Kafka Manager), Prometheus + Grafana, Confluent Control Center, JMX metrics.
20. Broker Failure Handling
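The controller promotes another in-sync replica to leader for each partition the failed broker led. A partition
with replication factor 1 is unavailable until its broker returns, and its data is lost if the broker's disk is.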
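Leader failover can be sketched as picking the first live, in-sync broker from the partition's replica list. The elect_leader function is a hypothetical simplification; the allow_unclean flag mirrors the idea behind the unclean.leader.election.enable setting, which trades possible data loss for availability.

```python
def elect_leader(replicas, isr, live_brokers, allow_unclean=False):
    """Pick a new leader for one partition, or None if none is eligible."""
    # Prefer a live replica that is still in sync: no acknowledged data is lost.
    for broker in replicas:
        if broker in live_brokers and broker in isr:
            return broker
    # Optionally fall back to any live replica, accepting possible data loss.
    if allow_unclean:
        for broker in replicas:
            if broker in live_brokers:
                return broker
    return None  # partition stays offline until an eligible replica returns


# Broker 1 was the leader and has failed; brokers 2 and 3 are still alive.
clean = elect_leader(replicas=[1, 2, 3], isr={1, 2}, live_brokers={2, 3})
stuck = elect_leader(replicas=[1, 2, 3], isr={1}, live_brokers={2, 3})
unclean = elect_leader(replicas=[1, 2, 3], isr={1}, live_brokers={2, 3},
                       allow_unclean=True)
```

In the first case broker 2 takes over cleanly. In the second no in-sync replica is alive, so the partition stays offline unless unclean election is allowed.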