This repository was archived by the owner on Nov 20, 2020. It is now read-only.
Kafka Crash Course
Lev Gorodinski edited this page Dec 11, 2017
·
2 revisions
Kafka is a messaging system in which messages belong to partitions within topics and are stored by brokers. Each topic is organized into partitions, and each partition contains a sequence of messages ordered by a numeric offset, as follows:
Topic A
| p1 | p2 | ... | pN |
|---|---|---|---|
| 1 | 1 | 1 | |
| 2 | 2 | 2 | |
| 3 | 3 | | |
Topic B
| p1 | p2 | ... | pM |
|---|---|---|---|
| 1 | 1 | 1 | |
| 2 | 2 | | |
| 3 | | | |
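
The topic, partition and offset layout in the tables above can be modelled in a few lines. This is only an illustrative in-memory sketch: the `Partition` class and its `append` and `fetch` methods are hypothetical names, not part of any Kafka client. Offsets start at 1 here purely to match the tables; Kafka itself numbers offsets from 0.

```python
class Partition:
    """An append-only sequence of messages addressed by offset (illustrative only)."""

    def __init__(self, first_offset=1):
        # first_offset=1 mirrors the tables above; real Kafka offsets start at 0.
        self.first_offset = first_offset
        self.messages = []

    def append(self, message):
        """Store a message at the end and return the offset assigned to it."""
        self.messages.append(message)
        return self.first_offset + len(self.messages) - 1

    def fetch(self, offset, max_messages=10):
        """Return (offset, message) pairs starting at the requested offset."""
        start = offset - self.first_offset
        if start < 0:
            raise ValueError("offset precedes the start of the partition")
        batch = self.messages[start:start + max_messages]
        return [(offset + i, m) for i, m in enumerate(batch)]


# topic name -> partition id -> Partition
topics = {"A": {"p1": Partition(), "p2": Partition()}}

topics["A"]["p1"].append("a")
topics["A"]["p1"].append("b")
last_offset = topics["A"]["p1"].append("c")
topics["A"]["p2"].append("x")

# Offsets are scoped to one partition: p1 and p2 each have their own offset 1.
from_two = topics["A"]["p1"].fetch(2)
```

Ordering is defined only within a partition, so a reader tracks one position per partition rather than one per topic.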
Partitions are allocated to brokers with a configurable replication factor:
| Broker | Partitions |
|---|---|
| 1 | (Topic A,p1), (Topic A,p3), ... |
| 2 | (Topic B,p2), (Topic A,p2), ... |
| 3 | (Topic B,p1), (Topic B,p3), ... |
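
To make the replication factor concrete, here is a minimal sketch of spreading partition replicas over brokers. It uses a simple round-robin scheme chosen for illustration; it is not Kafka's actual assignment algorithm, and `assign_replicas` and `partitions_by_broker` are hypothetical helpers.

```python
def assign_replicas(partitions, brokers, replication_factor):
    """Map each (topic, partition) to a list of broker ids, round-robin.

    Simplified illustration: partition i is placed on `replication_factor`
    consecutive brokers, starting at broker index i.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for i, tp in enumerate(partitions):
        assignment[tp] = [
            brokers[(i + r) % len(brokers)] for r in range(replication_factor)
        ]
    return assignment


def partitions_by_broker(assignment):
    """Invert the assignment into the broker -> partitions view shown in the table."""
    view = {}
    for tp, replicas in assignment.items():
        for broker in replicas:
            view.setdefault(broker, []).append(tp)
    return view


partitions = [("A", "p1"), ("A", "p2"), ("B", "p1")]
assignment = assign_replicas(partitions, brokers=[1, 2, 3], replication_factor=2)
by_broker = partitions_by_broker(assignment)
```

With a replication factor of 2, every partition appears under two different brokers, so the loss of one broker still leaves a copy of each partition.
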
- Broker: a Kafka server node.
- Cluster: a collection of brokers which operate in unison to provide redundancy and load-balancing for a set of topics.
- Leader: a selected broker, which receives produced messages and returns fetched messages for a partition of a topic.
- Replica: a broker which contains a copy of messages in a partition. The leader is a replica, but there may also be non-leader replicas.
- ISR: in-sync replicas, the set of replicas which are in sync with the leader.
- Client: a Kafka client which communicates with brokers.
- Message: a unit of messaging, belonging to a particular partition within a topic. Messages are stored by brokers.
- MessageSet: a collection of contiguous messages. Messages are produced and consumed in message sets as an optimization.
- Topic: a named collection of messages, allocated across a number of partitions.
- Offset: a numeric position of a message within a partition of a topic.
- Partition: a sequence of messages ordered by offset.
- Consumer Group: a group of client instances consuming a topic as coordinated by a group coordinator.
- Group Coordinator: a broker designated as the coordinator for a consumer group.
- Retention Policy: a policy configurable by topic which defines the maximum age of messages within a topic, and the maximum size of a partition within a topic. Messages beyond the threshold are removed.
- Producer: a client which writes messages to partitions of a topic by communicating with the leader broker for that topic/partition combination.
- Consumer: a client which fetches messages from partitions of a topic.
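
The retention policy entry above describes two thresholds, age and size. The sketch below applies both to a single partition. It is a simplification under stated assumptions: `apply_retention` is a hypothetical function, messages are modelled as `(timestamp, payload)` pairs, and removal is per message, whereas a real broker removes whole log segments.

```python
def apply_retention(messages, now, max_age_seconds, max_bytes):
    """Return the messages that survive an age and size retention policy.

    messages: list of (timestamp_seconds, payload_bytes), oldest first.
    Illustrative only; not how a broker is implemented.
    """
    # Age threshold: drop anything older than max_age_seconds.
    kept = [(ts, p) for ts, p in messages if now - ts <= max_age_seconds]

    # Size threshold: drop the oldest messages until the partition fits.
    total = sum(len(p) for _, p in kept)
    start = 0
    while start < len(kept) and total > max_bytes:
        total -= len(kept[start][1])
        start += 1
    return kept[start:]


log = [(0, b"aaaa"), (50, b"bbbb"), (90, b"cccc"), (100, b"dddd")]
retained = apply_retention(log, now=100, max_age_seconds=60, max_bytes=8)
```

In this run the first message is removed for exceeding the age limit, and the second is removed to bring the partition back under the size limit.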