Tuesday, February 27, 2018

Message Consumption in Apache Kafka


  • The message offset is a critical concept to understand because it is how consumers can read messages at their own pace and process them independently.
  • It is a placeholder, like a bookmark that maintains the last read position.
  • In the case of a Kafka topic, it is the last read message.
  • The offset is entirely established and maintained by the consumer, since the consumer is entirely responsible for reading the messages and processing them on its own.
  • The consumer keeps track of what it has and has not read.
  • An offset refers to a message identifier.
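
The bookmark idea can be sketched with a toy in-memory model. This is only an illustration, not the Kafka client API: the names `TopicLog` and `BookmarkConsumer` are hypothetical, and the sketch assumes the bookmark stores the position of the next message to read (one past the last read message).

```python
# Toy in-memory model of a topic and a consumer-owned offset.
# Hypothetical names; this is not the Kafka client API.

class TopicLog:
    """Append-only list of messages; a message's offset is its index."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset assigned to the new message

    def read(self, offset):
        # Return the message at `offset`, or None if nothing is there yet.
        if 0 <= offset < len(self.messages):
            return self.messages[offset]
        return None


class BookmarkConsumer:
    """Keeps its own bookmark: the offset of the next message to read."""

    def __init__(self, log):
        self.log = log
        self.offset = 0  # start at the beginning of the topic

    def read_next(self):
        message = self.log.read(self.offset)
        if message is not None:
            self.offset += 1  # only the consumer moves its bookmark
        return message


log = TopicLog()
for text in ["a", "b", "c"]:
    log.append(text)

consumer = BookmarkConsumer(log)
first = consumer.read_next()   # message at offset 0
second = consumer.read_next()  # message at offset 1
```

The log itself holds no per-consumer state, which is what lets each consumer move at its own pace.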


  1. When a consumer wishes to read from a topic, it must establish a connection with a broker.
  2. After establishing the connection, the consumer decides which messages it wants to consume.
  3. If the consumer has not previously read from the topic, or it has to start over, it issues a request to read from the beginning of the topic (the consumer establishes that its message offset for the topic is 0).
     Offsets: 0 1 2 3 4
  4. As it reads through the sequence of messages, it will inevitably come to the last message in the topic and move its offset accordingly.
  5. If another consumer is interested in the messages from the topic, it could have already read the messages from the beginning and simply be waiting for more messages to arrive so it can read and process them.
     Note: Each consumer knows where it left off and can choose to advance from that position, stay put, or go back and reread a previously read message, all without the producer, brokers, or other consumers needing to know or care.
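
The steps above can be sketched with two independent consumers over the same five messages. This is a simplified model under stated assumptions, not the real client: the `Consumer` class and its `poll` and `seek` methods are hypothetical stand-ins, and the topic is just a Python list.

```python
# Two consumers reading the same topic independently.
# Hypothetical toy model; a plain list stands in for the topic.

class Consumer:
    def __init__(self, name, messages):
        self.name = name
        self.messages = messages  # shared view of the topic
        self.offset = 0           # step 3: start from the beginning

    def poll(self):
        """Return all unread messages and advance the offset past them."""
        batch = self.messages[self.offset:]
        self.offset = len(self.messages)
        return batch

    def seek(self, offset):
        """Move this consumer's bookmark; no one else is affected."""
        if not 0 <= offset <= len(self.messages):
            raise ValueError("offset out of range")
        self.offset = offset


topic = ["m0", "m1", "m2", "m3", "m4"]  # offsets 0..4
a = Consumer("a", topic)
b = Consumer("b", topic)

read_by_a = a.poll()  # a reads offsets 0..4 and is caught up
b.seek(3)             # b chooses to start at offset 3
read_by_b = b.poll()  # b reads offsets 3 and 4

a.seek(1)             # a goes back to reread from offset 1
reread = a.poll()
```

Nothing in `topic` changes when a consumer seeks, which mirrors the point that producers, brokers, and other consumers do not need to know or care.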


  •   When a new message arrives, the connected consumer learns of it (in Kafka, by polling the broker rather than through a pushed event), advances its position by one, and retrieves the new message.


  •  When the last message in the topic has been read and processed, the consumer sets its offset, and at that point it is caught up.
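
A minimal sketch of catching up and then picking up a newly arrived message, assuming a pull-style loop. The helpers `poll` and `lag` are hypothetical names for this illustration, and the offset here is the position of the next message to read.

```python
# Pull-style consumption over a list that stands in for the topic.
# `poll` and `lag` are hypothetical helpers, not Kafka API calls.

def poll(messages, offset, max_records=10):
    """Return up to max_records unread messages and the new offset."""
    batch = messages[offset:offset + max_records]
    return batch, offset + len(batch)


def lag(messages, offset):
    """Number of messages the consumer has not read yet."""
    return len(messages) - offset


topic = ["m0", "m1", "m2"]

batch, offset = poll(topic, 0)       # read everything available
caught_up = lag(topic, offset) == 0  # nothing left to read

topic.append("m3")                   # a producer adds a new message
new_batch, offset = poll(topic, offset)  # consumer advances by one
```

A poll on a caught-up consumer simply returns an empty batch and leaves the offset where it was.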

  •  How long Kafka retains messages is configurable and is known as the message retention policy.
  • All messages are retained by a Kafka cluster regardless of whether any consumer has consumed them.
  • The retention period is configurable in hours. The default is 168 hours (7 days); beyond that, the oldest messages start to fall off.

Note: The retention period is set on a per-topic basis, which means that within a cluster we could have hundreds of retention policies.
The ability to retain messages is bounded by the available storage.
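
The retention rule can be sketched as a simple age check. This is a deliberately simplified model: the function `apply_retention`, the tuple layout, and the topic names are hypothetical, and real Kafka removes whole log segments rather than individual messages.

```python
# Time-based retention as an age filter (simplified model).
# Each message is an (offset, timestamp_hours, payload) tuple.

RETENTION_HOURS = 168  # default from the text: 168 hours = 7 days


def apply_retention(messages, now_hours, retention_hours=RETENTION_HOURS):
    """Keep messages whose age is within the retention window.

    Whether a message was consumed plays no part in the decision.
    """
    return [m for m in messages if now_hours - m[1] <= retention_hours]


messages = [(0, 0, "old"), (1, 100, "mid"), (2, 200, "new")]

# With the default policy at hour 200, the message written at hour 0 is gone.
kept_default = apply_retention(messages, now_hours=200)

# Per-topic policies: hypothetical topic names, each with its own window.
retention_by_topic = {"clicks": 24, "orders": 168}
kept_clicks = apply_retention(messages, 200, retention_by_topic["clicks"])
```

Surviving messages keep their original offsets, so a consumer whose bookmark points at a removed message would have to move forward to the oldest one still retained.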


  1. https://kafka.apache.org/documentation.html#kafka_mq - Check the last para.

    I believe you are missing 3 more partitions in your picture. In your case only one consumer will be used.
