Web Snippets

Tuesday, February 27, 2018

Message Consumption in Apache Kafka

MESSAGE OFFSET

A critical concept to understand, because it is how consumers read messages at their own pace and process them independently.
A placeholder: it works like a bookmark that maintains the last read position.
In the case of a Kafka topic, it is the position of the last message read.
The offset is established and maintained entirely by the consumer, since the consumer is solely responsible for reading messages and processing them on its own.
The consumer keeps track of what it has and has not read.
The offset refers to a message identifier: the message's sequential position in the log.
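The bookmark idea can be sketched with a toy in-memory model. This is not the Kafka client API: `PartitionLog` and `Consumer` are hypothetical names made up for illustration, and the bookmark is stored here as the position of the next message to read (one past the last message read).

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy stand-in for a topic's log: an append-only list of messages."""
    messages: list = field(default_factory=list)

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset assigned to the new message

    def read(self, offset, max_messages=1):
        # Reading never removes anything; the log does not track its readers.
        return self.messages[offset:offset + max_messages]


@dataclass
class Consumer:
    """Each consumer owns its bookmark, so consumers progress independently."""
    name: str
    offset: int = 0  # position of the next message to read

    def poll(self, log, max_messages=1):
        batch = log.read(self.offset, max_messages)
        self.offset += len(batch)  # advance the bookmark past what was read
        return batch


log = PartitionLog()
for text in ["m0", "m1", "m2", "m3"]:
    log.append(text)

fast = Consumer("fast")
slow = Consumer("slow")
fast_batch = fast.poll(log, max_messages=3)
slow_batch = slow.poll(log, max_messages=1)
print(fast.name, fast_batch, "next offset:", fast.offset)
print(slow.name, slow_batch, "next offset:", slow.offset)
```

Because each consumer holds its own offset, the slow reader does not hold back the fast one, and both see the same messages in the same order.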

STEPS INVOLVED

When a consumer wishes to read from a topic, it must first establish a connection with a broker.
After establishing the connection, the consumer decides which messages it wants to consume.
If the consumer has not previously read from the topic, or has to start over, it issues a request to read from the beginning of the topic (the consumer establishes that its message offset for the topic is 0).
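The decision in the last step can be sketched as a small function. This is a simplified illustration, not client code: `starting_offset`, the `stored` dictionary, and the topic name `events` are hypothetical. In real Kafka clients, comparable behavior is configured through settings such as `auto.offset.reset`.

```python
def starting_offset(stored_offsets, topic, start_over=False):
    """Return the offset a consumer should begin reading from.

    A consumer with no saved position for the topic, or one that has
    been asked to start over, begins at offset 0.
    """
    if start_over or topic not in stored_offsets:
        return 0
    return stored_offsets[topic]


topic_messages = ["a", "b", "c", "d"]
stored = {}  # consumer-side record of positions, keyed by topic name

# First run: nothing stored yet, so reading starts at the beginning.
first_start = starting_offset(stored, "events")
first_read = topic_messages[first_start:first_start + 2]
stored["events"] = first_start + len(first_read)

# Later run: resume where the consumer left off.
resume_start = starting_offset(stored, "events")

# Explicit replay: ignore the stored position and start from 0 again.
replay_start = starting_offset(stored, "events", start_over=True)

print(first_start, first_read, resume_start, replay_start)
```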

APACHE KAFKA DISTRIBUTED ARCHITECTURE

At the heart of Apache Kafka is a cluster, which can consist of hundreds of independent brokers.
Closely associated with the Kafka cluster is a ZooKeeper environment, which provides the brokers within the cluster with the metadata they need to operate reliably at scale. Because this metadata is constantly changing, ongoing connectivity and chatter between the cluster members and ZooKeeper are required.

CONTROLLER ELECTION

The hierarchy starts with a controller (supervisor).
It is a worker node elected by its peers to officiate in the administrative capacity of a controller.
The worker node selected as controller is the one that has been around the longest.
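The selection rule described above can be sketched as follows. This models only the "longest-running node wins" description in these notes: `Broker`, `started_at`, and `elect_controller` are hypothetical names. In a ZooKeeper-based Kafka cluster, the controller is in practice whichever broker first registers itself as controller in ZooKeeper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Broker:
    broker_id: int
    started_at: float  # hypothetical start time in seconds; smaller means older


def elect_controller(brokers):
    """Pick the broker that has been running longest (earliest start time)."""
    if not brokers:
        raise ValueError("cannot elect a controller from an empty cluster")
    # Ties on start time are broken by the lower broker id.
    return min(brokers, key=lambda b: (b.started_at, b.broker_id))


cluster = [Broker(1, 300.0), Broker(2, 100.0), Broker(3, 200.0)]
controller = elect_controller(cluster)

# If the controller fails, the remaining brokers elect a replacement.
survivors = [b for b in cluster if b != controller]
next_controller = elect_controller(survivors)

print(controller.broker_id, next_controller.broker_id)
```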

RESPONSIBILITIES OF THE CONTROLLER

Maintain an inventory of which workers are available to take on work.
Maintain a list of work items that have been committed to and assigned to workers.
Maintain the active status of the staff and their progress on assigned tasks.
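The three bookkeeping duties above can be sketched as one small class. This is a toy model under stated assumptions, not how a Kafka controller is implemented: the `Controller` class, its least-loaded assignment policy, and the worker and item names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    available: set = field(default_factory=set)      # workers able to take work
    assignments: dict = field(default_factory=dict)  # work item -> worker
    status: dict = field(default_factory=dict)       # work item -> progress label

    def register(self, worker):
        self.available.add(worker)

    def assign(self, item):
        if not self.available:
            raise RuntimeError("no workers available")
        # Count current load per available worker, then pick the least loaded
        # (ties go to the alphabetically first worker name).
        load = {w: 0 for w in self.available}
        for w in self.assignments.values():
            if w in load:
                load[w] += 1
        worker = min(sorted(load), key=load.get)
        self.assignments[item] = worker
        self.status[item] = "assigned"
        return worker

    def worker_lost(self, worker):
        """Drop a worker from the inventory and reassign its work items."""
        self.available.discard(worker)
        orphaned = [i for i, w in self.assignments.items() if w == worker]
        return [(item, self.assign(item)) for item in orphaned]


team = Controller()
team.register("w1")
team.register("w2")
placed = [team.assign(item) for item in ["p0", "p1", "p2"]]
moved = team.worker_lost("w1")
print(placed, moved)
```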

Apache Kafka is a distributed commit log service.
It functions much like a publish/subscribe messaging system.
Better throughput than most traditional messaging systems.
Built-in partitioning, replication, and fault tolerance.
Increasingly popular for log collection and stream processing.

Need for stream storage

Decouple producers & consumers
Persistent buffer
Collect multiple streams
Preserve client ordering
Parallel consumption
Streaming Map Reduce
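Two items in this list, preserving client ordering and parallel consumption, can coexist when messages are split into partitions by key. The sketch below is a minimal illustration of that idea: `partition_for`, `publish`, and `values_for` are hypothetical helpers, and CRC32 is used only as a convenient deterministic hash, not as the hash any particular system uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for this sketch


def partition_for(key, num_partitions):
    """Map a key to a partition with a deterministic hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(events, num_partitions):
    """Append each (key, value) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def values_for(partitions, key, num_partitions):
    """Read back one client's values from the partition that holds them."""
    partition = partitions[partition_for(key, num_partitions)]
    return [value for k, value in partition if k == key]


events = [("alice", 1), ("bob", 1), ("alice", 2), ("bob", 2), ("alice", 3)]
partitions = publish(events, NUM_PARTITIONS)

alice_values = values_for(partitions, "alice", NUM_PARTITIONS)
bob_values = values_for(partitions, "bob", NUM_PARTITIONS)
print(alice_values, bob_values)
```

All events for one key land in the same partition in arrival order, so each client's sequence is preserved, while different partitions can be read by different consumers at the same time.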

Amazon SQS

Amazon Simple Queue Service (SQS) is a fully managed message queuing service that makes it easy to decouple and scale microservices, distributed systems, and serverless applications.
Building applications from individual components that each perform a discrete function improves scalability and reliability, and is best practice design for modern applications.

After collecting the data, we need to store it in a data store. There are different types of data store.

Types of data store

In-memory : Caches, data structure servers
Database : SQL & NoSQL databases
Search : Search engines
File Store : File systems
Queue : Message queues
Stream storage: pub/sub message queues
