Apache Kafka — what is it?
If you’ve spent any time reading tech blogs or listening to podcasts about modern tech stacks, you’ve surely heard of Apache Kafka, most likely described as a "streaming platform", a "distributed log", or a "message queue". Each of those labels is true in its own way, but there’s a lot more to it. In this post we'd like to introduce what Apache Kafka actually is, what it's used for, and why we’re sure it’s a technology that's here to stay.
It's all about communication: Kafka acts as a message broker
When it comes to larger software systems, services need a way to exchange information with each other. That sounds trivial, but doing it reliably between many systems usually requires a dedicated piece of middleware: a so-called ‘message broker’ that enables you to do just that.
This is where Kafka comes in. It acts as the central piece of infrastructure that handles all the messages your various applications send and receive.
Basic concepts of Kafka messaging
As shown above, Kafka can act as a message broker that receives messages from different applications and gathers them in one central place. Your applications can publish messages to it, while other applications register with Kafka and have the messages they are interested in delivered to them.
We call the components that are generating new messages "producers" and those waiting for messages to arrive "consumers".
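To make the producer/consumer relationship concrete, here is a toy, in-memory sketch of the idea (the `Broker` class and its methods are purely illustrative and not Kafka's actual API). Producers only ever append messages; consumers read from a position of their choosing, and nothing is removed by reading:

```python
from collections import defaultdict

class Broker:
    """Toy in-memory stand-in for a message broker: one append-only list per topic."""
    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        # Producers only ever append; existing messages are never modified.
        self.topics[topic].append(message)

    def fetch(self, topic, offset):
        # Consumers read from a position (offset) onwards; nothing is deleted.
        return self.topics[topic][offset:]

broker = Broker()
broker.publish("log-entries", "service-a started")
broker.publish("log-entries", "service-b started")

# A consumer tracks its own offset and polls for messages it hasn't seen yet.
offset = 0
messages = broker.fetch("log-entries", offset)
offset += len(messages)
print(messages)  # ['service-a started', 'service-b started']
```

The key takeaway is that the broker decouples the two sides: the producer doesn't know who (if anyone) is reading, and the consumer only needs to remember how far it has read.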
In its simplest form, data flows just as you see it in the infographic below:
Kafka topics: structure your data flow
As depicted in our first infographic, several applications can interact with a single Apache Kafka broker in various ways. To structure the data flow and differentiate between message types, Kafka introduces the concept of topics.
You can think of a topic as a First-In-First-Out (FIFO) queue: a producer may only append new messages to a topic, while a consumer subscribed to the topic receives those messages in order.
A noteworthy difference from other message queue systems is that Kafka does not delete messages from a topic once they have been consumed. Instead, you can configure per topic if and when messages should be deleted.
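As a sketch of what that per-topic configuration looks like in practice, retention can be set with the `kafka-configs.sh` tool that ships with Kafka. The broker address `localhost:9092` and the topic name `log-entries` are assumptions here; substitute your own:

```shell
# Keep messages in the log-entries topic for 7 days (604800000 ms),
# regardless of whether they have already been consumed.
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name log-entries \
  --add-config retention.ms=604800000
```

Setting `retention.ms=-1` would keep messages indefinitely, which is what enables the replay use cases discussed later in this post.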
Setting up multiple producers & consumers
You might also have noticed that topics can have multiple producers and consumers attached to them. This allows you, for example, to set up a global "log-entries" topic and have all of your applications push their logs into it.
You can then deploy a consumer application that handles these messages and updates the database behind your system’s log dashboard.
A second consumer application, consuming the same data, can perform anomaly detection to catch problems early on.
Learn from your data
Now that your log information is retained in the "log-entries" topic, you have new possibilities to analyze it later on.
Let's say an incident caused an outage. After recovering the system, you now have time to go through all the log information and identify the signals that should have warned you before things actually got messy.
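Such a post-incident analysis is only possible because the topic still holds the full history. A small sketch of what that analysis could look like, using a hypothetical in-memory list of log entries in place of messages replayed from the topic:

```python
# Because the topic retains messages, the whole history can be replayed
# from the beginning after an incident. Here we model the replayed
# messages as a plain list of log entries (illustrative data only).
log_entries = [
    {"level": "INFO",  "msg": "request served"},
    {"level": "WARN",  "msg": "response time degrading"},
    {"level": "WARN",  "msg": "connection pool nearly exhausted"},
    {"level": "ERROR", "msg": "service unavailable"},
]

# Find everything that happened before the first ERROR and collect the
# warnings that could have tipped us off earlier.
first_error = next(i for i, e in enumerate(log_entries) if e["level"] == "ERROR")
early_warnings = [e["msg"] for e in log_entries[:first_error] if e["level"] == "WARN"]
print(early_warnings)
# ['response time degrading', 'connection pool nearly exhausted']
```

In a real setup, the same logic would run over messages re-read from the start of the retained topic rather than a hard-coded list.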
This blog post is part of our Apache Kafka series, and that's it for this time. In the next post we'll dig a bit deeper into how Apache Kafka handles your messages and how it provides high availability and scalability. We'll discuss a few more fundamental concepts and get a better understanding of Kafka’s most important internals.
Get in touch with us
Did you find this post helpful, or maybe even have some constructive feedback? Get in touch and let’s continue the discussion.