If you've spent any time at all reading tech blogs or listening to podcasts discussing the latest tech stacks, you've surely heard of Apache Kafka. Most likely in the context of it being a "streaming platform", a "distributed log" or a "message queue" — and yes, each of these is true in its own way, but there's a lot more to it. So we'd like to give you an introduction to what Apache Kafka actually is, its use cases, and why we're sure it's a technology that's here to stay.
When it comes to larger software systems, services need a way to transfer information between each other. Sounds trivial, but as soon as several independent services need to exchange messages reliably, you'll want a so-called "message broker" that enables you to do just that.
This is where Kafka comes in. It's the central piece of infrastructure that handles all of the messages you'd like to send and receive from your various applications.
As shown above, Kafka can act as a message broker — it sends and receives messages between different applications and gathers them in one central place. Producing applications push their messages into it, while consuming applications register with Kafka and get the messages they are interested in delivered to them.
We call the components that are generating new messages "producers" and those waiting for messages to arrive "consumers".
In its simplest form, data flows just like you see it in the infographic below:
As depicted in our first infographic, several applications can interact in various ways with a single Apache Kafka broker. To structure the data flow and differentiate between message types, Kafka introduces the concept of topics.
You can think of a topic as a First-In-First-Out queue. A producer may only append new messages to a topic, whereas a consumer subscribing to the topic will receive those messages.
A noteworthy difference from other message queue systems is that Kafka won't delete messages from a topic once they've been consumed. Instead, you can define if and when messages should be deleted for each topic.
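To make these two properties concrete — FIFO ordering and messages surviving consumption — here is a minimal in-memory sketch of a topic. This is illustrative only and not the real Kafka client API (in practice you'd use a client library such as the official Java client or kafka-python); the `Topic` and `Consumer` names are invented for this example.

```python
class Topic:
    """An append-only log of messages, read in FIFO order."""

    def __init__(self, name):
        self.name = name
        self.log = []  # messages stay here even after being consumed

    def append(self, message):
        """What a producer does: append a message to the end of the log."""
        self.log.append(message)
        return len(self.log) - 1  # the message's position (its "offset")


class Consumer:
    """Reads a topic sequentially, tracking its own position (offset)."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0

    def poll(self):
        """Return the next unread message, or None if caught up."""
        if self.offset < len(self.topic.log):
            message = self.topic.log[self.offset]
            self.offset += 1
            return message
        return None


log_entries = Topic("log-entries")
log_entries.append("app-a: started")
log_entries.append("app-b: request handled")

consumer = Consumer(log_entries)
first = consumer.poll()   # "app-a: started" — FIFO: oldest message first
second = consumer.poll()  # "app-b: request handled"

# Unlike a classic queue, consuming did not remove anything from the topic:
print(len(log_entries.log))  # still 2
```

Note that the consumer, not the broker, tracks how far it has read — this is essentially how Kafka consumers work with offsets, and it is what makes the retention behavior above possible.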
You might also have noticed that topics can have multiple producers and consumers attached to them. This lets you, for example, set up a global "log-entries" topic and have all of your applications push their logs into it.
On the other hand, you can deploy a consumer application that handles the messages and updates the database for your system's log dashboard.
A second consumer application, consuming the same data, can perform abnormality detection to ensure it will detect problems early on.
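The key point in this setup is that both consumers read the same messages independently, each at its own pace. Continuing the in-memory sketch from above (again, invented names, not the real Kafka API), two consumers over one shared log might look like this:

```python
class Consumer:
    """Tracks its own read position over a shared, append-only log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        if self.offset < len(self.log):
            message = self.log[self.offset]
            self.offset += 1
            return message
        return None


# One shared "log-entries" topic, two independent consumers.
log_entries = ["disk usage 40%", "disk usage 85%", "disk usage 97%"]

dashboard = Consumer(log_entries)  # updates the log dashboard's database
detector = Consumer(log_entries)   # scans the same stream for anomalies

rows = []
while (message := dashboard.poll()) is not None:
    rows.append(message)  # e.g. insert into the dashboard DB

alerts = []
while (message := detector.poll()) is not None:
    # A toy "abnormality detection": flag disk usage above 90%.
    if int(message.rstrip("%").split()[-1]) > 90:
        alerts.append(message)

print(rows)    # all three entries reached the dashboard
print(alerts)  # only "disk usage 97%" triggered an alert
```

Because each consumer keeps its own offset, adding the anomaly detector later doesn't affect the dashboard at all — a new consumer simply starts reading the topic on its own.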
Now that your log information is durably stored in the "log-entries" topic, you have new possibilities to analyze it later on.
Let's say an incident caused an outage. After recovering the system, you now have time to analyze all the log information and identify the signals that should have warned you before things actually got messy.
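In Kafka terms, this post-mortem is just a replay: since the topic retains its history, a consumer can rewind to the beginning and re-read everything. A hedged sketch of the idea, with made-up log lines and the same in-memory model as before:

```python
# The topic's full history is still available after the outage.
log_entries = [
    "08:00 INFO  service started",
    "08:05 WARN  connection pool nearly exhausted",
    "08:07 WARN  request latency rising",
    "08:09 ERROR service unavailable",
]

offset = len(log_entries)  # where our consumer stood when the outage hit
offset = 0                 # rewind to the start (in Kafka: seek to the beginning)

# Re-read the whole history and pick out the early warning signs.
missed_signals = [entry for entry in log_entries[offset:] if "WARN" in entry]
print(missed_signals)  # the two warnings that preceded the outage
```

With a traditional queue that deletes messages on delivery, this kind of after-the-fact analysis simply wouldn't be possible.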
This blog post is part of our Apache Kafka series, and that's it for this time. In the next post we'll dig a bit deeper into how Apache Kafka handles your messages and how it can provide you with high availability and scalability. We'll discuss some more fundamental concepts and get a better understanding of Kafka's most important internals.
Did you find this post helpful, or maybe even have some constructive feedback? Get in touch and let's continue the discussion.