A Practical Guide to Kafka Integration with Spring Boot
Apache Kafka is an open-source distributed event streaming platform that is widely used for building scalable, high-throughput, real-time data pipelines. It is often integrated with Spring Boot applications to enable messaging between microservices or other components. In this post, we will walk through integrating Kafka with a Spring Boot application, covering the components and configuration involved.
Setting Up a Kafka Cluster
Before diving into the Spring Boot implementation, it is necessary to set up a Kafka cluster. A Kafka cluster typically consists of several brokers that work together to manage the storage and replication of topic partitions. The cluster can be set up on-premises or on cloud platforms like Amazon Web Services (AWS) or Google Cloud Platform (GCP). For simplicity, we will use a single-node Kafka cluster running locally.
To set up a Kafka cluster locally, follow these steps:
1. Download Kafka from the official website: https://kafka.apache.org/downloads.
2. Extract the downloaded archive to a directory on your local machine.
3. Navigate to the Kafka directory and start the ZooKeeper server using the following command:
bin/zookeeper-server-start.sh config/zookeeper.properties
4. Start the Kafka server using the following command:
bin/kafka-server-start.sh config/server.properties
5. Create a topic using the following command:
bin/kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092
This will create a topic named my-topic with a single partition on the Kafka cluster running locally.
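To verify the setup, you can exchange a few messages using the console clients that ship with Kafka (this assumes the broker from the steps above is still running on localhost:9092). Start a console producer in one terminal:
bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092
Then start a console consumer in another terminal:
bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092
Any line typed into the producer terminal should appear in the consumer terminal.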
Setting Up a Kafka Cluster with Docker Compose
- Install Docker and Docker Compose on your local machine if you haven’t already done so.
- Create a new directory for your Kafka project and navigate into it.
- Create a new file called docker-compose.yml in your project directory and add the following code:
version: '2'
services:
  zookeeper:
    image: zookeeper:3.4.9
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka:2.13-2.7.0
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    depends_on:
      - zookeeper
This Docker Compose file defines two services: a ZooKeeper service and a Kafka service. The ZooKeeper service runs the official ZooKeeper image and exposes port 2181, while the Kafka service runs the Wurstmeister Kafka image and exposes port 9092. The Kafka service also sets two environment variables: KAFKA_ZOOKEEPER_CONNECT points the broker at the ZooKeeper service for coordination, and KAFKA_ADVERTISED_LISTENERS advertises localhost:9092 so that clients running on the host machine can reach the broker.
- Run docker-compose up -d to start the Kafka cluster in detached mode.
- You should now be able to connect to the Kafka cluster at localhost:9092.
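As a quick check, the Kafka command-line tools from the earlier download can talk to the Dockerized broker as well, for example to list the existing topics:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092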
You can read more about Docker Compose here: https://docs.docker.com/compose/
Kafka Components
Kafka consists of several components that work together to enable messaging between producers and consumers. These components include topics, partitions, producers, consumers, and brokers.
Topics
A topic is a category or feed name to which messages are published by producers. A topic can have one or more partitions, and each partition is ordered and can be replicated across multiple brokers.
Partitions
A partition is a unit of parallelism in Kafka. Each partition is an ordered, immutable sequence of messages that is continuously appended to. A topic can have one or more partitions, and each partition can be replicated across multiple brokers for fault tolerance.
Producers
A producer is a component that publishes messages to a Kafka topic. A producer can publish messages synchronously or asynchronously, and can also specify a partition key to determine which partition a message should be sent to.
Consumers
A consumer is a component that subscribes to one or more Kafka topics and consumes messages from them. A consumer can consume messages from one or more partitions in a topic, and can also specify a consumer group to enable load balancing and fault tolerance.
Brokers
A broker is a Kafka server that manages the storage and replication of topic partitions. A Kafka cluster typically consists of multiple brokers that work together to manage the entire cluster.
Spring Boot Configuration for Kafka
Now that we have a basic understanding of the Kafka components, we can move on to the Spring Boot implementation. To use Kafka in a Spring Boot application, we need to include the spring-kafka dependency in our pom.xml file:
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
    <version>${spring-kafka.version}</version>
</dependency>
Configuration Properties
Spring Boot provides a convenient way to configure Kafka properties using the application.properties file or the application.yml file. We can configure properties such as the Kafka broker address, the consumer group ID, and the serializer and deserializer classes, among many others.
Here are some of the most commonly used Kafka properties:
spring.kafka.bootstrap-servers=localhost:9092
spring.kafka.consumer.group-id=my-group
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.apache.kafka.common.serialization.StringSerializer
These properties can be accessed in our Spring Boot application using the @Value annotation or the @ConfigurationProperties annotation.
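If you prefer application.yml, the same settings translate directly into YAML form:
spring:
  kafka:
    bootstrap-servers: localhost:9092
    consumer:
      group-id: my-group
      auto-offset-reset: earliest
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer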
KafkaTemplate
The KafkaTemplate
is a class provided by the Spring Kafka library that simplifies the process of sending messages to a Kafka topic. It provides a higher-level abstraction over the raw Producer
API and allows us to send messages asynchronously or synchronously.
@Service
public class MessageProducer {
    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    // Fire-and-forget: send() hands the record to the producer and returns immediately
    public void sendMessage(String message) {
        kafkaTemplate.send("my-topic", message);
    }
}
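Since send() returns a future (a ListenableFuture in spring-kafka 2.x, a CompletableFuture in 3.x), a synchronous send simply blocks on the result. As a minimal sketch, a method like the following could be added to the service above; it also uses the keyed overload of send() so that messages with the same key always land on the same partition (the method name sendMessageSync is our own):
// Blocks until the broker acknowledges the write; get() works on either future type
public void sendMessageSync(String key, String message) throws Exception {
    kafkaTemplate.send("my-topic", key, message).get();
}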
KafkaListener
The @KafkaListener
annotation is used to create a consumer that listens to a Kafka topic and consumes messages from it. It is applied to a method that takes a single parameter of the type ConsumerRecord
.
@Component
public class MessageConsumer {
    @KafkaListener(topics = "my-topic", groupId = "my-group")
    public void receiveMessage(ConsumerRecord<String, String> record) {
        System.out.println("Received message: " + record.value());
    }
}
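If only the payload matters, the listener method can instead declare the deserialized value directly as its parameter:
// Spring passes the deserialized message value straight to the method
@KafkaListener(topics = "my-topic", groupId = "my-group")
public void receivePayload(String message) {
    System.out.println("Received payload: " + message);
}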
Kafka Streams
Kafka Streams is a library provided by Kafka that allows us to process and analyze streams of data in real time. It provides a high-level DSL for building stream processing applications.
The Streams support in Spring Boot comes from the spring-kafka dependency we already added; in addition, we need the kafka-streams library in our pom.xml file:
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
</dependency>
We can then create a Kafka Streams topology using the StreamsBuilder class:
@Configuration
@EnableKafkaStreams
public class KafkaStreamsConfig {

    // @EnableKafkaStreams picks up the KafkaStreamsConfiguration bean registered under this name
    @Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
    public KafkaStreamsConfiguration kStreamsConfig() {
        Map<String, Object> config = new HashMap<>();
        config.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        return new KafkaStreamsConfiguration(config);
    }

    // The framework supplies the StreamsBuilder; we only declare the topology
    @Bean
    public KStream<String, String> kStream(StreamsBuilder builder) {
        KStream<String, String> stream = builder.stream("my-input-topic");
        stream.mapValues(value -> value.toUpperCase()).to("my-output-topic");
        return stream;
    }
}
This example creates a Kafka Streams topology that reads from an input topic named my-input-topic, converts each message value to uppercase, and writes the result to an output topic named my-output-topic.
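A simple way to see the topology in action is to publish a few messages to my-input-topic (for instance with the KafkaTemplate shown earlier) and log what the streams application emits; a hypothetical verification listener might look like this:
// Hypothetical helper: logs each record the streams app writes to my-output-topic
@KafkaListener(topics = "my-output-topic", groupId = "streams-verifier")
public void verifyOutput(String message) {
    System.out.println("Streams output: " + message);
}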
In conclusion, Kafka is a powerful and popular messaging platform that is widely used in the industry. It is often integrated with Spring Boot applications to enable messaging between microservices or other components. In this post, we discussed how to integrate Kafka with a Spring Boot application, including the components and configuration involved. We also provided examples of how to use the KafkaTemplate, the @KafkaListener annotation, and Kafka Streams in a Spring Boot application. By following these steps and examples, developers can implement Kafka in their Spring Boot applications and start building scalable, high-throughput data pipelines.
In the last post, we discussed how to automate GitHub repository setup: setting a PAT, username, and email for multiple GitHub accounts. You can read that post here.
Happy Learning!