In standard APIs, if the server is busy, the user waits. With Kafka, the server "publishes" a message to a topic and moves on. A separate "consumer" processes that message whenever it’s ready. This project demonstrates this Pub-Sub (Publisher-Subscriber) model.
Apache Kafka is more than just a "messaging queue." It is a distributed event streaming platform designed to handle trillions of events a day. Think of it as a giant, unbreakable digital ledger where data is written as it happens and can be read by anyone authorized to see it.
At its simplest level, Kafka consists of three main players:
- Producers: Applications that send (write) data to Kafka.
- Brokers (The Cluster): The servers that store the data. Kafka usually runs as a cluster of multiple brokers for safety.
- Consumers: Applications that read (process) data from Kafka.
This is where the "magic" happens. Kafka doesn't just throw all data into one big bucket.
A Topic is like a folder in a filesystem. If you are building an e-commerce app, you might have a topic for orders, one for payments, and one for shipping.
A Topic is split into Partitions. This is the key to Kafka's massive scale.
- Parallelism: Different consumers can read different partitions at the same time.
- Order: Within a single partition, messages are strictly ordered by arrival (First-In, First-Out).
- Immutability: Once a message is written to a partition, it can never be changed.
Every message in a partition gets an ID number called an Offset. Consumers use this number to "bookmark" where they left off reading.
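The offset mechanics above can be sketched in plain Java as an append-only list, where a record's index is its offset. This is a toy model for illustration, not Kafka's actual storage format:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal in-memory sketch of one Kafka partition: an append-only
// list where the index of each record is its offset.
public class PartitionSketch {
    private final List<String> log = new ArrayList<>();

    // Producers append to the end; the returned offset identifies the record.
    public long append(String message) {
        log.add(message);
        return log.size() - 1;
    }

    // Consumers read from a bookmarked offset; records are never mutated.
    public String read(long offset) {
        return log.get((int) offset);
    }

    public static void main(String[] args) {
        PartitionSketch p = new PartitionSketch();
        long first = p.append("order-created");
        long second = p.append("order-paid");
        System.out.println(first + " " + second + " " + p.read(0));
        // prints: 0 1 order-created
    }
}
```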
Unlike a traditional database that uses complex "B-Trees" to store data, Kafka uses a Distributed Commit Log.
- Append Only: When a Producer sends a message, Kafka simply sticks it at the very end of the partition file. This is incredibly fast because the disk head doesn't have to move around to find a spot.
- Sequential I/O: Reading and writing sequentially is orders of magnitude faster than random access.
- Zero-Copy: Kafka uses a Linux kernel feature called `sendfile` to move data directly from the disk to the network card without the CPU having to copy it through user space. This is a big part of why Kafka is so fast.
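The same zero-copy path is exposed in Java via `FileChannel.transferTo`, which delegates to `sendfile(2)` on Linux. A small self-contained sketch, using temp files to stand in for a log segment and a network socket:

```java
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Demonstrates the zero-copy path Kafka relies on: FileChannel.transferTo
// delegates to sendfile(2) on Linux, so bytes flow disk -> kernel buffer ->
// destination without being copied through user space.
public class ZeroCopyDemo {

    // Writes payload to a temp "segment", transfers it with transferTo,
    // and returns the number of bytes moved.
    static long transfer(String payload) throws Exception {
        Path segment = Files.createTempFile("segment", ".log");
        Path socketSim = Files.createTempFile("socket", ".out");
        Files.write(segment, payload.getBytes(StandardCharsets.UTF_8));

        try (FileChannel in = FileChannel.open(segment, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(socketSim, StandardOpenOption.WRITE)) {
            return in.transferTo(0, in.size(), out); // zero-copy transfer
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bytes transferred: " + transfer("event-1\nevent-2\n"));
    }
}
```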
Kafka is designed to stay alive even if a server (Broker) catches fire.
- Leader and Followers: Each partition has one Leader broker and several Followers.
- Writes: All writes go to the Leader.
- Replication: The Followers automatically copy the data from the Leader.
- Failover: If the Leader broker dies, one of the Followers is instantly "promoted" to be the new Leader.
If you have millions of messages, one consumer app might be too slow. You can put multiple consumers into a Consumer Group.
- Kafka will automatically assign different partitions to different consumers in the group.
- If one consumer crashes, Kafka re-assigns its partitions to the remaining healthy consumers. This is called Rebalancing.
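A toy sketch of that assignment logic in plain Java. Kafka's real assignors are more sophisticated (range, round-robin, cooperative-sticky); this only shows partitions being re-spread when a group member leaves:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of how Kafka spreads partitions across a consumer group,
// and re-spreads them when a consumer leaves (rebalancing).
public class RebalanceSketch {
    static Map<String, List<Integer>> assign(List<String> consumers, int partitions) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        consumers.forEach(c -> out.put(c, new ArrayList<>()));
        for (int p = 0; p < partitions; p++) {
            // Round-robin: partition p goes to consumer (p mod group size).
            out.get(consumers.get(p % consumers.size())).add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> group = new ArrayList<>(List.of("c1", "c2", "c3"));
        System.out.println(assign(group, 6)); // {c1=[0, 3], c2=[1, 4], c3=[2, 5]}
        group.remove("c2");                   // c2 crashes -> rebalance
        System.out.println(assign(group, 6)); // {c1=[0, 2, 4], c3=[1, 3, 5]}
    }
}
```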
- Decoupling: Your "Order Service" doesn't need to know that the "Email Service" exists. It just sends a message to Kafka, and the Email Service picks it up whenever it's ready.
- Buffering: If your database is slow, Kafka can hold onto the messages (for days or weeks) until the database catches up.
- Real-time Processing: You can analyze data as it flies through Kafka (e.g., detecting credit card fraud in milliseconds).
The Producer uses KafkaTemplate<String, String> to send data to a Topic. Think of a Topic like a category or a folder in a filing cabinet.
- Key Point: The Producer doesn't care who is listening or if the listener is even online. It just delivers to the Kafka Broker.
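The producer service described here can be sketched roughly as follows. This is not standalone code: it assumes a Spring Boot project with the `spring-kafka` dependency, Lombok's `@Slf4j`, and the topic name used throughout this article:

```java
import lombok.extern.slf4j.Slf4j;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class KafkaProducerService {
    private static final String TOPIC = "my_topic";
    private final KafkaTemplate<String, String> kafkaTemplate;

    public KafkaProducerService(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void sendMessage(String message) {
        log.info("Sending message to Kafka : {}", message);
        kafkaTemplate.send(TOPIC, message); // asynchronous; returns a future
    }
}
```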
The Consumer uses the @KafkaListener annotation.
- Topic: It "subscribes" to `my_topic`.
- Group ID: `gfg-group`. Kafka uses Group IDs to manage which consumer gets which message, allowing you to scale up by adding more consumers to the same group.
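Putting those pieces together, the consumer can be sketched as below. Again a hedged sketch: it assumes `spring-kafka` and Lombok on the classpath, plus a running broker, and mirrors the topic and group names used in this article:

```java
import lombok.extern.slf4j.Slf4j;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class KafkaConsumerService {
    // Invoked automatically whenever a new message arrives in the topic.
    @KafkaListener(topics = "my_topic", groupId = "gfg-group")
    public void consume(String message) {
        log.info("Successfully received message from Kafka : {}", message);
    }
}
```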
Since data travels over the network, it must be converted:
- Serializer: Converts Java Strings into bytes (Producer side).
- Deserializer: Converts bytes back into Java Strings (Consumer side).
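In essence, the configured `StringSerializer`/`StringDeserializer` just UTF-8 encode on the producer side and decode on the consumer side:

```java
import java.nio.charset.StandardCharsets;

// What StringSerializer/StringDeserializer do under the hood:
// UTF-8 encode before the wire, UTF-8 decode after it.
public class SerdeSketch {
    static byte[] serialize(String value) {          // producer side
        return value.getBytes(StandardCharsets.UTF_8);
    }

    static String deserialize(byte[] wire) {         // consumer side
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] wire = serialize("HelloKafka");
        System.out.println(wire.length + " bytes -> " + deserialize(wire));
        // prints: 10 bytes -> HelloKafka
    }
}
```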
We configured these in `application.yml` to keep our Java code clean.
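A minimal `application.yml` matching this setup might look like the fragment below. Note the standard Spring Boot property key is `bootstrap-servers` (plural); the class names are the stock serializers shipped with the Kafka client:

```yaml
spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: gfg-group
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
```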
Provides a REST endpoint (/api/kafka/publish) so we can trigger a message push using a simple URL parameter. This acts as the entry point for our event-driven flow.
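A hedged sketch of such a controller, assuming the producer service and response string described in this article (requires a Spring Boot web project; not standalone-runnable):

```java
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/kafka")
public class KafkaController {
    private final KafkaProducerService producerService;

    public KafkaController(KafkaProducerService producerService) {
        this.producerService = producerService;
    }

    // POST /api/kafka/publish?message=HelloKafka
    @PostMapping("/publish")
    public String publish(@RequestParam("message") String message) {
        producerService.sendMessage(message); // delegate; keep controller thin
        return "Message send to kafka successfully!";
    }
}
```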
Instead of writing dozens of lines of Java configuration (as seen in the commented-out KafkaConfig files), we use Spring Boot's Auto-Configuration. By defining `bootstrap-servers: localhost:9092` in `application.yml`, Spring automatically builds the `KafkaTemplate` for us.
| Component | Detailed Responsibility & Implementation |
|---|---|
| application.yaml | Producer: Sends messages to Kafka topics. Serializers convert Java objects → bytes before sending. Here, both key and value are plain strings. Consumer: Reads messages from Kafka topics. Deserializers convert bytes → Java objects. group-id ensures multiple consumers can share work. auto-offset-reset=earliest → consumer starts from the beginning if no offset exists. bootstrap-servers: Entry point to the Kafka cluster. In production, you'd list multiple brokers for fault tolerance. |
| KafkaProducerService | KafkaTemplate: Spring Boot abstraction for sending messages to Kafka. Handles serialization (String → bytes) and broker communication. Topic: A logical channel in Kafka where producers publish and consumers subscribe. Here, "my_topic" is the topic name. Logging with @Slf4j: Provides a logger (log.info) to track message flow. Useful for debugging and monitoring. Producer Flow: sendMessage("Hello Kafka") → logs message → sends to Kafka broker → stored in topic "my_topic". |
| KafkaConsumerService | @KafkaListener: Makes the method a Kafka consumer. Automatically triggered when new messages arrive in the topic. Topic ("my_topic"): Logical channel where producers publish and consumers subscribe. Consumer Group ("gfg-group"): Ensures multiple consumers can share work. Kafka distributes messages among consumers in the same group. Logging with @Slf4j: Provides a logger (log.info) to track incoming messages. Useful for debugging and monitoring. |
| KafkaController | @RestController: Makes this a REST API controller → methods return JSON/text directly. @RequestMapping("/api/kafka"): Sets base path → all endpoints start with /api/kafka. @PostMapping("/publish"): Defines POST endpoint for publishing messages. @RequestParam("message"): Extracts message from request query string. Service Delegation: Controller calls KafkaProducerService.sendMessage() → keeps controller lightweight. |
| SpringbootKafkaExampleApplication | @SpringBootApplication: Marks this as the main Spring Boot app → auto-configures everything. SpringApplication.run(): Starts the application → loads beans, controllers, services, and Kafka configs. Integration: KafkaController → exposes REST endpoint /api/kafka/publish; KafkaProducerService → sends messages; KafkaConsumerService → listens and processes messages. |
| KafkaProducerConfig | ProducerFactory: Creates Kafka producer instances with given configuration. KafkaTemplate: Simplifies sending messages to Kafka topics. Serializers: Convert Java objects → bytes before sending to Kafka. Manual Config vs application.yml: Old way → define beans manually in config class. New way → just configure in application.yml and Spring Boot auto-configures. |
| KafkaConsumerConfig | ConsumerFactory: Creates Kafka consumer instances with given configuration. ConcurrentKafkaListenerContainerFactory: Manages listener containers for @KafkaListener methods. @EnableKafka: Enables Kafka listener support in Spring Boot. Deserializers: Convert Kafka message bytes → Java objects (here, Strings). Manual Config vs application.yml: Old way → define beans manually in config class. New way → just configure in application.yml and Spring Boot auto-configures. |
This is a classic Spring Boot + Apache Kafka integration example representing a basic Producer-Consumer pattern.
Here is a complete breakdown of the application flow using architectural and sequence diagrams based on the code provided.
This diagram shows the main components and how data moves between them at a system level.
- REST Client: The external trigger (like Postman or a browser) that sends an HTTP POST request.
- KafkaController: The entry point into the Spring Boot app. It exposes the REST endpoint.
- KafkaProducerService: The service layer responsible for business logic related to sending messages. It holds the topic name definition (`my_topic`).
- KafkaTemplate: A Spring-provided helper class that abstracts away the low-level details of the Apache Kafka Producer client (serialization, connection management, asynchronous sending).
- Apache Kafka Broker: The external messaging middleware running on `localhost:9092`. It receives messages, stores them in the `my_topic` topic, and serves them to consumers.
- Spring MessageListenerContainer (Internal): You don't see this class in your code, but Spring Boot creates it automatically because it sees the `@KafkaListener` annotation. It runs in the background, constantly polling the Kafka broker for new messages.
- KafkaConsumerService: Your service containing the specific method decorated with `@KafkaListener` that executes only when a message is successfully read from Kafka.
This diagram shows the precise order of method calls and the asynchronous nature of the communication.
Before any messages flow, the Spring Boot application starts up.
- **Reading `application.yaml`**: Spring Boot detects the `spring.kafka.*` properties.
- **Auto-Configuration**:
  - Because it sees `bootstrap-servers`, it knows how to connect to Kafka.
  - It creates a default `ProducerFactory` and the `KafkaTemplate` bean, configuring it to use string serializers as defined in the YAML.
  - It creates a default `ConsumerFactory` and a `ConcurrentKafkaListenerContainerFactory`.
- **Annotation Scanning**:
  - Spring finds the `@KafkaListener` annotation in `KafkaConsumerService`.
  - It uses the container factory to create a background message listener container that immediately starts trying to connect to `localhost:9092`, subscribe to `my_topic` as part of group `gfg-group`, and poll for messages.
Note on commented code: The KafkaProducerConfig and KafkaConsumerConfig classes you commented out are indeed not needed because Spring Boot's auto-configuration handles all that boilerplate code based on the settings in application.yaml.
- A client (e.g., a browser or Postman) sends a POST request: `http://localhost:8080/api/kafka/publish?message=HelloWorld`.
- The `KafkaController` receives this request and extracts "HelloWorld" from the query parameter.
- The controller calls `kafkaProducerService.sendMessage("HelloWorld")`.
- The `KafkaProducerService` takes the message and passes it to the injected `KafkaTemplate`, specifying the hardcoded topic `"my_topic"`.
- The `KafkaTemplate` handles the heavy lifting: it converts the Java String "HelloWorld" into raw bytes using the configured `StringSerializer` and sends it over the network to the Kafka Broker.
- Crucial: The call to Kafka is asynchronous. The Java thread doesn't wait for Kafka to acknowledge storage to disk before moving on.
- The controller immediately returns the string "Message send to kafka successfully!" to the client with an HTTP 200 status. The HTTP interaction finishes here.
This happens asynchronously, completely separate from the HTTP request thread.
- The background Spring MessageListenerContainer (which has been polling Kafka constantly) notices a new message in `my_topic`.
- It retrieves the raw bytes from Kafka.
- It uses the configured `StringDeserializer` to convert the bytes back into the Java String "HelloWorld".
- It identifies that the method `KafkaConsumerService.consume()` is decorated with `@KafkaListener` for this topic.
- It invokes that method, passing "HelloWorld" as the argument.
- Inside `consume()`, the `log.info(...)` line runs, printing the message to your console.
- Once the method exits successfully, Spring tells Kafka that the message has been processed (commits the offset) so it isn't read again by this consumer group.
Since you have Kafka 4.1.1, you are using the most modern version of Kafka. This version has completely removed the need for Zookeeper. It uses KRaft (Kafka Raft), which is faster, more secure, and easier to manage.
Here is the exact setup and usage guide for Kafka 4.x on Windows using the Command Prompt as Administrator.
Kafka 4.x requires you to "format" your storage with a unique Cluster ID before you can start the server. This replaces the old Zookeeper registration.
- Open Command Prompt as Administrator.
- Navigate to your Kafka folder:
cd C:\kafka_2.13-4.1.1
- Generate a Cluster ID:
.\bin\windows\kafka-storage.bat random-uuid
Copy the long string it produces (e.g., 764c12b6-abcd-1234...).
- Format the Storage Directory:
Replace YOUR_CLUSTER_ID with the string you just copied.
.\bin\windows\kafka-storage.bat format --standalone -t YOUR_CLUSTER_ID -c .\config\server.properties
The --standalone flag is crucial for Kafka 4.x to tell it you are running a single-node cluster.
In Kafka 4.x, you only need one window to run the entire server.
- Start the Broker/Server:
.\bin\windows\kafka-server-start.bat .\config\server.properties
Keep this window open. When you see [KafkaRaftServer nodeId=1] Kafka Server started, your environment is ready.
Open new Command Prompt windows for each of these steps.
| Step | Command (Run from C:\kafka_2.13-4.1.1) |
|---|---|
| 1. Create Topic | .\bin\windows\kafka-topics.bat --create --topic my-first-topic --bootstrap-server localhost:9092 |
| 2. List Topics | .\bin\windows\kafka-topics.bat --list --bootstrap-server localhost:9092 |
| 3. Start Producer | .\bin\windows\kafka-console-producer.bat --topic my-first-topic --bootstrap-server localhost:9092 |
| 4. Start Consumer | .\bin\windows\kafka-console-consumer.bat --topic my-first-topic --from-beginning --bootstrap-server localhost:9092 |
- KRaft Mode: The Broker now acts as its own "Controller." It manages its own metadata (like partition leaders and topic configurations) without needing a separate Zookeeper service.
- Log Storage: In your `server.properties`, the `log.dirs` setting (default `C:\tmp\kraft-combined-logs`) is where the actual data resides.
- Metadata Log: Kafka stores its internal state in a special topic called `__cluster_metadata`. This is how it remembers your topics even after a restart.
- NoSuchFileException: Ensure you are not trying to use a `config\kraft` subfolder. In 4.1.1, `server.properties` is usually directly inside the `config` folder.
- The system cannot find the path specified: This usually means you are in the wrong directory or there is a typo in the path. Always run from the root `C:\kafka_2.13-4.1.1`.
- Reconfiguration failed error: You might see a log error regarding Log4j2. This is a known Windows issue in 4.x; you can ignore it as long as the server reports "Started."
You must have an Apache Kafka broker running on your local machine (usually at localhost:9092). With Kafka 4.x in KRaft mode, Zookeeper is no longer required.
Start SpringbootKafkaExampleApplication. You will see logs indicating that the Kafka Consumer has successfully joined the group gfg-group.
Open Postman or your browser and hit:
POST http://localhost:8080/api/kafka/publish?message=HelloKafka
Check your IDE console. You will see:
INFO: Sending message to Kafka : HelloKafka (from the Producer)
INFO: Successfully received message from Kafka : HelloKafka (from the Consumer)


