Course description
Kafka Fundamentals is a comprehensive course designed to provide you with a deep understanding of Apache Kafka, one of the most popular platforms for building real-time data pipelines and streaming applications. This course is ideal for developers, data engineers, and system architects who want to learn how to design, build, and manage Kafka-based solutions effectively.
The course begins with an exploration of Kafka’s architecture, where you’ll learn how to plan and design your own distributed queue. You’ll address key questions related to message format, consumption patterns, data persistence, and retention, and understand how to support multiple producers and consumers efficiently.
Next, you’ll dive into Kafka topics, console producers, and console consumers, learning how to create topics with multiple partitions, ensure data replication, and manage message order and data skew. You’ll also gain practical experience in optimizing message writing and reading for different use cases, including low latency and maximum compression scenarios.
The course then covers working with Kafka using various programming languages, including Java, Scala, and Python. You'll build simple consumers and producers, manage consumer groups, and handle transactions. This module also explores working with Kafka through a web UI and, for other languages, via REST APIs.
A dedicated module on AVRO and Schema Registry follows, where you’ll learn how to add and manage AVRO schemas, build AVRO consumers and producers, and use Schema Registry to ensure data consistency. You’ll also learn how to handle errors using specific error topics.
In the Spring Boot and Spring Cloud module, you'll learn how to integrate Kafka with Spring applications. You'll write a template for a Spring app, add KafkaTemplate-based producers and consumers, and modify the Spring Boot app to work in asynchronous mode. The course also covers streaming pipelines, where you'll compare Kafka Streams, KSQL, Kafka Connect, Akka Streams, Spark Streaming, and Flink. You'll learn how to build robust streaming pipelines and handle checkpoints, backpressure, and executor management.
Finally, the course concludes with Kafka monitoring, where you’ll learn how to build and manage Kafka metrics using tools like Grafana, ensuring your Kafka deployments are optimized and well-monitored.
Learning Outcomes: By the end of this course, participants will:
- Understand Kafka’s architecture and design distributed queues with optimal performance.
- Efficiently manage Kafka topics, producers, and consumers, ensuring data consistency and performance.
- Integrate Kafka with Java, Scala, Python, and other languages via REST, and handle transactions effectively.
- Implement AVRO schemas and use Schema Registry to manage data serialization and deserialization.
- Build and optimize streaming pipelines using Kafka Streams, KSQL, and other streaming frameworks.
- Monitor Kafka clusters effectively using Grafana and other monitoring tools.
Who should attend?
Prerequisites
Development experience in Java (over 6 months)
Training content
1. Module 1: Kafka Architecture: theory 2h / practice 1.5h
- Planning your own distributed queue in pairs: writing, reading, and persisting data in parallel.
- What's the format and average size of messages?
- Can messages be repeatedly consumed?
- Are messages consumed in the same order they were produced?
- Does data need to be persisted?
- What is data retention?
- How many producers and consumers are we going to support?
2. Module 2: Kafka-topics, console-consumer, console-producer: theory 2h / practice 1.5h
- Using the built-in kafka-topics, console-consumer, and console-producer tools
- Create a topic with 3 partitions and RF = 2 (see the sketch after this list)
- Send a message and check the ISR (in-sync replicas)
- Organize message writing/reading with message order preserved
- Organize message writing/reading without order preservation or hash partitioning
- Organize message writing/reading without data skew
- Read messages from the beginning, the end, and a specific offset
- Read a topic with 2 partitions using 2 consumers in one consumer group (and in different consumer groups)
- Choose the optimal number of consumers for reading a topic with 4 partitions
- Write messages with minimum latency
- Write messages with maximum compression
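As a reference for the topic-creation exercise, here is a minimal Java sketch using the AdminClient API, the programmatic counterpart of the kafka-topics console tool; the broker address and topic name are assumptions:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, replication factor 2 (needs a cluster with at least 2 brokers)
            admin.createTopics(List.of(new NewTopic("demo-topic", 3, (short) 2))).all().get();
        }
    }
}
```

For the latency and compression exercises, the producer settings to experiment with are linger.ms and acks (minimum latency) versus compression.type, e.g. gzip or zstd, combined with a larger batch.size (maximum compression).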
3. Module 3: Web UI + Java, Scala, Python API + other languages (via REST): theory 2h / practice 1.5h
- Build a simple consumer and producer (see the sketch after this list)
- Add one more consumer to the consumer group
- Write a consumer that reads 3 records from the 1st partition
- Add writing to another topic
- Add a transaction
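A minimal sketch of the first two exercises, assuming a local broker, a topic named demo-topic, and a group id demo-group (all assumptions):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleClientSketch {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");            // assumed broker
        p.put("key.serializer", StringSerializer.class.getName());
        p.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            // Records with the same key always land in the same partition.
            producer.send(new ProducerRecord<>("demo-topic", "key-1", "hello"));
        }

        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "demo-group");
        c.put("auto.offset.reset", "earliest");                  // read from the start
        c.put("key.deserializer", StringDeserializer.class.getName());
        c.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(List.of("demo-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> r : records) {
                System.out.printf("partition=%d offset=%d value=%s%n",
                        r.partition(), r.offset(), r.value());
            }
        }
    }
}
```

Starting a second copy of the consumer with the same group.id covers the "one more consumer" exercise: Kafka rebalances the topic's partitions between the two group members.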
4. Module 4: AVRO + Schema Registry: theory 2h / practice 1.5h
- Add an AVRO schema
- Compile the Java class
- Build an AVRO consumer and producer with a specific record
- Add Schema Registry
- Add an error topic for failed records, integrated with Schema Registry
- Build an AVRO consumer and producer with a generic record (see the sketch after this list)
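A hedged sketch of the generic-record producer, using Confluent's KafkaAvroSerializer so the schema is registered in Schema Registry on first send; the schema, topic name, and registry URL are all assumptions:

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerSketch {
    public static void main(String[] args) {
        // Hypothetical schema, for illustration only.
        String schemaJson = "{\"type\":\"record\",\"name\":\"User\","
                + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}";
        Schema schema = new Schema.Parser().parse(schemaJson);

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // assumed broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Confluent's serializer registers/looks up schemas in Schema Registry.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081"); // assumed registry URL

        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "alice");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("users-avro", "key-1", user));
        }
    }
}
```

The specific-record variant differs mainly in that the class compiled from the .avsc schema takes the place of GenericRecord.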
5. Module 5: Spring Boot + Spring Cloud: theory 2h / practice 1.5h
Homework:
- Write a template for a Spring app
- Add a KafkaTemplate with a producer
- Add a KafkaTemplate with a consumer
- Add a REST controller
- Modify the Spring Boot app to work in async (parallel) mode (see the sketch after this list)
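A minimal spring-kafka sketch covering the producer, consumer, and REST-controller items in one class; the topic, group id, and concurrency value are assumptions, and the listener concurrency setting is one way to get the parallel consumption the last item asks for:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MessageController {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public MessageController(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // REST endpoint that publishes to Kafka; send() is asynchronous and
    // returns a future (a CompletableFuture in recent spring-kafka versions).
    @PostMapping("/messages")
    public void publish(@RequestBody String body) {
        kafkaTemplate.send("demo-topic", body);
    }

    // Consumer side: spring-kafka calls this for each record; concurrency > 1
    // runs several listener threads so partitions are processed in parallel.
    @KafkaListener(topics = "demo-topic", groupId = "demo-group", concurrency = "3")
    public void listen(String message) {
        System.out.println("received: " + message);
    }
}
```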
6. Module 6: Streaming Pipelines (Kafka Streams + KSQL + Kafka Connect vs Akka Streams vs Spark Streaming vs Flink): theory 2h / practice 1.5h
Homework:
- Choose a way to read data from a Kafka topic with 50 partitions (see the sketch after this list)
- Try out the checkpoint mechanism
- Start five executors and kill some of them
- Check the backpressure
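As one possible answer among the frameworks compared in this module, here is a minimal Kafka Streams read-transform-write sketch (topic names and application id are assumptions). Kafka Streams creates one stream task per input partition, so a 50-partition topic can be processed by up to 50 tasks spread across application instances; the checkpoint, executor, and backpressure exercises map more directly onto Spark Streaming:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class StreamsPipelineSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "demo-pipeline");      // doubles as the group id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read, transform, write back: one stream task per input partition.
        KStream<String, String> source = builder.stream("input-topic");
        source.mapValues(v -> v.toUpperCase())
              .to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```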
7. Module 7: Kafka Monitoring: theory 2h / practice 1.5h
Homework:
- Build several metrics in Grafana (see the sketch below)
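A small sketch of where Grafana's numbers can come from: every Kafka client exposes the same metrics through its metrics() method and over JMX, which a JMX exporter plus Prometheus typically scrape into Grafana. The broker address, topic, and the "lag" name filter below are assumptions:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ClientMetricsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker
        props.put("group.id", "metrics-demo");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));
            consumer.poll(Duration.ofSeconds(2)); // populate the fetch metrics
            // Print lag-related metrics, e.g. records-lag-max: good candidates
            // for a first Grafana panel.
            for (Map.Entry<MetricName, ? extends Metric> e : consumer.metrics().entrySet()) {
                if (e.getKey().name().contains("lag")) {
                    System.out.println(e.getKey().name() + " = " + e.getValue().metricValue());
                }
            }
        }
    }
}
```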
Total: theory 14h (57%) / practice 10.5h (43%)
Certification / Credits
Objectives
Upon completion of the "Kafka Fundamentals" course, trainees will be able to:
- Design and implement distributed queues using Kafka, with a focus on message format, order, and persistence.
- Create and manage Kafka topics, partitions, and replicas, ensuring optimal performance and reliability.
- Develop and integrate Kafka consumers and producers in Java, Scala, Python, and through REST APIs.
- Use AVRO and Schema Registry to manage data serialization and ensure compatibility across services.
- Build and manage robust streaming pipelines with Kafka Streams, KSQL, and other streaming frameworks.
- Monitor Kafka clusters, set up metrics, and optimize Kafka performance using tools like Grafana.
Quick stats about Luxoft Training Center
More than 200 training courses
Conducted over 1,500 training sessions
Customized training solutions for business
Luxoft Training Center
Luxoft Training Center is an essential part of the global technology leader Luxoft, a DXC Technology Company. We play a pivotal role in propelling B2B businesses forward by delivering customized training solutions. Emphasizing the significance of learning and employee development,...