Course description
Modern Data Management Approaches in Real World Case is a dynamic course designed to equip you with the knowledge and skills necessary to navigate the complexities of modern data management. With a strong focus on real-world applications, this course covers everything from the architecture required to manage massive datasets to the latest technologies used in data storage, processing, and analysis.
The course begins with an in-depth look at a real-world architecture designed to manage the statistics of over 24 million gaming cards. You’ll gain practical insights into the challenges and solutions involved in handling such vast amounts of data, including estimation techniques and best practices for data collection and processing.
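As a rough illustration of the kind of estimate this opening case study involves, the sketch below sizes an event stream for 24 million cards. Only the card count comes from the course description; the events per card, event size, and peak factor are hypothetical placeholders chosen for the arithmetic, not figures from the course.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class LoadAssumptions:
    cards: int                      # from the course description: about 24 million
    events_per_card_per_day: float  # ASSUMED, hypothetical value
    bytes_per_event: int            # ASSUMED, hypothetical value
    peak_factor: float              # ASSUMED ratio of peak load to average load


def estimate(load: LoadAssumptions) -> dict:
    """Back-of-envelope throughput and raw storage estimate."""
    events_per_day = load.cards * load.events_per_card_per_day
    avg_events_per_sec = events_per_day / SECONDS_PER_DAY
    return {
        "events_per_day": events_per_day,
        "avg_events_per_sec": avg_events_per_sec,
        "peak_events_per_sec": avg_events_per_sec * load.peak_factor,
        "raw_bytes_per_day": events_per_day * load.bytes_per_event,
    }


result = estimate(
    LoadAssumptions(
        cards=24_000_000,
        events_per_card_per_day=10,  # hypothetical
        bytes_per_event=200,         # hypothetical
        peak_factor=3.0,             # hypothetical
    )
)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

Changing any of the assumed inputs scales the results linearly, which is why such estimates are usually stated together with their assumptions.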
Next, you'll explore the evolution of data storage approaches, from traditional databases to modern massively parallel architectures and hyperconvergence. You'll see how these technologies have transformed data management and what the future holds.
The course then delves into specific data management models:
- Relational Model: Discover how replication, sharding, and distributed transactions can solve key data management challenges.
- Document-Oriented Model (MongoDB): Get hands-on experience with MongoDB, exploring its capabilities and understanding its role in managing unstructured and semi-structured data.
- Key-Value Model: Learn about the efficiency of non-relational databases like Cassandra and HBase and understand the necessary conditions for their effective use.
- Distributed File Systems: Dive into cluster architectures like HDFS and explore how SQL can be applied over distributed file systems using Hive, Spark, Spark SQL, and more.
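To make the sharding idea from the relational model item concrete, here is a minimal sketch of hash-based shard routing. It is an illustration only, not the course's method: the function name `shard_for`, the `card-N` key format, and the shard count are all hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() for
    strings is randomized per process, which would break routing.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical usage: route 10,000 made-up card IDs across 4 shards.
NUM_SHARDS = 4
distribution = Counter(shard_for(f"card-{i}", NUM_SHARDS) for i in range(10_000))
print(dict(distribution))
```

One known trade-off of plain modulo routing is that changing the number of shards remaps most keys; consistent hashing is a common alternative when shards are added or removed often.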
You'll also gain expertise in:
- Message Queues and Streaming Platforms: Understand and practice data stream processing with Spark Streaming.
- Distributed In-Memory Data Storage Systems: Learn about in-memory storage solutions such as Hazelcast, Ignite, and Tarantool.
- Distributed OLAP Systems: Explore distributed OLAP systems with Druid, understanding their architecture and application.
By the end of this course, you will:
- Develop and implement data management strategies for large-scale data, including real-world cases like the management of 24 million gaming cards.
- Understand the evolution of data storage technologies and their application in modern enterprises.
- Gain practical experience with key data management technologies such as MongoDB, Cassandra, Spark Streaming, and HDFS.
- Apply advanced techniques in data stream processing, distributed transactions, and in-memory data storage.
This course balances theory and practice to ensure a comprehensive understanding of modern data management approaches. You'll engage in 16 hours of learning, with additional hands-on exercises that reinforce the theoretical knowledge, preparing you to tackle real-world data challenges.
Prerequisites
- At least 1 year of experience in software development, BI, data analysis, DevOps, or a related field
- Desirable: 1+ years of experience in Big Data, BI, or data analysis
Training content
- Real-world architecture for collecting statistics on more than 24M gaming cards. Estimates. [theory: 1 hour; practice: 1 hour]
- The evolution of approaches to data storage: databases, data warehouses, database machines, massively parallel architectures, hyperconvergence [theory: 0.5 hour]
- Relational model: which problems replication, sharding, and distributed transactions can solve, and at what cost [theory: 0.5 hour]
- Document-oriented model. [MongoDB] [theory: 2.5 hours; practice: 1.5 hours]
- Message queues and streaming platforms. Data stream processing. [Spark Streaming] [theory: 2 hours; practice: 2 hours]
- “Key-value” minimal model: key structure options, value structure options, programming interfaces. Efficiency of non-relational databases: necessary and sufficient conditions. [Cassandra, HBase] [theory: 2 hours; practice: 2 hours]
- Distributed file systems: cluster architecture [HDFS]. SQL over distributed file systems: possible architectures, limitations, transactions. [Hive, Spark, Spark SQL, Parquet, ORC] [theory: 2 hours; practice: 2 hours]
- Distributed in-memory data storage systems. [Hazelcast, Ignite, Tarantool] [theory: 0.5 hour]
- Distributed OLAP systems. [Druid] [theory: 0.5 hour]
16 hours (+ 1 bonus hour). Theory: 8.5 h (55%); practice: 7.5 h (45%)
Objectives
Upon completion of the "Modern Data Management Approaches in Real World Case" course, trainees will be able to:
- Architect and implement data management solutions for large-scale datasets using modern technologies.
- Analyze and apply different data storage models, including relational, document-oriented, and key-value stores.
- Utilize distributed file systems and streaming platforms to process and analyze data in real time.
- Optimize data management strategies for scalability and performance using in-memory storage and distributed OLAP systems.
Luxoft Training Center
Luxoft Training Center is an essential part of the global technology leader Luxoft, a DXC Technology Company. We play a pivotal role in propelling B2B businesses forward by delivering customized training solutions. Emphasizing the significance of learning and employee development,...