Professional Course

Databricks fundamentals

Length
28 hours
Price
700 EUR + tax
Next course start
Start Anytime!
Delivery
Self-paced Online

Course description

Databricks is an increasingly popular platform for big data processing and analysis. Our Databricks Fundamentals course is a great way to start if you want to improve your skills in this area. You will acquire practical experience with important Databricks tools and ideas over the course of several modules, including writing queries in Scala, Python, and SQL, using Delta Lake / Parquet, and working with Notebooks.

One of the primary goals of the course is to make you more comfortable with Databricks Notebooks, the web-based interface for data analysis and collaboration in Databricks. With guidance from our trainer, you’ll learn how to efficiently build, manage, and share notebooks, so you can tackle complex data challenges.

Another important topic we will cover is Spark, the open-source engine that powers Databricks’ data processing capabilities. You will gain a deep understanding of Spark’s internal architecture, including the RDD (Resilient Distributed Dataset), which according to databricks.com “is an immutable distributed collection of elements of your data, partitioned across nodes in your cluster that can be operated in parallel with a low-level API that offers transformations and actions.”
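The RDD idea quoted above (immutable, partitioned, lazy transformations followed by actions) can be illustrated with a small toy model. This is a sketch in plain Python, not the Spark API: the class `MiniRDD` and its methods are hypothetical names invented for illustration, and everything runs on a single machine with no cluster.

```python
from functools import reduce


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark): immutable,
    partitioned, and lazily transformed."""

    def __init__(self, partitions, ops=()):
        # Partitions are stored as tuples so the data cannot be changed in place.
        self._partitions = tuple(tuple(p) for p in partitions)
        # Pending transformations; nothing is executed until an action runs.
        self._ops = tuple(ops)

    @classmethod
    def parallelize(cls, data, num_partitions):
        data = list(data)
        # Simple round-robin split, chosen only to keep the sketch short.
        return cls([data[i::num_partitions] for i in range(num_partitions)])

    # Transformations: return a new MiniRDD and record the step without running it.
    def map(self, fn):
        return MiniRDD(self._partitions, self._ops + (("map", fn),))

    def filter(self, pred):
        return MiniRDD(self._partitions, self._ops + (("filter", pred),))

    def _compute_partition(self, part):
        items = list(part)
        for kind, fn in self._ops:
            if kind == "map":
                items = [fn(x) for x in items]
            else:
                items = [x for x in items if fn(x)]
        return items

    # Actions: trigger execution of all recorded transformations.
    def collect(self):
        out = []
        for part in self._partitions:
            out.extend(self._compute_partition(part))
        return out

    def reduce(self, fn):
        return reduce(fn, self.collect())


numbers = MiniRDD.parallelize(range(1, 11), num_partitions=3)
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
total = even_squares.reduce(lambda a, b: a + b)  # sum of the even squares of 1..10
```

Note that `map` and `filter` only record work, while `collect` and `reduce` run it, and that `numbers` itself is never modified. In real Spark the partitions would live on different cluster nodes and be processed in parallel.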

To make the right decisions on a project and avoid architectural errors, you’ll discover the differences between Delta Lake and Parquet, two storage formats that Databricks uses for data. Understanding the particularities of these formats will help you select the better fit for your project. We will also cover a key topic for any big data environment: query writing. You'll learn how to write queries in Scala, Python, and SQL, giving you the flexibility to work with different languages and tools as needed.
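One way to picture the difference: Delta Lake keeps table data in Parquet files and adds an ordered transaction log that records which files belong to each table version. The sketch below is a toy model in plain Python under that assumption. The class `MiniDeltaLog`, its methods, and the file names are hypothetical and are not the Delta Lake API; no real Parquet files are read or written.

```python
import json


class MiniDeltaLog:
    """Toy model (hypothetical, not Delta Lake): a table is a set of data
    files plus an ordered log of commits that add or remove files."""

    def __init__(self):
        # Commit i holds the JSON-encoded actions that produce table version i.
        self._commits = []

    def commit(self, add=(), remove=()):
        actions = [{"add": f} for f in add] + [{"remove": f} for f in remove]
        self._commits.append(json.dumps(actions))
        return len(self._commits) - 1  # version number of this commit

    def snapshot(self, version=None):
        """Replay commits up to `version` and return the live data files."""
        if version is None:
            version = len(self._commits) - 1
        live = set()
        for raw in self._commits[: version + 1]:
            for action in json.loads(raw):
                if "add" in action:
                    live.add(action["add"])
                else:
                    live.discard(action["remove"])
        return sorted(live)


log = MiniDeltaLog()
v0 = log.commit(add=["part-0000.parquet", "part-0001.parquet"])
# An update rewrites one file: the old file is removed and a new one added.
v1 = log.commit(add=["part-0002.parquet"], remove=["part-0000.parquet"])

latest = log.snapshot()      # files in the current version
as_of_v0 = log.snapshot(v0)  # "time travel" to the first version
```

A plain directory of Parquet files has no such log, so it cannot tell a reader which files made up the table at an earlier point in time.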

You will learn how to optimize your Databricks workflows for performance and how to use visualization tools to gain insights that support better project decisions. Overall, the Databricks Fundamentals course is a detailed, practical introduction to this big data tool. With guidance from our trainer, who is an experienced Data Engineer, you’ll develop the abilities and confidence to handle complex data tasks.

Upcoming start dates

1 start date available

Start Anytime!

  • Self-paced Online
  • Online
  • English

Who should attend?

Prerequisites

3 months of development experience in Scala, Java, Python, and SQL.

Training content

Introduction to Databricks – Theory 60% / Practice 40% - 4h

Creating Databricks Service

Databricks UI Overview

Databricks Architecture Overview

Databricks Notebooks

Databricks Cluster and Jobs - Theory 60% / Practice 40% - 4h

Cluster types and configuration

Databricks cluster pool

Databricks Job

Notebooks’ workflows

DBFS - Theory 60% / Practice 40% - 4h

Databricks and Spark - Theory 60% / Practice 40% - 4h

Data Formats

Transformation

Joins, Aggregation

SQL

Delta Lake - Theory 60% / Practice 40% - 4h

Pitfalls of Data Lakes

Data Lakehouse Architecture

Read & Write to Delta Lake

Updates and Deletes on Delta Lake

Merge/Upsert to Delta Lake

History, Time Travel, Vacuum

Delta Lake Transaction Log

Convert from Parquet to Delta

Data Ingestion

Data Transformation - PySpark and Notebooks

Visualizations in Databricks - Theory 60% / Practice 40% - 2h

Collaboration in Databricks - Theory 60% / Practice 40% - 2h

Deploying Databricks on Azure - Theory 60% / Practice 40% - 2h

Deploying Databricks on the AWS Marketplace - Theory 60% / Practice 40% - 2h

Data Protection Use cases - 4h

    Objectives

    • Practice working with Notebooks
    • Understand Spark's internal structures
    • Ascertain the differences between Delta Lake and Parquet
    • Write queries in Scala, Python, and SQL
    • Learn about optimization in Databricks
    • Explore data in depth with Databricks

    Quick stats about Luxoft Training Center

    More than 200 training courses

    Conducted over 1,500 training sessions

    Customized training solutions for business

    Luxoft Training Center
    Warsaw Spire, plac Europejski 1
    00-844 Warsaw

    Luxoft Training Center

    Luxoft Training Center — an essential part of the global technology leader, Luxoft, a DXC Technology Company. We play a pivotal role in propelling B2B businesses forward by delivering customized training solutions. Emphasizing the significance of learning and employee development,...
