Course description
Data Engineering Basics for Everyone
Welcome to Data Engineering Basics. This course is designed to familiarize you with data engineering concepts, ecosystem, lifecycle, processes, and tools.
The Data Engineering Ecosystem comprises data, data repositories, data integration platforms, data pipelines, different types of languages, and BI and reporting tools. Data pipelines gather raw data from disparate data sources. Data repositories, such as relational and non-relational databases, data warehouses, data marts, data lakes, and big data stores, store and process this data. Data integration platforms combine data into a unified view for secure and easy access by data consumers. Data consumers apply BI, reporting, and analytical tools to this data to glean insights for better decision-making. You will learn about each of these components in this course.
A typical Data Engineering lifecycle includes architecting data platforms and designing data stores. It also includes the process of gathering, importing, wrangling, cleaning, querying, and analyzing data. Systems and workflows need to be monitored and fine-tuned so that they perform at optimal levels. In this course, you will learn about the architecture of data platforms and the factors to consider when designing and selecting the right data store for your needs. You will also learn about the processes and tools a data engineer uses to gather, import, wrangle, clean, query, and analyze data.
Who should attend?
Prerequisites
- Basic computer skills.
- General familiarity with how computing works.
- Grounding in IT systems.
- Comfortable working in Linux, Unix, Windows, or macOS.
Training content
- Module 1: What is Data Engineering
- Module 2: Data Engineering Ecosystem
- Module 3: Data Engineering Lifecycle
- Module 4: Career Opportunities and Learning Paths
Course delivery details
This course is offered through IBM, a partner institute of edX.
9–10 hours per week
Costs
- Verified Track - $99
- Audit Track - Free
Certification / Credits
What you'll learn
The objective of this course is to give you a solid understanding of what Data Engineering is.
In this course you will learn about:
Module 1: What is Data Engineering
- Modern Data Ecosystem
- Key Players in the Data Ecosystem
- What is Data Engineering?
- Responsibilities and Skillsets of a Data Engineer
- A day in the life of a Data Engineer
Module 2: Data Engineering Ecosystem
- Overview of the Data Engineering Ecosystem
- Types of Data
- Understanding different types of File Formats
- Sources of Data
- Languages for Data Professionals
- Overview of Data Repositories
- RDBMS
- NoSQL
- Data Warehouses, Data Marts, and Data Lakes
- ETL, ELT, and Data Pipelines
- Data Integration Platforms
- Foundations of Big Data
- Big Data processing tools: Hadoop, HDFS, Hive, and Spark
Module 3: Data Engineering Lifecycle
- Architecting the Data Platform
- Factors for Selecting and Designing Data Stores
- Security
- How to Gather and Import Data
- Data Wrangling
- Tools for Data Wrangling
- Querying and Analyzing Data
- Performance Tuning and Troubleshooting
- Governance and Compliance
Module 4: Career Opportunities and Learning Paths
- Career Opportunities in Data Engineering
- Data Engineering Learning Path
Contact this provider
edX
edX For Business helps leading companies upskill their labor forces by making the world’s greatest educational resources available to learners across a wide variety of in-demand fields. edX For Business delivers high-quality corporate eLearning to train and engage your employees...