Professional Course

Big Data Hadoop and Spark Developer - eLearning

Length
25 hours
Price
450 USD
Next course start
Start when you want, at your own pace!
Delivery
Self-paced Online

Course description

Learn how to analyse large volumes of data

The world is becoming increasingly digital, and the importance of big data and data analytics will continue to grow in the coming years. A career in big data and analytics might be just what you have been looking for to meet your career expectations.

The Big Data Hadoop training course teaches you the concepts of the Hadoop framework and its formation in a cluster environment, and prepares you for Cloudera's CCA175 Big Data certification. The certification exam itself is not included with this course.

ABOUT THE COURSE

With this Big Data Hadoop course, you will learn the big data framework using Hadoop and Spark, including HDFS, YARN, and MapReduce. The course will also cover Pig, Hive, and Impala to process and analyse large datasets stored in the HDFS and use Sqoop and Flume for data ingestion.

You will be shown real-time data processing using Spark, including functional programming in Spark, implementing Spark applications, understanding parallel processing in Spark, and using Spark RDD optimisation techniques. You will also learn the various interactive algorithms in Spark and use Spark SQL for creating, transforming, and querying data frames.

Finally, you will be required to execute real-life, industry-based projects using CloudLab in the domains of banking, telecommunication, social media, insurance, and e-commerce. 

DURATION

Approximately 25 hours.

PRE-REQUISITES

There are no prerequisites for this course. However, it is beneficial to have some knowledge of Core Java and SQL. We offer a complimentary self-paced online course, "Java essentials for Hadoop", if you need to brush up on your Core Java skills.

LEARNING OBJECTIVES

By the end of the course you will be able to understand:

  • The different components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
  • Hadoop Distributed File System (HDFS) and YARN architecture
  • MapReduce, its characteristics, and advanced MapReduce concepts
  • Different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
  • Flume, including its architecture, sources, sinks, channels, and configurations
  • HBase, its architecture and data storage, and the difference between HBase and an RDBMS
  • Resilient distributed datasets (RDDs) in detail
  • The common use cases of Spark and various interactive algorithms

You will also be able to:

  • Ingest data using Sqoop and Flume
  • Create databases and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
  • Gain a working knowledge of Pig and its components
  • Do functional programming in Spark, and implement and build Spark applications
  • Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimisation techniques
  • Create, transform and query data frames with Spark SQL

WHAT IS INCLUDED?

  • 12 months' online access to the Big Data Hadoop and Spark Developer e-learning
  • Approximately 25 hours duration
  • Five hands-on, real-life industry projects to perfect the skills learnt
  • Two simulation test papers for self-assessment
  • 16 lessons
  • Free course included - Apache Kafka
  • Free course included - Core Java

WHAT'S COVERED?

The course covers the following topics:

Course introduction

  • Lesson 1 - Introduction to big data and Hadoop ecosystem
  • Lesson 2 - HDFS and YARN
  • Lesson 3 - MapReduce and Sqoop
  • Lesson 4 - Basics of Hive and Impala
  • Lesson 5 - Working with Hive and Impala
  • Lesson 6 - Types of data formats
  • Lesson 7 - Advanced Hive concepts and data file partitioning
  • Lesson 8 - Apache Flume and HBase
  • Lesson 9 - Pig
  • Lesson 10 - Basics of Apache Spark
  • Lesson 11 - RDDs in Spark
  • Lesson 12 - Implementation of Spark applications
  • Lesson 13 - Spark parallel processing
  • Lesson 14 - Spark RDD optimisation techniques
  • Lesson 15 - Spark algorithms
  • Lesson 16 - Spark SQL

FREE COURSE - Apache Kafka

FREE COURSE - Core Java

The training course also includes five real-life, industry-based projects. Successful evaluation of one of the first two projects below is a part of the certification eligibility criteria. We have also included three additional projects for practice, to help you start your Hadoop and Spark journey.

Project 1

Domain: Banking – a Portuguese banking institution ran a marketing campaign to convince potential customers to invest in a bank term deposit. Their marketing campaigns were conducted through phone calls and some customers were contacted more than once. Your job is to analyse the data collected from the marketing campaign.

Project 2

Domain: Telecommunication – a mobile phone service provider has launched a new Open Network campaign. The company has invited users to raise complaints about the towers in their locality if they face issues with their mobile network, and it has collected a dataset of the users who raised a complaint. The fourth and fifth fields of the dataset contain the latitude and longitude of each user, which is important information for the company. You must extract this latitude and longitude information from the available dataset and create three clusters of users with a k-means algorithm.
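
As a rough illustration of the clustering step in Project 2, the sketch below runs a plain k-means (Lloyd's algorithm) in standard-library Python. It is not the course's solution, which would normally use Spark. The sample rows, their field layout, and the function names `assign` and `kmeans` are assumptions made up for this example; only "fourth and fifth fields hold latitude and longitude" and "three clusters" come from the project description.

```python
from math import dist
from statistics import fmean

# Hypothetical complaint rows: user, date, issue, latitude, longitude.
rows = [
    "u1,2016-01-04,call drop,28.61,77.21",
    "u2,2016-01-04,no signal,28.70,77.10",
    "u3,2016-01-05,call drop,19.07,72.87",
    "u4,2016-01-05,slow data,19.20,72.95",
    "u5,2016-01-06,no signal,12.97,77.59",
    "u6,2016-01-06,call drop,13.05,77.50",
]

# The fourth and fifth fields (indexes 3 and 4) hold latitude and longitude.
points = [tuple(float(v) for v in row.split(",")[3:5]) for row in rows]


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k=3, max_iter=100):
    """Plain k-means using straight-line distance on (lat, lon) pairs.

    Straight-line distance on raw coordinates is only an approximation
    of real distance on the ground; it keeps the sketch short.
    """
    # Deterministic start (an assumption): pick k points spread through the input.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    (fmean(m[0] for m in members), fmean(m[1] for m in members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # centroids stopped moving
        centroids = updated
    return centroids, assign(points, centroids)


centroids, labels = kmeans(points, k=3)
for row, label in zip(rows, labels):
    print(label, row)
```

Each printed row is prefixed with the index of the cluster it was assigned to. On a real dataset the starting centroids are usually chosen at random or with a scheme such as k-means++, and the result can depend on that choice.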

Project 3

Domain: Social media – as part of a recruiting exercise, a major social media company asked candidates to analyse a dataset from Stack Exchange. You will be using the dataset to arrive at certain key insights.

Project 4

Domain: Website providing movie-related information – IMDb is an online database of movie-related information. In this dataset, users rate movies on a scale of 1 to 5, with 1 being the worst and 5 being the best, and provide reviews. The dataset also has additional information, such as the release year of each movie. Your task is to analyse the data collected.
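
One possible analysis for Project 4, sketched in standard-library Python rather than on a Hadoop cluster, is the average rating per release year, written in the map, shuffle, and reduce style that the MapReduce lessons cover. The records and the function names `map_phase`, `shuffle_phase`, and `reduce_phase` are hypothetical; the project description does not say which analyses are required.

```python
from collections import defaultdict

# Hypothetical records: (movie title, release year, rating on the 1 to 5 scale).
ratings = [
    ("Movie A", 1995, 4),
    ("Movie A", 1995, 5),
    ("Movie B", 1995, 3),
    ("Movie C", 2001, 2),
    ("Movie C", 2001, 4),
]


def map_phase(records):
    """Emit one (release year, rating) pair per record."""
    for _title, year, rating in records:
        yield year, rating


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each group of ratings into its mean."""
    return {year: sum(values) / len(values) for year, values in groups.items()}


average_by_year = reduce_phase(shuffle_phase(map_phase(ratings)))
print(average_by_year)
```

On a cluster, the map and reduce functions run in parallel on different nodes and the framework performs the grouping; here all three phases run in one process purely to show the data flow.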

Project 5

Domain: Insurance – a US-based insurance provider has decided to launch a new medical insurance program targeting various customers. To help a customer understand the market better, you must perform a series of data analyses using Hadoop.

TARGET AUDIENCE

Big data career opportunities are on the rise and Hadoop is quickly becoming a must-know technology in big data architecture. Big Data training is suitable for IT, data management, and analytics professionals, including:

  • Software developers and architects
  • Analytics professionals
  • Senior IT professionals
  • Testing and mainframe professionals
  • Data management professionals
  • Business intelligence professionals
  • Project managers
  • Aspiring data scientists
  • Graduates looking to build a career in big data analytics

EXAM INFORMATION

To obtain a course completion certificate, you must complete 85% of the course, complete one project, and pass one simulation test with a minimum score of 80%.

The formal CCA175 Spark and Hadoop certification exam is not available as part of this package, but the course does help prepare you for this exam.

Upcoming start dates

1 start date available

Start when you want, at your own pace!

  • Self-paced Online
  • Online
  • English

Adding Value Consulting AB
1743 S Sidewinder Dr
84060 Park City UT

Adding Value Consulting (AVC)

Reimagining Education: The Story Behind AVC The traditional education model has been around for centuries, but as I worked within it, I realized something was missing: flexibility, innovation, and accessibility. Students and professionals alike were struggling to balance education with...
