Professional Course

HW HDP DS - Hortonworks HDP Analyst Data Science (SCI-221)

Sunset Learning Institute, in Reston (+6 locations)
Length
3 days

Course description

This course covers the processes and practice of data science, including machine learning and natural language processing. It uses the following tools and programming languages: Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn, the Natural Language Toolkit (NLTK), and Spark MLlib.

Who should attend?

Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop.

Training content

Day 1: An Introduction to Data Science, Python, Hadoop and Machine Learning

OBJECTIVES

  • Define Data Science and Explain What a Data Scientist Does
  • Differentiate Between Different Types of Data Roles
  • List a Number of Data Science Use Cases
  • Present an Overview of Python
  • Describe the Components of the Big Data Scientific Stack

LABS

  • Using IPython
  • Data Analysis with Python
  • Using HDFS Commands
  • Introduction to Spark REPLs and Zeppelin
  • Using Apache Mahout for Machine Learning

Day 2: Working with Spark RDDs, DataFrames and Spark SQL, Visualization in Zeppelin

OBJECTIVES

  • Explain What an RDD Is
  • Explain How RDDs are Partitioned
  • Create, Manipulate, and Restore RDDs
  • Use Spark SQL to Create Tables
  • Create an Application and Submit to the Cluster

LABS

  • Create and Manipulate RDDs
  • Create and Save DataFrames
  • Build and Submit Spark Applications

Day 3: Machine Learning Algorithms, Natural Language Processing, and Spark MLlib

OBJECTIVES

  • Describe Common Machine Learning Applications
  • List the Pros and Cons of Various Algorithms
  • Explain What Natural Language Processing Is
  • Explain the Feature Engineering Capabilities of Spark MLlib

LABS

  • Use the Python Natural Language Toolkit (NLTK)
  • Classify Text Using Naïve Bayes
  • Compute K-Nearest Neighbors
  • Creating a Spam Classifier with MLlib
  • Sentiment Analysis with Spark MLlib

Quick stats about SLI

Award-winning instructors

Over 50 training locations across North America

All dates are guaranteed to run

Sunset Learning Institute

Sunset Learning Institute (SLI) has been an innovative leader in developing and delivering authorized technical training since 1996. We develop and deliver scalable technical learning solutions to our technology partners and their customers to help optimize technology investments, improve job...
