Course description
HW HDP DS - Hortonworks HDP Analyst Data Science (SCI-221)
This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. It covers tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), the Natural Language Toolkit (NLTK), and Spark MLlib.
Who should attend?
Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop.
Training content
Day 1: An Introduction to Data Science, Python, Hadoop and Machine Learning
OBJECTIVES
- Define Data Science and Explain What a Data Scientist Does
- Differentiate Between Different Types of Data Roles
- List a Number of Data Science Use Cases
- Present an Overview of Python
- Describe the Components of the Big Data Scientific Stack
LABS
- Using IPython
- Data Analysis with Python
- Using HDFS Commands
- Introduction to Spark REPLs and Zeppelin
- Using Apache Mahout for Machine Learning
Day 2: Working with Spark RDDs, Data Frames and SparkSQL, Visualization in Zeppelin
OBJECTIVES
- Explain What an RDD Is
- Explain How RDDs are Partitioned
- Create, Manipulate and Restore RDDs
- Use Spark SQL to Create Tables
- Create an Application and Submit to the Cluster
LABS
- Create and Manipulate RDDs
- Create and Save Data Frames
- Build and Submit Spark Applications
Day 3: Machine Learning Algorithms, Natural Language Processing, and Spark MLlib
OBJECTIVES
- Describe Common Machine Learning Applications
- List the Pros and Cons of Various Algorithms
- Explain What Natural Language Processing Is
- Explain the Feature Engineering Capabilities of Spark MLlib
LABS
- Use the Python Natural Language Toolkit (NLTK)
- Classify Text Using Naïve Bayes
- Compute K-Nearest Neighbors
- Creating a Spam Classifier with MLlib
- Sentiment Analysis with Spark MLlib
Quick stats about SLI
Award-winning instructors
Over 50 training locations across North America
All dates are guaranteed-to-run
Contact this provider
Sunset Learning Institute (SLI) has been an innovative leader in developing and delivering authorized technical training since 1996. We develop and deliver scalable technical learning solutions to our technology partners and their customers to help optimize technology investments, improve job...