Intro to big data with Apache Spark


Course overview

  • Want to be a Data Scientist?
  • Need data analysis to increase your profit?
  • Want to know how to deal with Big Data?
  • Need to apply Machine Learning algorithms but don’t know the right tools?

In this course, we cover the main points of parallel, distributed, and scalable machine learning. After successfully completing all the lectures and projects, you will be able to process large data sets; clean, transform, and analyze structured and unstructured data; and build and evaluate predictive models.

Syllabus

WEEK 1. Introduction to Apache Spark

  • Lab 0. Virtual Machine installation

WEEK 2. Working with RDDs and key/value pair RDDs

  • Lab 1. Basic operations with RDDs

WEEK 3. Spark SQL and Spark Streaming

  • Lab 2. Log files analysis with Apache Spark

WEEK 4. Machine Learning with MLlib

WEEK 5. Advanced Machine Learning

  • Lab 3. Predictive modeling with MLlib

WEEK 6. GraphX in Apache Spark

  • Lab 4. Introduction to Recommendation systems

WEEK 7. Big data: Use cases

WEEK 8. Final Project 

Do you want to join Data Science School?