Big data for DevOps



Course overview

In this course, we will build a system that analyzes log files and visualizes the results in real time. We will use:

  • Kafka - to transmit new messages;
  • Spark Streaming - to analyze the data received from the Kafka broker in real time and write the results to Cassandra.

What you'll learn

You will learn to install and configure a Kafka broker, Apache Spark, and Cassandra. You will get acquainted with implementing a Kafka producer and a Kafka consumer. We will walk through a Spark Streaming implementation that receives data from the Kafka broker and writes the processed data to Cassandra. Finally, we will look at an example of visualizing the data stored in Cassandra with Apache Zeppelin.
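To give a feel for how data moves through such a pipeline, here is a minimal sketch in plain Python. It is not the course code and uses none of the real Kafka, Spark, or Cassandra APIs: an in-memory queue stands in for the Kafka topic, a function that drains a few messages stands in for one Spark Streaming micro-batch, and a dictionary stands in for a Cassandra table. The log format, the sample lines, and all function names are assumptions made up for illustration.

```python
from collections import Counter, deque

# Hypothetical log format (an assumption): "<timestamp> <LEVEL> <message>"
SAMPLE_LOGS = [
    "2024-01-01T00:00:01 INFO service started",
    "2024-01-01T00:00:02 ERROR connection refused",
    "2024-01-01T00:00:03 WARN slow response",
    "2024-01-01T00:00:04 ERROR connection refused",
    "malformed line",
]

KNOWN_LEVELS = {"DEBUG", "INFO", "WARN", "ERROR"}


def produce(lines, topic):
    """Stand-in for a Kafka producer: append each log line to the queue."""
    for line in lines:
        topic.append(line)


def parse_level(line):
    """Return the log level of a line, or None if the line is malformed."""
    parts = line.split(" ", 2)
    if len(parts) < 3 or parts[1] not in KNOWN_LEVELS:
        return None
    return parts[1]


def process_batch(topic, batch_size):
    """Stand-in for one streaming micro-batch: drain up to batch_size
    messages from the queue and count them by log level."""
    counts = Counter()
    for _ in range(min(batch_size, len(topic))):
        level = parse_level(topic.popleft())
        if level is not None:
            counts[level] += 1
    return counts


def save(table, batch_id, counts):
    """Stand-in for a Cassandra write: one row per (batch_id, level) key."""
    for level, n in counts.items():
        table[(batch_id, level)] = n


def run_pipeline(lines, batch_size=2):
    """Push all lines through producer -> micro-batches -> table."""
    topic = deque()
    table = {}
    produce(lines, topic)
    batch_id = 0
    while topic:
        save(table, batch_id, process_batch(topic, batch_size))
        batch_id += 1
    return table


if __name__ == "__main__":
    for key, n in sorted(run_pipeline(SAMPLE_LOGS).items()):
        print(key, n)
```

In the real system, each stand-in is replaced by the corresponding component: the queue by a Kafka topic, the batch function by a Spark Streaming job, and the dictionary by a Cassandra table that Zeppelin can query.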

Syllabus

WEEK 1. Installation and configuration of Apache Spark

WEEK 2. Installation and configuration of Cassandra

WEEK 3. Installation and configuration of Kafka and ZooKeeper

WEEK 4. Spark Streaming with Python, Kafka, and Cassandra

WEEK 5. Bonus - NiFi

WEEK 6. Bonus - Ambari
