Course: Learn Complete ElasticSearch with LogStash, Kibana, Hadoop, Hive, Pig & MapReduce

  • Lifetime Access
  • Certificate on Completion
  • Access on Android and iOS Apps
About this Course

In this course, you will learn how to work with ElasticSearch in the Hadoop ecosystem. You will also learn how to integrate Apache Hive, Apache Pig, LogStash and Kibana with ElasticSearch, and more.

This comprehensive course focuses on building realistic data pipelines that move data from one system to another, a common task for any data engineer. Once you learn how to move data from Hadoop to ElasticSearch, you will create real-time business intelligence dashboards using Kibana.

Section 1 – Ingestion Flows (to ElasticSearch)

In this section of the course, you will learn to move data from various Hadoop applications (such as Hive, Pig and MapReduce) and from LogStash, and load it into an index in an ElasticSearch cluster. This is an ideal use case for generating business analytics from your data. Here are the four major topics covered in this section:

  • Learn how to install Apache Hive on your computer and integrate it with ElasticSearch
  • Learn how to install Apache PIG on your computer and index data into ElasticSearch using Apache PIG
  • Learn how to load an index into ElasticSearch using Hadoop MapReduce (Java Program)
  • Learn how to make LogStash work with ElasticSearch to move data into an index

Section 2 – Egression Flows (from ElasticSearch)

In this section of the course, you will learn to take indexed data from an ElasticSearch cluster and load it back into a Hadoop cluster. You will learn how to import that data directly into Hive, Pig, MapReduce or LogStash. Here are the four major topics covered in this section:

  • Learn how to import an ElasticSearch index directly into an Apache Hive table
  • Learn how to import ElasticSearch indexed data into Hadoop using Apache PIG scripts
  • Learn how to import ElasticSearch indexed data into Hadoop using a Java MapReduce program
  • Learn how to import ElasticSearch indexed data using the LogStash application

Section 3 – Data Visualization

In this part of the course, you will learn how to use indexed data from an ElasticSearch cluster to create dynamic dashboards using Kibana.

This will be a very important lesson for Data Analysts and Data Scientists.

Section 4 – Production Cluster Monitoring Tool

No knowledge is complete without learning how to maintain an application in production. In this section of the course, you will learn how to monitor your ElasticSearch cluster using the Marvel plugin. Here are a few things that you will learn:

  • Cluster Health monitoring at Index, Shard, Node levels
  • Parsing ElasticSearch Cluster statistics using Linux utilities
  • Setting up a wait-for-trigger mechanism, and much more
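
As a rough illustration of the kind of health check listed above, here is a minimal Python sketch built around ElasticSearch's `_cluster/health` REST endpoint. It is not taken from the course material. The base URL, the function names `health_url` and `summarize`, and the sample response are all assumptions made for illustration; the `level`, `wait_for_status` and `timeout` query parameters are part of the cluster health API, and using `wait_for_status` as a "wait for trigger" is only one possible reading of that topic.

```python
import json
from urllib.parse import urlencode

# Assumption: a local cluster listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def health_url(level="cluster", wait_for_status=None, timeout=None):
    """Build a _cluster/health URL (hypothetical helper).

    level: "cluster", "indices" or "shards" controls response detail.
    wait_for_status: "green", "yellow" or "red"; the request blocks until
    the cluster reaches that status or the timeout expires.
    """
    params = {"level": level}
    if wait_for_status:
        params["wait_for_status"] = wait_for_status
    if timeout:
        params["timeout"] = timeout
    return f"{BASE_URL}/_cluster/health?{urlencode(params)}"


def summarize(health):
    """Reduce a parsed health response to the fields worth alerting on."""
    indices = health.get("indices", {})
    not_green = sorted(
        name for name, info in indices.items() if info.get("status") != "green"
    )
    return {
        "status": health["status"],
        "nodes": health["number_of_nodes"],
        "unassigned_shards": health["unassigned_shards"],
        "indices_not_green": not_green,
    }


# Hypothetical, hand-written response body (level=indices), trimmed to the
# fields used above. Real responses contain more fields.
SAMPLE_RESPONSE = json.loads("""
{
  "cluster_name": "demo",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "active_shards": 5,
  "unassigned_shards": 5,
  "indices": {
    "sales": {"status": "yellow"},
    "logs": {"status": "green"}
  }
}
""")

if __name__ == "__main__":
    print(health_url(level="indices", wait_for_status="yellow", timeout="30s"))
    print(summarize(SAMPLE_RESPONSE))
```

The sketch deliberately avoids any network call so it can be run without a cluster; in practice the URL would be fetched with an HTTP client and the JSON body passed to `summarize`.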

So, let's get started now.

Basic knowledge
  • This will be an excellent course for anyone who wants to learn about Big Data technologies and how to use them together in order to create amazing Big Data applications
What you will learn
  • Learn A-Z of Integrating ElasticSearch & Hadoop with Hands-on Examples of Building Data Pipelines using Hive, PIG, LogStash & MapReduce
Curriculum
Number of Lectures: 145
Total Duration: 05:57:03
Building Foundation
  • Session 1: Introduction  
  • Let's talk about Search
  • Course Resources  
  • What is a Search Engine  
  • Inside a Search Engine  
  • What is MetaData  
  • ElasticSearch in a Nutshell  
  • How ElasticSearch offers Scalability  
  • ElasticSearch provides High Availability
  • Multi-Tenancy Out of the Box  
  • Full Text Search inside ElasticSearch  
  • Real Time Analytics with ElasticSearch  
  • Session Summary  
  • Quiz - 1  
Setup a Working Environment
  • Session 2: Introduction  
  • Tools we will need  
  • [Hands-on] - Installing wget  
  • [Hands-on] - Let's start with installing Homebrew
  • [Hands-on] - Check Java Version installed  
  • [Hands-on] - Check and Enable SSH  
  • [Hands-on] - Getting Latest Apache Hadoop  
  • [Hands-on] - Configuring Apache Hadoop Part - 1  
  • [Hands-on] - Configuring Apache Hadoop Part - 2  
  • [Hands-on] - Let's Bring Hadoop Daemons to Life
  • [Hands-on] - Configuring Apache Hadoop Part - 3  
  • [Hands-on] - Getting ElasticSearch  
  • [Hands-on] - Configuring ElasticSearch  
  • [Hands-on] - Installing ElasticSearch Plugins: <Head Plugin>  
  • [Hands-on] - Installing ElasticSearch Plugins: <Marvel Plugin>  
  • [Hands-on] - Starting ElasticSearch Daemon and Checking out Plugins  
  • Session 2: Summary  
  • Quiz 2  
Building Blocks of ElasticSearch
  • Session 3: Introduction  
  • RDBMS vs ElasticSearch  
  • How a document (data record) looks inside ElasticSearch
  • Marvels of Inverted Index  
  • Shards - low level storage units in ElasticSearch  
  • An ElasticSearch Node  
  • ElasticSearch Cluster  
  • Monitoring ElasticSearch Cluster's Health  
  • How to Scale an ElasticSearch Cluster  
  • RestAPIs in ElasticSearch  
  • Session 3: Summary  
  • Quiz 3  
Operations in ElasticSearch
  • Session 4: Introduction  
  • A look at ShardID  
  • Types of Operation in ElasticSearch  
  • Inside Operations: <Write> & <Delete>  
  • [Hands-on] - Let's try a Write Operation
  • Inside Operations: <Read>  
  • [Hands-on] - Let's try a Read Operation Part-1
  • [Hands-on] - Let's try a Read Operation Part-2
  • Inside Operation: <Update>  
  • [Hands-on] - Let's try an Update Operation
  • [Hands-on] - Let's try a Delete Operation
  • Concept of Mapping in ElasticSearch  
  • [Hands-on] - How to add Mappings on Indexed data  
  • Data consistency using Templates  
  • [Hands-on] - Using Templates  
  • Session 4: Summary  
  • Quiz 4  
Queries in ElasticSearch
  • Session 5: Introduction  
  • Types of Search Queries  
  • [Hands-on] - Creating Dataset for Search  
  • [Hands-on] - Using QueryString Part-1: Select All  
  • [Hands-on] - Using QueryString Part-2: Filter Specific Fields  
  • Word of Caution with QueryStrings  
  • [Hands-on] - Using DSL Queries Part-1  
  • [Hands-on] - Using DSL Queries Part-2  
  • Session 5: Summary  
Data Pipelines
  • Basics about a Data Pipeline  
Data Pipeline#1 - Apache Hive to ElasticSearch
  • Setting Objectives  
  • [Hands-on] - Installing Apache Hive  
  • [Hands-on] - Configuring Apache Hive Part-1  
  • [Hands-on] - Configuring Apache Hive Part-2  
  • Getting ElasticSearch Connector JAR  
  • Where to get free datasets for exercises  
  • Understanding Our Dataset  
  • [Hands-on] - Creating a Data Flow from Hive to ElasticSearch Index Part-1  
  • [Hands-on] - Creating a Data Flow from Hive to ElasticSearch Part-2  
  • [Hands-on] - Looking at ingested data inside ElasticSearch Cluster  
  • Quiz 5  
Data Pipeline#2 - ElasticSearch to Apache Hive
  • Introduction  
  • Session 8: Setting Objectives  
  • [Hands-on] - Indexing Data inside ElasticSearch using Bulk API  
  • [Hands-on] - Creating Data Flow from ElasticSearch Index to Hive table  
  • Session 8: Summary  
  • Quiz 6  
Data Pipeline#3 - Apache PIG to ElasticSearch
  • Session 9: Introduction  
  • Session 9: Setting Objectives  
  • Basics about Apache PIG  
  • [Hands-on] - Installing Apache PIG  
  • [Hands-on] - Configuring Apache PIG  
  • Let's go up a level by introducing Apache Hive into the picture
  • [Hands-on] - Getting Dataset for the exercise  
  • [Hands-on] - Creating data flow from Apache PIG to ElasticSearch Part-1  
  • [Hands-on] - Creating data flow from Apache PIG to ElasticSearch Part-2  
  • Quiz 7  
Data Pipeline#4 - ElasticSearch to Apache PIG
  • Session 10: Introduction  
  • [Hands-on] - Getting the Dataset  
  • [Hands-on] - Creating a Data Flow from ElasticSearch to Apache PIG to HDFS  
  • Session 10: Summary  
Data Pipeline#5 - MapReduce to ElasticSearch
  • Session 11: Introduction  
  • Basics about MapReduce  
  • Pre-requisites  
  • Key Classes in MapReduce & ElasticSearch Flow  
  • [Hands-on] - Getting and Saving Dataset to HDFS  
  • [Hands-on] - Creating an ElasticSearch Index and Mapping  
  • [Hands-on] - Creating a Maven POM.xml file  
  • [Hands-on] - Creating a Mapper Class  
  • [Hands-on] - Creating a MapReduce Driver Class  
  • [Hands-on] - Building MapReduce Program using Maven  
  • [Hands-on] - Running MapReduce Application on Hadoop cluster  
  • Quiz 8  
Data Pipeline#6 - ElasticSearch to MapReduce
  • Session 12: Setting Objectives  
  • Getting Dataset for the exercise  
  • [Hands-on] - Creating Mapper Class  
  • [Hands-on] - Creating Driver Class  
  • [Hands-on] - Building ES2MapReduce Program using Maven  
  • [Hands-on] - Upload DSL query file to HDFS  
  • [Hands-on] - Running ES2MR Application on Hadoop cluster  
  • Session 12: Summary  
Data Pipeline#7 - LogStash to ElasticSearch
  • Session 13: Objectives
  • Basics about LogStash  
  • [Hands-on] - Installing LogStash  
  • [Hands-on] - Configuring LogStash  
  • [Hands-on] - Creating a simple data pipeline: STDIN to LogStash to ElasticSearch  
  • [Hands-on] - Creating data pipeline using a File  
  • LogStashing same file multiple times  
  • Quiz 9  
Data Pipeline#8 - ElasticSearch to LogStash
  • [Hands-on] - Creating a data pipeline from ElasticSearch Index to LogStash  
  • Session 14: Summary  
Data Visualization
  • Session 15: Setting Objectives  
  • Getting the Dataset  
  • Basics about Data Visualization  
  • [Hands-on] - Installing Kibana  
  • [Hands-on] - Configuring Kibana  
  • [Hands-on] - Starting Kibana  
  • [Hands-on] - Creating Visualization Part-1: Histogram Chart  
  • [Hands-on] - Creating Visualization Part-2: Donut Chart  
  • [Hands-on] - Creating Dashboard using individual Visualizations  
  • Session 15: Summary  
  • Quiz 10  
Monitoring ElasticSearch Cluster
  • Session 16: Introduction  
  • [Hands-on] - Marvel Dashboard  
  • [Hands-on] - How to extend Marvel's Free Trial  
  • [Hands-on] - Monitoring Overall Cluster Health  
  • [Hands-on] - Monitoring Cluster Health at Index Level details  
  • [Hands-on] - Monitoring Cluster Health at Shard Level details  
  • Wait for Trigger utility  
  • [Hands-on] - Using Head Plugin  
  • [Hands-on] - Monitoring Pending tasks in ElasticSearch Cluster  
  • [Hands-on] - Using CAT API  
  • Session 16: Summary  
  • Quiz 11  
Where to go from here?
  • Concluding Thoughts  