Course: Apache Spark With Examples for Big Data Analytics

  • Lifetime Access
  • Certificate on Completion
  • Access on Android and iOS Apps
  • Self-Paced
About this Course

This course covers all the fundamentals you need to write complex Spark applications. By the end of this course you will have in-depth knowledge of Spark Core, Spark SQL, and Spark Streaming.

This course is divided into 9 modules:

  • Dive Into Scala - Understand the basics of Scala that are required for programming Spark applications. Learn about the basic constructs of Scala such as variable types, control structures, collections, and more.
  • OOP and Functional Programming in Scala - Learn about object-oriented programming and functional programming techniques in Scala.
  • Introduction to Apache Spark - Learn Spark architecture, Spark components, and Spark use cases.
  • Spark Basics - Learn how to configure and run Spark in Eclipse or IntelliJ.
  • Working with RDDs in Spark - Learn what a Resilient Distributed Dataset (RDD) is and the different types of actions and transformations that can be applied to RDDs.
  • Aggregating Data with Pair RDDs - Learn how a Pair RDD differs from an RDD and the different types of actions and transformations that can be applied to Pair RDDs.
  • Advanced Spark Concepts - Learn how Spark uses broadcast variables and accumulators to perform calculations, and how persistence and partitioning help improve performance.
  • Spark SQL and DataFrames - Understand the difference between DataFrames and Datasets.
  • Spark Streaming - Learn how to analyse massive amounts of data on the fly.

All the concepts are explained using hands-on examples. This course covers 10+ hands-on big data examples, such as:

  • Explore player data from the 2014 World Cup
  • Aggregate data from eBay online auctions
  • Understand different data points from Aadhaar data
  • Develop an application to analyse funds received by Indian startups
  • Explore price trends by looking at real estate data from California
  • Help a retailer identify valid and invalid purchase transactions across a chain of stores in Bangalore
  • Write a Spark program to count the stores in each US region from US states and store locations data
  • Develop a Spark Streaming application to perform Twitter sentiment analysis
Basic knowledge
  • Basic programming skills
  • A computer running Windows, macOS or Linux
  • The software needed for this course is freely available, and detailed steps to install and configure it are included in the course
What you will learn
  • Get a clear understanding of the limitations of MapReduce and the role of Spark in overcoming them
  • Understand the fundamentals of the Scala programming language and its features
  • Gain expertise in using RDDs to create applications in Spark
  • Master SQL queries using Spark SQL
  • Gain a thorough understanding of Spark Streaming features
Curriculum
Number of Lectures: 41
Total Duration: 03:49:27
Dive Into Scala
  • Introduction to Scala
  • Environment Setup  
  • Hello Scala  
  • Flow Controls  
  • Functions and operators  
  • OOP concepts
  • Traits  
  • Arrays  
  • Collections  
Introduction To Apache Spark
  • BigData and Need for Apache Spark  
  • What is Spark, Spark Features and Spark Ecosystem
  • Spark Architecture  
  • Spark Use Cases
Spark Configuration
  • Setup Environment  
  • Word Count Spark Program  
Working with RDDs in Spark
  • What is RDD & How to Create  
  • Transformations - filter & map  
  • Solving Cars By Mileage problem using map and filter transformations  
  • Solving Cars In America problem using map and filter transformations  
  • Transformations - flatMap, union & intersection
  • Analysis on 2014 football world cup player information  
  • RDD Actions  
  • NASA Access Logs Analysis
Aggregating data with pair RDDs
  • Pair RDD - How to Create, reduceByKey
  • groupByKey and reduceByKey vs groupByKey
  • Transformations - mapValues, sortByKey, countByKey
  • Analysis on 2015 Indian Startup funding information  
  • Analysis on real estate data using pair RDD operations
Advanced Spark Concepts
  • Broadcast Variables  
  • Accumulators  
  • Persistence and Caching  
  • Partitioning  
Spark SQL
  • What is Spark SQL  
  • DataFrames  
  • Datasets
  • eBay Auction Data Analysis
  • Aadhaar Data Analysis
Spark Streaming
  • What is Spark Streaming?  
  • DStreams  
  • Spark Streaming Example  
  • Twitter Sentiment Analysis  