Course: Apache Spark With Examples for Big Data Analytics

  • Lifetime Access
  • Certificate on Completion
  • Access on Android and iOS App
About this Course

This course covers all the fundamentals you need to write complex Spark applications. By the end of this course you will have in-depth knowledge of Spark Core, Spark SQL, and Spark Streaming.

This course is divided into 9 modules:

  • Dive Into Scala - Understand the basics of Scala that are required for programming Spark applications. Learn about the basic constructs of Scala such as variable types, control structures, collections, and more.
  • OOPS and Functional Programming in Scala - Learn about object-oriented programming and functional programming techniques in Scala
  • Introduction to Apache Spark - Learn about Spark architecture, Spark components, and Spark use cases
  • Spark Basics - Learn how to configure and run Spark in Eclipse or IntelliJ
  • Working with RDDs in Spark - Learn what a Resilient Distributed Dataset (RDD) is and the different types of actions and transformations that can be applied to RDDs
  • Aggregating Data with Pair RDDs - Learn how a Pair RDD differs from an RDD and the different types of actions and transformations that can be applied to Pair RDDs
  • Advanced Spark Concepts - Learn how Spark uses broadcast variables and accumulators to perform calculations, and how persistence and partitioning help improve performance
  • Spark SQL and DataFrames - Understand the difference between DataFrames and Datasets
  • Spark Streaming - Learn how to analyse massive amounts of data on the fly

All the concepts are explained using hands-on examples. This course covers 10+ hands-on big data examples such as:

  • Explore player data from the 2014 World Cup
  • Aggregate data from eBay online auctions
  • Understand different data points from Aadhaar data
  • Develop an application to analyse funds received by Indian startups
  • Explore the price trend in California real estate data
  • Help a retailer find valid and invalid purchase transactions across a chain of stores in Bangalore
  • Write a Spark program to count the stores in each US region from US states and store locations data
  • Develop a Spark Streaming application to perform Twitter sentiment analysis
Basic knowledge
  • Basic programming skills
  • A computer running Windows, OS X or Linux
  • The software needed for this course is freely available, and detailed steps to install and configure it are included in the course
What you will learn
  • Get a clear understanding of the limitations of MapReduce and the role of Spark in overcoming them
  • Understand the fundamentals of the Scala programming language and its features
  • Gain expertise in using RDDs to create applications in Spark
  • Master SQL queries using Spark SQL
  • Gain a thorough understanding of Spark Streaming features
Curriculum
Number of Lectures: 41
Total Duration: 03:49:27
Dive Into Scala
  • Introduction to Scala
  • Environment Setup  
  • Hello Scala  
  • Flow Controls  
  • Functions and operators  
  • OOPS concepts  
  • Traits  
  • Arrays  
  • Collections  
Introduction To Apache Spark
  • Big Data and the Need for Apache Spark
  • What is Spark, Spark Features and Spark Ecosystem
  • Spark Architecture  
  • Spark Use Cases
Spark Configuration
  • Setup Environment  
  • Word Count Spark Program  
Working with RDDs in Spark
  • What is RDD & How to Create  
  • Transformations - filter & map  
  • Solving Cars By Mileage problem using map and filter transformations  
  • Solving Cars In America problem using map and filter transformations  
  • Transformations - flatMap, union & intersection
  • Analysis on 2014 football world cup player information  
  • RDD Actions  
  • NASA Access Logs Analysis
Aggregating data with pair RDDs
  • Pair RDD - How to Create, reduceByKey
  • groupByKey and reduceByKey vs groupByKey
  • Transformations - mapValues, sortByKey, countByKey
  • Analysis on 2015 Indian Startup funding information  
  • Analysis on real estate data using pair RDD operations
Advanced Spark Concepts
  • Broadcast Variables  
  • Accumulators  
  • Persistence and Caching  
  • Partitioning  
Spark SQL
  • What is Spark SQL  
  • DataFrames  
  • Datasets
  • eBay Auction Data Analysis
  • Aadhaar Data Analysis
Spark Streaming
  • What is Spark Streaming?  
  • DStreams  
  • Spark Streaming Example  
  • Twitter Sentiment Analysis  