This course on Apache Spark and Scala aims to provide advanced expertise in the Big Data Hadoop ecosystem. It offers a standard skill set that helps learners specialize beyond the Big Data Hadoop developer role.
The course starts with a detailed description of the limitations of MapReduce and how Spark can help overcome them. It then takes a deeper dive into the Scala programming language.
Next, it covers running Spark as a standalone cluster and builds an understanding of Resilient Distributed Datasets (RDDs).
The course also covers Spark SQL concepts: running SQL queries through SQLContext and Hive queries through HiveContext.
This course provides the material required to build a career path from Big Data Hadoop developer to Big Data Hadoop architect.
- Prior knowledge of Apache Hadoop is an advantage but not required.
- Fundamental understanding of any programming language
- Understand the limitations of Hadoop MapReduce and how Spark overcomes them
- Gain expertise in the Scala programming language and its characteristics
- Work with RDDs and create applications in Spark
- Gain a thorough understanding of Spark SQL by running SQL queries in Spark
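
As a rough illustration of the kind of program these objectives point toward, here is a minimal sketch that builds an RDD, transforms it, and then queries the result with Spark SQL through SQLContext. It is not taken from the course material. The object name `CourseSketch`, the case class `WordCount`, the table name `word_counts`, and the sample sentences are all hypothetical, and the code assumes the Spark 1.x-style API (SQLContext and `registerTempTable`), which later Spark releases deprecate in favor of `SparkSession` and `createOrReplaceTempView`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type for this sketch. It is declared at the top level
// so that Spark can derive a DataFrame schema from it.
case class WordCount(word: String, total: Int)

object CourseSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally on all available cores, with no cluster.
    val conf = new SparkConf().setAppName("CourseSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // enables .toDF() on RDDs of case classes

    // Build an RDD from an in-memory collection (made-up sample data).
    val lines = sc.parallelize(Seq("spark and scala", "spark sql and hive"))

    // Classic word count: split into words, pair each with 1, sum per word.
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Convert the RDD to a DataFrame and expose it to SQL as a temporary table.
    val df = counts.map { case (word, total) => WordCount(word, total) }.toDF()
    df.registerTempTable("word_counts")

    // Query the table with plain SQL through the SQLContext.
    sqlContext
      .sql("SELECT word, total FROM word_counts WHERE total > 1 ORDER BY word")
      .show()

    sc.stop()
  }
}
```

The RDD steps keep intermediate data in memory across transformations, whereas a chain of MapReduce jobs typically writes intermediate results to disk between stages. The SQL step runs over the same data without a separate job definition.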