Introduction to Apache Spark

Part Of The On-Demand Courses

WHAT YOU'LL LEARN

  • Learn the basic concepts behind Spark
  • Learn how to work with datasets in Spark
  • Learn how to use IntelliJ IDEA to write programs
This course includes:
  • Course Description
  • Outline
  • Instructor
  • System Requirements
  • What to Know Before Class

Course Description

Introduction to Apache Spark introduces you to one of the most important Big Data technologies on the market, Apache Spark. You will start by learning some of the basic concepts behind Spark, including Resilient Distributed Datasets (RDDs), which tie everything together. From there, you will learn how to work with datasets in Spark using a functional programming approach as well as SQL. Finally, you will learn how to use IntelliJ IDEA (the IDE covered in the outline) to write programs that work with data, and a common technique for deploying code as Apache Spark jobs.
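To give a feel for the functional programming approach mentioned above, here is a minimal word-count sketch in plain Python. It is not Spark code and not taken from the course: it only mimics the flatMap, map, and reduceByKey steps that a Spark word count typically chains together, using the standard library. The function name word_count and the sample lines are hypothetical.

```python
from functools import reduce
from itertools import chain


def word_count(lines):
    """Count words using the same three steps as a typical Spark word count."""
    # Step 1 (like flatMap): split every line into words and flatten the result.
    words = chain.from_iterable(line.lower().split() for line in lines)

    # Step 2 (like map): pair each word with an initial count of 1.
    pairs = map(lambda word: (word, 1), words)

    # Step 3 (like reduceByKey): add up the counts for each distinct word.
    def add_pair(totals, pair):
        word, count = pair
        totals[word] = totals.get(word, 0) + count
        return totals

    return reduce(add_pair, pairs, {})


# Hypothetical sample input, standing in for a text file loaded into Spark.
sample_lines = ["spark makes big data simple", "big data with spark"]
counts = word_count(sample_lines)
print(counts)
```

In Spark itself the same steps would run in parallel across a cluster; this sketch runs them on a single machine purely to show the shape of the logic.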

 

Outline

Module 00 | Introduction to Apache Spark
08m 21s total

Module 00A | Introduction to Apache Spark
08m 21s

Module 01 | Getting Started with Apache Spark
88m 34s total

Module 01A | Introduction
38m 50s

Module 01B | Installing Spark and IntelliJ IDEA
49m 44s

Module 02 | Learning with Spark-Shell
130m 23s total

Module 02A | Introduction
36m 57s

Module 02B | Key Spark Functions
58m 53s

Module 02C | Reviewing the Word Count App
21m 49s

Module 02D | Custom Functions
12m 44s

Module 03 | Spark SQL
201m 36s total

Module 03A | Introduction
30m 21s

Module 03B | Functional Spark SQL
51m 26s

Module 03C | The Query Approach
44m 37s

Module 03D | The Combined Approach
52m 37s

Module 03E | User Defined Functions
22m 35s

Module 04 | Administration
127m 26s total

Module 04A | Deploying Spark Jobs
26m 50s

Module 04B | IntelliJ IDEA
22m 08s

Module 04C | Passing in Parameters
25m 50s

Module 04D | Debugging with IntelliJ IDEA
13m 57s

Module 04E | Expanding Projects
38m 41s

Instructor

MITCHELL PEARSON

Training Content Manager
Mitchell Pearson has been with Pragmatic Works for 8 years as a Data Platform Consultant and the Training Content Manager. Mitchell has authored books on SQL Server, Power BI, and the Power Platform. Mitchell's data platform experience includes designing and implementing enterprise-level Business Intelligence solutions with the Microsoft SQL Server stack (T-SQL, SSIS, SSAS, SSRS), the Power Platform, and Microsoft Azure.

 


System Requirements

Spark does not publish minimum requirements for single-node machines such as VMs or laptops, but at least 8 GB of RAM is recommended. Spark can run on any edition of Windows, Linux, or macOS that supports Oracle Java 8 (version 1.8).
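Java 8 reports its version as 1.8 (for example 1.8.0_281), while later releases report the major number first (for example 11.0.2). As a rough aid for checking a machine against the requirement above, the sketch below converts a version string into a major version number. The function name java_major_version is hypothetical, and the sample version strings are illustrative only; on a real machine the string would typically come from running java -version.

```python
def java_major_version(version_string):
    """Return the Java major version for a version string such as '1.8.0_281'."""
    parts = version_string.strip().split(".")

    # Legacy scheme used through Java 8: "1.8.0_281" means Java 8.
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])

    # Newer scheme: "11.0.2", "17", or "9-ea" start with the major version.
    leading = parts[0].split("-")[0].split("+")[0]
    return int(leading)


# Illustrative version strings, not output captured from a real machine.
for sample in ["1.8.0_281", "11.0.2", "17"]:
    print(sample, "-> Java", java_major_version(sample))
```

A result of 8 corresponds to the Java 1.8 release named in the requirement.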

What to Know Before Class

This course is aimed at application and database developers interested in learning about Big Data technologies. No knowledge of Spark or Hadoop is assumed. Knowledge of a development language such as Java, C#, or Python is helpful but not required.

REVIEWS

Pending reviews - check back later

