Introduction to Apache Spark

Part Of The On-Demand Courses


  • Learn the basic concepts behind Spark
  • Learn how to work with datasets in Spark
  • Learn how to use the Eclipse IDE to write programs

Course Description

Introduction to Apache Spark introduces you to one of the most important Big Data technologies on the market, Apache Spark. You will start by learning some of the basic concepts behind Spark, including Resilient Distributed Datasets (RDDs), which tie everything together. From there, you will learn how to work with datasets in Spark using a functional programming approach as well as SQL. Finally, you will learn how to use the Eclipse IDE to write programs that work with data, and learn a common technique for deploying code for Apache Spark jobs.




Module 00 | Introduction to Apache Spark
08m 21s total


Module 00A | Introduction to Apache Spark
08m 21s


Module 01 | Getting Started with Apache Spark
88m 34s total


Module 01A | Introduction
38m 50s


Module 01B | Installing Spark and IntelliJ IDEA
49m 44s


Module 02 | Learning with Spark-Shell
130m 23s total


Module 02A | Introduction
36m 57s


Module 02B | Key Spark Functions
58m 53s


Module 02C | Reviewing the Word Count App
21m 49s


Module 02D | Custom Functions
12m 44s


Module 03 | Spark SQL
201m 36s total


Module 03A | Introduction
30m 21s


Module 03B | Functional Spark SQL
51m 26s


Module 03C | The Query Approach
44m 37s


Module 03D | The Combined Approach
52m 37s


Module 03E | User Defined Functions
22m 35s


Module 04 | Administration
128m 26s total


Module 04A | Deploying Spark Jobs
26m 50s


Module 04B | IntelliJ IDEA
22m 08s


Module 04C | Passing in Parameters
25m 50s


Module 04D | Debugging with IntelliJ IDEA
13m 57s


Module 04E | Expanding Projects
38m 41s



Training Content Manager
Mitchell Pearson has been with Pragmatic Works for 8 years as a Data Platform Consultant and the Training Content Manager. Mitchell has authored books on SQL Server, Power BI and the Power Platform. His data platform experience includes designing and implementing enterprise-level Business Intelligence solutions with the Microsoft SQL Server stack (T-SQL, SSIS, SSAS, SSRS), the Power Platform and Microsoft Azure.



System Requirements

Spark does not publish minimum requirements for single-node machines such as VMs or laptops, but at least 8 GB of RAM is recommended. Spark can run on any edition of Windows, Linux, or macOS that supports Oracle Java 1.8.

What to Know Before Class

This course is aimed at application and database developers interested in learning about Big Data technologies. No knowledge of Spark or Hadoop is assumed. Knowledge of a development language such as Java, C#, or Python is helpful but not required.


Pending reviews - check back later


Start With The Free Community Plan

The Pragmatic Works free community plan gives you lifetime access to 7 Microsoft “in a day” courses on Power BI, Excel, Power Apps, Azure Synapse, Power Automate, Paginated Reports and Chatbots.



• Get preview access to all 70+ courses & custom Learning Paths for 7 days

• Labs and files included

• Access courses from our mobile app or desktop

• Access quizzes to practice while you learn
