Getting Started with PySpark in Databricks

Learn the basics of PySpark for big data on Databricks: read and write data, work with numbers and strings, use DataFrames, and clean and transform data.

Course Description

This class covers the basics of PySpark, the Python API for Apache Spark, as used for working with big data on Databricks. You will learn how to read and write data from different sources, work with numerical and string data, use DataFrames for data manipulation and analysis, clean and transform data, and handle null values. By the end, you should be familiar with PySpark and its capabilities for data engineering and analysis.

Course Outline (Free Preview)

Module 00 - Class Introduction and Files 9 min.

Module 01 - Provisioning Databricks

Module 02 - Introduction to PySpark

Module 03A - Working with Strings 25 min.

Module 03B - Working with Numbers 13 min.

Module 04 - Working with DataFrames 25 min.

Module 05 - Querying in PySpark 30 min.

Module 06 - Writing Data in PySpark 23 min.

Module 07 - Filtering Data 31 min.

Module 08 - Aggregations 35 min.

Module 09 - Working with Null Values 32 min.

This course includes:

  • 4+ hours of training
  • 11 Modules
  • Access on mobile and browser
  • Certificate of Completion