Austin Libal walks viewers through the foundational steps of using PySpark within Microsoft Fabric. This session is ideal for beginners looking to explore data engineering and analytics using Spark notebooks in Fabric.
What You Need to Begin
- A Microsoft Fabric trial or full license
- A Fabric-enabled workspace
Creating Your First Lakehouse
- Use the Persona Switcher to switch to the Data Engineering persona.
- Create a new Lakehouse (e.g., “Lakehouse PySpark”).
- Upload a sample CSV file (e.g., holiday.csv) to the Lakehouse’s Files folder.
- Drag and drop the file into the Tables folder to auto-generate a table.
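If you prefer to do this step in code rather than by drag and drop, a minimal sketch is below. It assumes the file landed at Files/holiday.csv and that the notebook is attached to this Lakehouse (Fabric notebooks provide a ready-made spark session); the table name holiday is illustrative.

```python
# Sketch of loading the uploaded CSV from the Lakehouse Files area and saving
# it as a managed Delta table. Assumes the notebook is attached to the
# Lakehouse and the file was uploaded to Files/holiday.csv; the "spark"
# session is created for you in Fabric notebooks.
df = (
    spark.read
    .option("header", "true")       # first row contains column names
    .option("inferSchema", "true")  # let Spark guess column types
    .csv("Files/holiday.csv")
)

# Write the DataFrame to the Tables area as a Delta table named "holiday"
df.write.mode("overwrite").saveAsTable("holiday")
```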
Understanding Spark and PySpark
Austin explains that Spark is a distributed computing framework that allows for in-memory data processing using clusters. PySpark is the Python API for Spark, enabling users to write Spark applications using Python.
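As a frame of reference, here is a minimal PySpark sketch. Inside a Fabric notebook the spark session already exists, so the session-builder lines below only matter when running Spark somewhere else; the sample data is made up for illustration.

```python
# Minimal standalone PySpark example (outside Fabric you create the session
# yourself; inside a Fabric notebook a "spark" session is provided for you).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

# Build a tiny DataFrame in memory and run a distributed transformation on it
data = [("New Year's Day", 1), ("Independence Day", 7)]
df = spark.createDataFrame(data, ["holiday", "month"])
df.filter(df.month == 1).show()
```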
Working with Notebooks
- Open a new notebook from within the Lakehouse interface.
- Use the notebook to interact with your data using PySpark.
- Drag the holiday table into a code cell to auto-generate PySpark code.
- Run the cell to create a DataFrame and load data into memory.
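The auto-generated cell is typically a small Spark SQL query wrapped in PySpark. The sketch below shows the general shape; the Lakehouse and table names are placeholders for whatever your drag-and-drop produces.

```python
# Roughly what the auto-generated cell looks like when you drag the holiday
# table into a notebook (names are illustrative; yours will match your
# Lakehouse and table names).
df = spark.sql("SELECT * FROM Lakehouse_PySpark.holiday LIMIT 1000")
display(df)  # Fabric's rich table/chart preview of the DataFrame
```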
Notebook Features
- Switch between code and markdown cells.
- Use the Home ribbon to manage language settings and run options.
- Supported languages include PySpark, Scala, Spark SQL, and SparkR.
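Cell magics are the usual way to mix these languages within a single notebook. The sketch below assumes the notebook's default language is PySpark and that a holiday table exists; the %%sql magic is shown in comments because each magic must be the first line of its own cell.

```python
# Running Spark SQL from a PySpark notebook. In its own cell you could write:
#
#   %%sql
#   SELECT * FROM holiday LIMIT 10
#
# The same query expressed in PySpark:
df = spark.sql("SELECT * FROM holiday LIMIT 10")
df.show()
```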
Creating and Using DataFrames
To work with data in Spark, users create DataFrames. Austin demonstrates how to:
- Generate a DataFrame by dragging a table into a code cell.
- Run the cell to execute a Spark job and load data.
- Use the df.show() method to display data in a tabular format, as in the sketch below.
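Putting the pieces together, a minimal sketch of this workflow might look like the following, assuming the holiday table created earlier is available in the attached Lakehouse.

```python
# Sketch of the DataFrame workflow described above (table name assumed
# to be "holiday" in the attached Lakehouse).
df = spark.read.table("holiday")   # running the cell triggers a Spark job

# Display the first 20 rows in a plain tabular format
df.show()

# Optionally show more rows without truncating long column values
df.show(50, truncate=False)
```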
Don't forget to check out Pragmatic Works' on-demand learning platform for more insightful content and training sessions on Fabric and other Microsoft applications. Be sure to subscribe to the Pragmatic Works YouTube channel to stay up to date on the latest tips and tricks.