A common discussion we’ve had lately is about using Azure Databricks within Azure Data Factory for ETL.
Why would you consider using Databricks, particularly in Azure Data Factory, as part of your ETL processing? Here are three use cases:
1. Integrating machine learning into your processing. With Databricks, we can use scripts to execute machine learning models. This makes it simple to feed a dataset into a model and have Databricks return a prediction, for example. You can then write the results of that prediction to a table in SQL Server.
2. Using Databricks tooling and code for transformations. Azure Data Factory currently has Data Flows, in preview, which provide some great functionality. But if you want to write custom transformations in Python, Scala or R, Databricks is a great way to do that.
3. Using Data Lake or Blob storage as a source. If your source data is in either of these, Databricks is very good at working with it. It is designed for querying and processing large volumes of data, particularly data stored in a system like Data Lake or Blob storage.
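To make these use cases a little more concrete, here is a minimal sketch of how a Data Factory pipeline can call a Databricks notebook. It builds the pipeline definition as a Python dictionary and prints it as JSON. The `DatabricksNotebook` activity type and its `notebookPath` and `baseParameters` properties follow the Data Factory pipeline JSON format as I understand it; the pipeline name, linked service name, notebook path and parameter are hypothetical placeholders, not values from a real environment.

```python
import json


def build_notebook_pipeline(pipeline_name, linked_service, notebook_path, parameters):
    """Return a Data Factory pipeline definition with one Databricks Notebook activity."""
    activity = {
        "name": "RunDatabricksNotebook",
        "type": "DatabricksNotebook",
        # Points at the Databricks linked service defined in the data factory.
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            # Key/value pairs passed to the notebook as parameters.
            "baseParameters": parameters,
        },
    }
    return {"name": pipeline_name, "properties": {"activities": [activity]}}


# All names below are hypothetical and only illustrate the shape of the definition.
notebook_pipeline = build_notebook_pipeline(
    pipeline_name="TransformWithDatabricks",
    linked_service="AzureDatabricksLinkedService",
    notebook_path="/Shared/etl/transform_sales",
    parameters={"load_date": "2019-01-01"},
)
print(json.dumps(notebook_pipeline, indent=2))
```

The notebook itself holds the machine learning or transformation logic; Data Factory only schedules it and passes parameters in.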
My diagram below shows a sample of what the second and third use cases above might look like.
The top portion shows a typical pattern we use, where I may have some source data in Azure Data Lake, and I would use a copy activity from Data Factory to load that data from the Lake into a stage table. Using either a SQL Server stored procedure or an SSIS package, I would do some transformations there before loading my final data warehouse table.
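As a rough sketch of that top pattern, the definition below chains a copy activity to a stored procedure activity that runs only after the copy succeeds. I am assuming Parquet files in the Lake and an Azure SQL stage table, which is why the source and sink types are `ParquetSource` and `AzureSqlSink`; your own types would depend on your datasets. The dataset, linked service and procedure names are hypothetical.

```python
import json


def build_stage_and_load_pipeline(source_dataset, stage_dataset, sql_linked_service, procedure_name):
    """Return a pipeline that copies Lake data to a stage table, then runs a stored procedure."""
    copy_activity = {
        "name": "CopyLakeToStage",
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": stage_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # Assumption: Parquet files in the Lake, Azure SQL as the stage database.
            "source": {"type": "ParquetSource"},
            "sink": {"type": "AzureSqlSink"},
        },
    }
    procedure_activity = {
        "name": "TransformAndLoadWarehouse",
        "type": "SqlServerStoredProcedure",
        # Run the transformation only after the copy has succeeded.
        "dependsOn": [
            {"activity": "CopyLakeToStage", "dependencyConditions": ["Succeeded"]}
        ],
        "linkedServiceName": {
            "referenceName": sql_linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"storedProcedureName": procedure_name},
    }
    return {
        "name": "StageAndLoadWarehouse",
        "properties": {"activities": [copy_activity, procedure_activity]},
    }


# Hypothetical names, used only to show the shape of the definition.
stage_pipeline = build_stage_and_load_pipeline(
    source_dataset="LakeSalesFiles",
    stage_dataset="StageSalesTable",
    sql_linked_service="WarehouseSqlLinkedService",
    procedure_name="dbo.LoadFactSales",
)
print(json.dumps(stage_pipeline, indent=2))
```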
The bottom portion shows how I could use Databricks to query that data out of Data Lake and put it into the Databricks cluster. Then within my Databricks cluster, I can perform my transformations using Databricks code and logic.
I could then use Databricks to output that transformed data directly into my data warehouse table. Which pattern you use depends a lot on the data that you have and the transformations you want to use.
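The bottom pattern could look something like the sketch below inside a Databricks notebook, where a `spark` session is already available. It assumes Parquet files in Data Lake Storage Gen2 and a warehouse reachable over JDBC; the storage account, container, folder, table and column names are hypothetical, and the de-duplication step simply stands in for whatever transformations you actually need.

```python
def build_abfss_path(container, account, folder):
    """Build a Data Lake Storage Gen2 path for Spark to read from."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder.strip('/')}"


def transform_and_load(spark, source_path, key_columns, jdbc_url, target_table, user, password):
    """Read from the Lake, apply simple transformations, and append to a warehouse table."""
    # Query the data out of Data Lake into the cluster.
    raw = spark.read.parquet(source_path)

    # Example transformations: drop rows missing a key, then de-duplicate on the keys.
    cleaned = raw.na.drop(subset=key_columns).dropDuplicates(key_columns)

    # Write the transformed data directly into the data warehouse table.
    (
        cleaned.write.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", target_table)
        .option("user", user)
        .option("password", password)
        .mode("append")
        .save()
    )


# Hypothetical container, account and folder names.
source_path = build_abfss_path("raw", "mydatalake", "/sales/2019/")
print(source_path)

# In a notebook, the call would look like this (not run here, since it needs a cluster):
# transform_and_load(spark, source_path, ["order_id"], jdbc_url, "dbo.FactSales", user, password)
```

In practice I would pull the user name and password from a secret store rather than passing them around as plain strings.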
I wanted to share these three real-world use cases for using Databricks in your ETL, and particularly with Azure Data Factory.
If you have any questions about Azure Databricks, Azure Data Factory or data warehousing in the cloud, we’d love to help. Contact us; our Azure experts are ready to help you no matter where you are on your cloud journey.