Newsletter
Join our blog
Join other Azure, Power Platform and SQL Server pros by subscribing to our blog.
Start with the FREE community plan and get your lifetime access to 20+ courses. Get Instant Access Now!
Need help? Talk to an expert: (904) 638-5743
Private Training
Customized training to master new skills and grow your business.
On-Demand Learning
Beginner to advanced classes taught by Microsoft MVPs and Authors.
Bootcamps
In-depth boot camps take you from a novice to mastery in less than a week.
Season Learning Pass
Get access to our very best training offerings for successful up-skilling.
Stream Pro Plus
Combine On-Demand Learning platform with face-to-face Virtual Mentoring.
Certification Training
Prepare and ace your next certification with CertXP.
Private Training
Cheat Sheets
Quick references for when you need a little guidance.
Nerd Guides
Summaries developed in conjunction with our Learn with the Nerds sessions.
Downloads
Digital goodies - code samples, student files, and other must have files.
Blog
Stay up-to-date on all things Power BI, Power Apps, Microsoft 365 and Azure.
Community Discord Server
Start here for technology questions to get answers from the community.
Career Guides
Breaking into the field? Let these guides help get you started with a plan.
Affiliate Program
Earn money by driving sales through the Pragmatic Works' Training Affiliate Program.
Reseller Partner
It's time to address your client's training needs.
Foundation
Learn how to get into IT with free training and mentorship.
Management Team
Discover the faces behind our success: Meet our dedicated team
Contact Us
How can we help? Connect with Our Team Today!
FAQs
Find all the information you’re looking for. We’re happy to help.
Looking for faster query performance? A while back, Microsoft introduced Azure SQL Data Warehouse Gen2 with five times the compute capacity, as well as the huge benefit of providing faster query performance than the Gen1 offering.
Let me tell you how Gen2 achieves this faster query performance. Check out the diagram below which shows the architecture of the Azure SQL Data Warehouse platform.
The key factor to understand is that this platform uses Massive Parallel Processing. The idea here is we would issue a query to one control node (as shown in the top). And that control node distributes that query across multiple compute nodes (as seen in the middle tier of this illustration). Those queries then run in parallel on those compute nodes to render a partial result that gets rolled up to that control node.
Now, let’s talk about the performance of Gen2 by focusing on the bottom of the diagram which is the storage. The data storage for Azure SQL Data Warehouse Gen1 is stored remotely in Azure Blob Storage. This is a key feature of the architecture as the compute power and the storage are managed independently and separately.
We still have the same fundamental architecture in Gen2 with our data still stored remotely in Blob Storage. But in this next diagram, I want to talk about what has changed in Gen2 to enable this faster query performance.
1. There was new hardware introduced to the Azure platform, so Gen2 is running on a new version and a new generation of hardware. Part of that includes SSDs on the compute nodes which is part of where we get the speed from.
2. But the real focus is on the introduction of adaptive caching as far as this faster query performance. The notion of adaptive caching means that when we use columnstore indexes, previously all of that data was stored remotely in blob storage.
What adaptive caching does is it brings certain columnstore segments into the compute node itself. So now depending on the query that I execute, if my query can be resolved by data that has been moved into the compute node, it reduces or eliminates the IO required to go out to that remote storage.
This adaptive caching is the real key of how Gen2 achieves that high query performance. I also want to point out about the word ‘adaptive’ is that columnstore indexes that are brought into the compute node will depend on past query execution, so it will understand the history of queries that you’ve run.
Using this history, it will decide to bring in those columns to our segments that might help you resolve a similar query in the future. It will also develop a pattern based on the thought of ‘if you’ve run that query, you may also run this query’, so it will bring those columns into our segments as well. The main idea here is to bring that data into the compute node and reduce or eliminate the need to read that data from the remote blob storage.
So, if you’re looking to get better, faster query performance, I’d advise you to check out Azure SQL Data Warehouse Gen2. If you have questions around Azure SQL Data Warehouse Gen2 or anything within the Azure platform, we’d love to help. Click the link below or contact us - no matter where you are on your Azure journey.
ABOUT THE AUTHOR
Free Trial
private training
Newsletter
Join other Azure, Power Platform and SQL Server pros by subscribing to our blog.
Leave a comment