In the SQL space in Azure, we have a few options; we have Azure SQL Database and Azure SQL Data Warehouse, both are Platform as a Service (PaaS) options, and we also can run our SQL on VMs within Azure. But my focus is on Azure SQL Data Warehouse and what types of workloads are the best fit, and some that are not.
First, let’s understand what Azure SQL Data Warehouse is behind the scenes. This has an MPP (massively parallel processing) architecture. This big data architecture takes the data and distributes it across 60 different distributions. So, if we put a table within Azure SQL Data Warehouse, it’s going to take that data and split it up across those 60 different pieces.
This makes it great for large workloads and large queries across massive amounts of data, as it can take that query and split it up across different nodes and have each node run its part of the query and respond back up. But this also means it’s not a great fit for smaller workloads, databases or small queries.
For example, if you have a database that’s 1 or 2 terabytes and have a star schema around it, you’ll see great performance out of answering complex queries with Azure SQL Data Warehouse. However, if you have a database that’s one or two hundred gigabytes, even with a star schema, chances are you’ll see performance degradation. There’s not enough of a data workload to take advantage of the MPP architecture; in this case, you’re taking data and splitting it into too small chunks, which will slow down the query times.
Another scenario to consider is, let’s say you have an instance where workloads are up into the 2 terabytes region, but are backing up a website. An OLTP database, highly denormalized, but still very large, would not be a good fit for Azure SQL Data Warehouse. Because with lots of small tables and queries in an OLTP database, it will have to try to split every query up across those 60 distributions and it will run very slow.
So, with large tables, data warehouse schema and large amounts of data, you’ll see phenomenal speed with this platform, as well as be able to take advantage of the great capabilities within Azure. If you’d like to learn more about what data platform you should be using or talk about the type of data you have and where it fits in Azure – we’d love to talk to you. Click the link below and get started today.