In today’s post I’ll look at some considerations for choosing between Azure Blob Storage and Azure Data Lake Store when processing data to be loaded into a data warehouse. My basis here is a reference architecture that Microsoft published; see the diagram below.
The diagram shows a typical pattern, and what caught my eye was that it suggests loading data from your source system into Azure Blob Storage. On a couple of projects, we are using Azure Data Lake Store instead. That got me thinking, so here are my thoughts on why you might choose one over the other.
In many respects the two are very similar, and in many cases the answer is the classic ‘it depends’. Ultimately, in most cases you can’t go wrong either way. One difference I see comes down to the type of files each is good at working with.
I think Blob Storage is good at non-text files: database backups, photos, videos and audio files. Data Lake, I feel, is a bit better at large volumes of text data. More often than not, I would personally choose Data Lake Store if I’m loading text file data into my data warehouse. Of course, you can use Blob Storage for that too, but I feel it is better suited to the non-text data I mentioned above.
There are tradeoffs with both. One thing Azure Blob Storage currently has over Azure Data Lake is built-in geographic redundancy. You can set this up yourself with Data Lake by scheduling a job that periodically replicates your Data Lake Store data to another geographic region, but it’s not available out of the box as it is with Blob Storage. If geo-redundant storage is an important feature, then Blob Storage is the way to go.
Your data is secure in Blob Storage or Data Lake, but what Data Lake has over Blob Storage is that it works with Azure Active Directory; Blob Storage currently does not. So, if you’re using Active Directory, it will integrate well with Data Lake from a security perspective. The bottom line is that both are secure; the difference is the method of access. You would access your data in Blob Storage through keys instead of Active Directory.
Depending on your workload, having your data in Data Lake Store opens up some additional opportunities for analytics, specifically Azure Data Lake Analytics. This gives you the ability to use U-SQL, a SQL-like query language, to do some neat analytics on top of the data in your Data Lake Store, which obviously you couldn’t do in Blob Storage.
How about pricing? Generally, Data Lake will be a bit more expensive, although the two are in close range of each other. Blob Storage has more pricing options, depending on things like how frequently you need to access your data (cool vs hot storage). Data Lake is priced on volume, so the cost will change as you reach certain tiers of volume.
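The considerations above can be condensed into a small decision helper. This is only a sketch of my rules of thumb, not an official sizing tool: the `Workload` class, its field names, and the simple "count the reasons" tie-break are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a load scenario (field names are illustrative)."""
    mostly_text_files: bool = True
    needs_builtin_geo_redundancy: bool = False
    needs_azure_ad_integration: bool = False
    wants_data_lake_analytics: bool = False


def recommend_store(w):
    """Return (recommendation, reasons) based on the rules of thumb in this post."""
    blob_reasons = []
    lake_reasons = []

    # File type: text data leans toward Data Lake, non-text toward Blob Storage.
    if w.mostly_text_files:
        lake_reasons.append("large volumes of text data")
    else:
        blob_reasons.append("non-text files (backups, photos, video, audio)")

    # Geo-redundancy is available out of the box only with Blob Storage.
    if w.needs_builtin_geo_redundancy:
        blob_reasons.append("geo-redundancy out of the box")

    # Azure Active Directory integration and Data Lake Analytics favor Data Lake.
    if w.needs_azure_ad_integration:
        lake_reasons.append("Azure Active Directory integration")
    if w.wants_data_lake_analytics:
        lake_reasons.append("Azure Data Lake Analytics on top of the store")

    # Assumed tie-break: whichever side has more reasons wins; a tie means "it depends".
    if len(lake_reasons) > len(blob_reasons):
        return "Azure Data Lake Store", lake_reasons
    if len(blob_reasons) > len(lake_reasons):
        return "Azure Blob Storage", blob_reasons
    return "Either (it depends)", blob_reasons + lake_reasons


if __name__ == "__main__":
    choice, reasons = recommend_store(Workload(wants_data_lake_analytics=True))
    print(choice, "because of:", ", ".join(reasons))
```

Pricing is deliberately left out of the sketch, since the two services are in close range of each other and the outcome depends on access patterns and volume.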
Either way, you can’t go wrong, but when Microsoft published this reference architecture, I thought its choice of Blob Storage was an interesting point to discuss. There are many ways to approach this, and I wanted to give my thoughts on using Azure Data Lake Store vs Azure Blob Storage in a data warehousing scenario.