Yesterday’s Azure Every Day post covered how Azure Data Factory pricing works. In today’s post I’d like to go a bit deeper into Azure Data Factory Version 2 and review pipelines and activities. In essence, a pipeline is a logical grouping of activities. If you’re familiar with SSIS, think of an SSIS package: a grouping of tasks that act on the data.
As an example of a pipeline, say you want to pull data from a website, file server or database up into Azure, do some kind of transformation on that data, and then report from it. Within the pipeline, multiple activities can be defined. If there’s no dependency between activities, meaning one activity does not have to wait for another to finish, they can run in parallel.
This is good to keep in mind as you design your pipelines: if the order matters, you need to define a dependency or schedule the activities so that one runs after another instead of in parallel.
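To make the dependency idea concrete, here is a minimal Python sketch. It does not call Data Factory; it only models a simplified pipeline definition locally and works out which activities could start together. The pipeline and activity names are hypothetical, and the `dependsOn` entries follow the shape used in Data Factory pipeline JSON.

```python
from typing import List

# Hypothetical, simplified pipeline definition; all names are illustrative.
pipeline = {
    "name": "DailySalesPipeline",
    "properties": {
        "activities": [
            {"name": "CopyWebData", "type": "Copy", "dependsOn": []},
            {"name": "CopyFileServerData", "type": "Copy", "dependsOn": []},
            {
                "name": "TransformSales",
                "type": "SqlServerStoredProcedure",
                "dependsOn": [
                    {"activity": "CopyWebData", "dependencyConditions": ["Succeeded"]},
                    {"activity": "CopyFileServerData", "dependencyConditions": ["Succeeded"]},
                ],
            },
        ]
    },
}


def execution_waves(activities: List[dict]) -> List[List[str]]:
    """Group activity names into waves; a wave starts once all earlier waves are done."""
    # Map each activity to the set of activity names it waits for.
    remaining = {
        a["name"]: {d["activity"] for d in a.get("dependsOn", [])}
        for a in activities
    }
    done = set()
    waves = []
    while remaining:
        # An activity is ready when everything it depends on has finished.
        ready = sorted(name for name, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("circular or unknown dependency")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves


print(execution_waves(pipeline["properties"]["activities"]))
# [['CopyFileServerData', 'CopyWebData'], ['TransformSales']]
```

The two copy activities have no dependency on each other, so they land in the same wave and could run in parallel, while the transformation waits for both.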
There are three main types of activities:
1. Data Movement Activities – These copy data from the sources you’re pulling from, such as Azure Blob Storage, Azure Data Lake, Azure SQL Database and Azure SQL Data Warehouse. You can also set up an on-premises gateway (the self-hosted integration runtime in Version 2) and pull from commonly used databases such as DB2, MySQL, Oracle, SAP, Sybase and Teradata, as well as NoSQL databases like Cassandra and MongoDB.
I also mentioned files; you can pull from Amazon S3, file systems, FTP, HTTP, etc. You also have the Software as a Service (SaaS) options: Dynamics, HubSpot, Marketo, QuickBooks and Salesforce, to name a few. You can find the complete list in the Azure online documentation.
2. Data Transformation Activities – Here is where you take your data after it’s ingested into Azure and do something with it. Some common ones are the HDInsight activities: Hive, Pig, MapReduce, Hadoop Streaming and Spark. These allow you to transform your big data in your Azure environment and stage it for reporting.
Other common options are machine learning on an Azure VM and stored procedures. You can define a stored procedure in a SQL Server database in Azure and run it from the pipeline, or use U-SQL with Data Lake Analytics.
3. Control Activities – These control how the pipeline runs and interacts with the data: for example, executing other pipelines, looping with a ForEach activity, or retrieving values with a Lookup activity.
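The three activity types can be combined in a single pipeline. The sketch below builds one such definition as a Python dictionary and prints it as JSON: a Lookup activity reads a list of tables, a ForEach activity runs a Copy activity per table, and a stored procedure runs once the loop succeeds. Every pipeline, dataset, linked service, table and procedure name is hypothetical, and the source and sink type names are assumptions that should be checked against the current Data Factory documentation for your connectors.

```python
import json


def dataset_ref(name: str) -> dict:
    """Reference to a dataset defined elsewhere in the factory (hypothetical names)."""
    return {"referenceName": name, "type": "DatasetReference"}


def depends_on(activity: str) -> list:
    """Run only after the named activity has succeeded."""
    return [{"activity": activity, "dependencyConditions": ["Succeeded"]}]


# Control activity: fetch the list of tables to process.
lookup = {
    "name": "LookupTableList",
    "type": "Lookup",
    "typeProperties": {
        "source": {
            "type": "AzureSqlSource",  # assumed source type
            "sqlReaderQuery": "SELECT TableName FROM dbo.TablesToCopy",
        },
        "dataset": dataset_ref("ControlTableDataset"),
        "firstRowOnly": False,
    },
}

# Data movement activity: copy one table into staging storage.
copy_one_table = {
    "name": "CopyOneTable",
    "type": "Copy",
    "inputs": [dataset_ref("SourceTableDataset")],
    "outputs": [dataset_ref("StagingBlobDataset")],
    "typeProperties": {
        "source": {"type": "AzureSqlSource"},  # assumed source type
        "sink": {"type": "BlobSink"},  # assumed sink type
    },
}

# Control activity: loop over the Lookup output; iterations may run in parallel.
for_each = {
    "name": "ForEachTable",
    "type": "ForEach",
    "dependsOn": depends_on("LookupTableList"),
    "typeProperties": {
        "items": {
            "value": "@activity('LookupTableList').output.value",
            "type": "Expression",
        },
        "isSequential": False,
        "activities": [copy_one_table],
    },
}

# Data transformation activity: run a stored procedure after all copies finish.
transform = {
    "name": "LoadReportingTables",
    "type": "SqlServerStoredProcedure",
    "dependsOn": depends_on("ForEachTable"),
    "linkedServiceName": {
        "referenceName": "AzureSqlLinkedService",
        "type": "LinkedServiceReference",
    },
    "typeProperties": {"storedProcedureName": "dbo.usp_LoadReportingTables"},
}

pipeline = {
    "name": "StageAndTransformPipeline",
    "properties": {"activities": [lookup, for_each, transform]},
}

print(json.dumps(pipeline, indent=2))
```

Building the definition in code rather than by hand keeps the repeated reference and dependency blocks consistent; the printed JSON is only a starting point to adapt and validate in your own factory.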
Hopefully you found this helpful. If you have any questions about any of the topics I covered today or want general information around any Azure topic, we’re here to help. Click the link below or contact us to learn more about integrating Azure into your organization.