How to Connect Azure Data Factory to Azure DevOps
Are you using Azure DevOps and want to know how to use it as a code repository? A benefit of using DevOps (or any code repository) is that you can preserve the code from a working version while you’re making modifications. In this post I’ll show you how to connect an existing Azure Data Factory project to an Azure DevOps code repository.
Azure Data Factory (ADF) uses JSON to capture the code in your Data Factory project, and by connecting ADF to a code repository, each of your changes will be tracked when you save them. Also, whenever you publish, DevOps will automatically establish a new version of the Data Factory, enabling you to roll back if needed.
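To make the "changes are tracked" idea concrete, here is a minimal Python sketch (not ADF's own tooling) of why JSON definitions suit a Git repository: a small edit shows up as a small, readable diff. The pipeline name, the activity names, the Lookup type, and the 30-second wait are all assumptions for illustration; a real definition exported by ADF carries more properties.

```python
import copy
import difflib
import json

# Hypothetical pipeline definition. The names and the Lookup type are
# illustrative assumptions, not taken from a real factory export.
pipeline_v1 = {
    "name": "CopyAuthors",
    "properties": {
        "activities": [
            {"name": "Get Number of Rows", "type": "Lookup", "dependsOn": []}
        ]
    },
}


def to_json(definition: dict) -> str:
    """Serialize with fixed indentation so each property sits on its own line."""
    return json.dumps(definition, indent=2)


def add_wait_on_failure(definition: dict, after: str, seconds: int) -> dict:
    """Return a copy with a Wait activity that runs only if `after` fails."""
    updated = copy.deepcopy(definition)
    updated["properties"]["activities"].append(
        {
            "name": "Wait1",
            "type": "Wait",
            "dependsOn": [
                {"activity": after, "dependencyConditions": ["Failed"]}
            ],
            "typeProperties": {"waitTimeInSeconds": seconds},
        }
    )
    return updated


pipeline_v2 = add_wait_on_failure(pipeline_v1, "Get Number of Rows", 30)

# A line-by-line diff of the two versions, similar in spirit to what a
# Git repository shows for a saved change.
diff_lines = list(
    difflib.unified_diff(
        to_json(pipeline_v1).splitlines(),
        to_json(pipeline_v2).splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

Running it prints only the lines that changed, which is what makes reviewing and rolling back individual edits practical.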
Now on to my demo:
- I’ve created a simple Data Factory that counts and copies an author table from an Azure SQL Database to an Azure Storage Blob.
- Azure Data Factory supports two types of code repository: Azure DevOps Git and GitHub. In this demo, I’ll work with Azure DevOps.
- The first step is to log into Azure DevOps (dev.azure.com), then click on New Project, fill in the default fields and click Create. You could also go into Advanced and change your version control and work item process (I’ll work in Agile).
- While that’s creating, I’ll go back into my Data Factory. In the upper left corner, you’ll see Data Factory; open that drop-down and click on Set Up Code Repository.
- This will open Repository Settings where we can set up the connection to the code repository we just created. We need to:
  - Select Repository Type: Azure DevOps Git
  - Select the Azure DevOps account that it’s associated with (my account in this case)
  - Choose Project Name (the one we just created)
  - Git Repository Name: We can create a new one or use the existing repository that was created along with the project.
  - Collaboration Branch: I suggest you stick with master. This is the branch that all your other branches merge back into, and the one whose changes get published to the live Azure Data Factory, the one that actually runs via trigger or event.
- Then click Save.
- While this is saving, you’ll see on your ADF page that Save as Template is grayed out, but underneath it you’ll see two new Save buttons pop up. These let you save your changes to the repository, which is different from what you used to do, which was to publish them straight to the Data Factory.
- You’ll now have Save, Save All and Publish buttons. Additionally, you’ll be asked which branch you want. You could create a new one, but I chose the existing (master) branch.
- At the top you’ll see it noted that you’re working out of the master branch and Azure DevOps Git. If you try to switch back and select Data Factory mode, you’ll get a warning that publishing in Data Factory mode has been disabled, since we chose Azure DevOps Git as our repository in this case.
- Next we want to create a new branch, and it suggests I create that under my name. This can be good because when you’re working with others, they’ll see your branch name and know what you’re working on. I’ll start here, but you’ll see me make a change in a bit.
- In my demo, I’ll add a Wait activity so we can see how the change gets captured. On Get Number of Rows, I’ll make the wait happen when we get a failure, so I connect the failure output to the wait and click Save.
- When I hit Publish, I get an error message that says ‘publish is only allowed from collaboration (master) branch. Merge the changes to master.’
- To merge my changes to master, I go up to where it says tpantazi branch and change it to master branch. But when I switch, my wait disappears, because so far the change has only been saved to my own branch.
- I want that wait, so to fix this I go back up to the branch at the top and from the drop-down I select Create Pull Request. This will merge that branch back into our collaboration (master) branch.
- This opens a new window back in Azure DevOps with the pull request set up. We set the pull request to go from tpantazi into master and click Create, and then I’m prompted to either approve or complete the pull request and merge.
- Back in Data Factory, once we refresh, we’ll see that wait command come back into our master branch.
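The repository settings chosen in the walkthrough, and the "publish only from the collaboration branch" rule that produced the error message, can be summarized in a short Python sketch. This is only a model for illustration: the account, project and repository names are placeholders, and the property names follow the `repoConfiguration` shape that, to my understanding, Data Factory ARM templates use for Azure DevOps Git. Check them against the current documentation before relying on them.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoSettings:
    """The choices made on the Repository Settings pane."""

    account_name: str
    project_name: str
    repository_name: str
    collaboration_branch: str = "master"
    root_folder: str = "/"  # assumption: factory JSON lives at the repo root

    def to_repo_configuration(self) -> dict:
        # Property names are an assumption based on the ARM template shape
        # for an Azure DevOps Git repository; verify before use.
        return {
            "type": "FactoryVSTSConfiguration",
            "accountName": self.account_name,
            "projectName": self.project_name,
            "repositoryName": self.repository_name,
            "collaborationBranch": self.collaboration_branch,
            "rootFolder": self.root_folder,
        }


def can_publish(current_branch: str, settings: RepoSettings) -> bool:
    """Model of the rule: publishing is allowed only from the collaboration branch."""
    return current_branch == settings.collaboration_branch


# Placeholder values; substitute your own account, project and repository.
settings = RepoSettings(
    account_name="my-devops-account",
    project_name="adf-demo",
    repository_name="adf-demo",
)
config = settings.to_repo_configuration()
print(json.dumps(config, indent=2))

for branch in ("tpantazi", "master"):
    print(branch, "-> publish allowed:", can_publish(branch, settings))
```

The second half mirrors the demo: work saved on the tpantazi branch cannot be published until a pull request merges it into master.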
Now you know how to connect an Azure Data Factory to an Azure DevOps repository. Preserving the code from a working version while you’re making modifications is a great reason to use a DevOps repository.