Pragmatic Works Nerd News

Performance Techniques for SSIS in Azure Data Factory

Written by Bob Rubocki | Sep 28, 2018

If you’re new to running Integration Services (SSIS) packages in Azure Data Factory, you may notice that some packages take a bit longer to run than they did on-premises. Today I’ll share a couple of simple, effective ways to improve their performance, based on experiences we’ve had.

First, look at the performance tier of your SSIS catalog database. The service tier affects two things that database does. One is that when a package starts, its definition has to be read from the database, so the higher the tier of your database, the more quickly it can read that definition out and start the package.

The second is that the database manages all the logging. A package with a lot of tasks and activities logs a lot of execution activity to the SSIS catalog, and a higher-tier catalog completes those logging operations faster. Increasing the performance tier of your catalog will therefore help the performance of your package.

Changing the performance tier is typically the first thing we try, because it’s simple to do and relatively inexpensive.
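If you would rather script the tier change than click through the portal, here is a minimal sketch. It only builds the T-SQL statement; running it against your server is left to whatever client you already use. The database name SSISDB, the target tier S3 and the helper name scale_catalog_sql are assumptions for illustration, not settings from our environment.

```python
import re

# Service objectives are short labels such as S0, S3 or P1.
_OBJECTIVE = re.compile(r"^[A-Za-z0-9_]+$")


def scale_catalog_sql(objective: str, database: str = "SSISDB") -> str:
    """Return T-SQL that moves `database` to the given service objective.

    Hypothetical helper: the default database name and any tier you pass
    in are assumptions to adjust for your own catalog.
    """
    if not _OBJECTIVE.match(objective):
        raise ValueError(f"unexpected service objective: {objective!r}")
    # Escape a closing bracket so the name stays a valid quoted identifier.
    safe_db = database.replace("]", "]]")
    return f"ALTER DATABASE [{safe_db}] MODIFY (SERVICE_OBJECTIVE = '{objective}');"


statement = scale_catalog_sql("S3")
print(statement)
```

Keeping the statement as a plain string makes it easy to review before anything touches the database, and the check on the tier label guards against a typo turning into malformed SQL.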

Another change, which we find has even more impact, is to increase the node size of your integration runtime. The node size is the size and power of the virtual machine(s) running your integration runtime. You change it with the ‘node size’ option in your integration runtime settings (see screenshot below).

For example, if you’re running on a D1 machine, you may want to try bumping up to a D2 or D4; if you’re running on an A4, maybe bump that up to an A8. When a single package is running slowly, we generally find that cranking up the power of that VM has much more impact on performance than changing the catalog tier.
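The same change can be expressed as part of an integration runtime definition. The sketch below builds only the compute portion as a Python dictionary. The property names (computeProperties, nodeSize, numberOfNodes) reflect my understanding of the Data Factory JSON for a managed integration runtime, and the size names Standard_D1_v2 and Standard_D4_v2 are assumed spellings of the D1 and D4 sizes mentioned above, so check both against the current documentation before relying on them. The function name ir_compute_fragment is hypothetical.

```python
import json


def ir_compute_fragment(node_size: str, node_count: int = 1) -> dict:
    """Build the compute portion of an Azure-SSIS integration runtime definition.

    Hypothetical helper: property names are assumed from the Data Factory
    managed integration runtime JSON and should be verified.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "type": "Managed",
        "typeProperties": {
            "computeProperties": {
                "nodeSize": node_size,          # VM size of each node
                "numberOfNodes": node_count,    # one node is enough for a single package
            }
        },
    }


# Assumed size names standing in for "D1" and "D4".
before = ir_compute_fragment("Standard_D1_v2")
after = ir_compute_fragment("Standard_D4_v2")
print(json.dumps(after, indent=2))
```

Only the node size differs between the two definitions, which matches the advice here: for a single slow package, scale the VM up rather than adding more nodes.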

These tips are for increasing the performance when running a single package. If we’re talking about running many packages in parallel, there are other options and considerations, possibly to be covered in another post.

I hope you found these quick pointers helpful. If you have questions about integration services, Azure Data Factory or how to best utilize the cloud in your organization, you’re in the right place. Click the link below or contact us – we’re here to help no matter where you are on your cloud journey.