Welcome back to part 2 of my mini-series on data governance. If you missed my first post in this series, click here. In this post, I'll cover master data management, data quality, and security as they relate to data governance.
Master Data Management
- Master Data Management (MDM) is a component of data governance within the modern data warehouse approach.
- For data to be trusted, it must be accurate. We need accurate data to report on so we can manage the business, grow sales, reduce costs, and streamline processes.
- MDM is a framework or methodology used to define, manage, and store the data in a logical way to have a single, trusted point of reference.
- It streamlines data sharing among personnel and departments.
- The benefits of master data management are many, and include:
  - Removing duplicates
  - Standardizing data
  - Incorporating rules to eliminate incorrect data
  - Creating an authoritative source of master data
- In the Microsoft world, we have Master Data Services (MDS). This is a great tool, made up of a standalone web server plus a database for storing the master records. It lets users connect through Excel or the web portal and gives them a single repository in which to store, modify, and add data as needed.
- MDS can be configured to manage any domain (products, customers, accounts) and includes hierarchies (for instance country, state, city, and zip code, or custom hierarchies based on the organization), granular security, data versioning (like audit trails), and business rules.
- Within MDS, developers can create models that contain multiple entities. An entity contains a set of data that can be maintained and supported by the Data Steward. This offloads much of the detailed maintenance work from the BI developers to the Data Steward. The Data Steward owns the process, so the BI developer can focus on more technology-related tasks.
- The Data Steward can manage changes using Excel, and all changes are tracked in the underlying database for an audit trail.
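To make the benefits listed above a bit more concrete, here is a minimal Python sketch of the ideas behind them: standardizing records, applying a business rule, removing duplicates, and ending up with one authoritative list. This is not how MDS works internally and it does not use any MDS API. The `Customer` fields, the `COUNTRY_ALIASES` map, the validation rule, and the "same name and country" match key are all assumptions I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str
    country: str


# Hypothetical alias map; in practice the Data Steward would maintain this.
COUNTRY_ALIASES = {"usa": "US", "united states": "US", "u.s.": "US", "us": "US"}


def standardize(record: Customer) -> Customer:
    """Trim whitespace, normalize name casing, and map country aliases."""
    raw_country = record.country.strip()
    country = COUNTRY_ALIASES.get(raw_country.lower(), raw_country.upper())
    name = " ".join(record.name.split()).title()
    return Customer(record.customer_id.strip(), name, country)


def is_valid(record: Customer) -> bool:
    """Assumed business rule: ID and name are required, country is 2 letters."""
    return bool(record.customer_id) and bool(record.name) and len(record.country) == 2


def build_master(records: list[Customer]) -> tuple[list[Customer], list[Customer]]:
    """Return (master, rejected) after standardizing, validating, and de-duplicating."""
    master: dict[tuple[str, str], Customer] = {}
    rejected: list[Customer] = []
    for raw in records:
        rec = standardize(raw)
        if not is_valid(rec):
            rejected.append(rec)
            continue
        # Assumed match key: same name and country; the first record seen wins.
        master.setdefault((rec.name.lower(), rec.country), rec)
    return list(master.values()), rejected


raw = [
    Customer("C1", "  acme   corp ", "usa"),
    Customer("C2", "Acme Corp", "United States"),
    Customer("C3", "", "US"),
]
master, rejected = build_master(raw)
```

In this example the first two rows collapse into a single master record for "Acme Corp" in "US", and the third row is rejected because it has no name. A real MDM process would use fuzzier matching and let the Data Steward review the rejects.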
Data Quality Services (DQS)
- DQS is a data quality product that allows you to build a knowledge base and use it to perform a variety of critical data quality tasks such as data correction, enrichment, standardization, and de-duplication.
- It also lets you cleanse data and provides profiling integrated into its data quality tasks, so you can analyze the integrity of your data.
- Another component is the data dictionary, where we can store tables, fields, field types, and descriptions. This information describes and defines the data, and it can be viewed in a Power BI report.
- In Azure, we have Azure Data Catalog, a fully managed cloud service. Azure Data Catalog lets users discover the data sources they need and understand the data sources they find. It also helps organizations get more value from their existing investments, and it allows any user (analyst, data scientist, or developer) to discover, understand, and consume data sources. Most important, it creates a place for a single source of truth.
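As a rough illustration of the profiling and data dictionary ideas above, here is a small Python sketch. It is not the DQS or Azure Data Catalog API. The `FieldEntry` structure, the `dim_customer` table name, and the three profile measures are assumptions chosen only to show the shape of the information.

```python
from dataclasses import dataclass


@dataclass
class FieldEntry:
    """One row of a data dictionary: where a field lives and what it means."""
    table: str
    field: str
    field_type: str
    description: str


def profile_column(rows: list[dict], column: str) -> dict:
    """Count total, missing (None or empty string), and distinct values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


# Hypothetical dictionary entry and sample data.
dictionary = [FieldEntry("dim_customer", "city", "nvarchar(50)", "Customer's billing city")]
rows = [{"city": "Jacksonville"}, {"city": ""}, {"city": "Jacksonville"}, {"city": "Tampa"}]
profile = profile_column(rows, "city")
```

For the sample rows, the profile reports 4 rows, 1 missing value, and 2 distinct cities. Numbers like these are a quick way to judge whether a column is complete enough to trust.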
Security
- Data must be secured to become an asset. Data lives in multiple places within an organization, so it's important to think through where and how to apply permissions, as well as how to secure your backups in the cloud. We must ensure the data is secure at each point along the trail: data warehouse processing, movement from one database to another, and reporting.
- Microsoft offers security features at different layers and at each hop along that trail.
- Row Level Security is easy to implement and allows each report viewer to only see the data they are authorized to see.
- Dynamic Data Masking gives us the ability to mask sensitive data based on rules set per user, per table or per column.
- Firewall rules let us secure the data at the server level and the database level, using the Azure portal, T-SQL, or PowerShell.
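To show the effect of Row Level Security and Dynamic Data Masking described above, here is a conceptual Python sketch. In SQL Server and Azure SQL these features are configured in the database itself, not in application code, so treat this only as an illustration of what the viewer ends up seeing. The `region` and `email` columns, the masking format, and the `can_unmask` flag are assumptions for the example.

```python
def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "****"
    return f"{local[0]}***@{domain}"


def visible_rows(rows: list[dict], user_region: str) -> list[dict]:
    """Row-level filter: a viewer only sees rows for their own region."""
    return [row for row in rows if row["region"] == user_region]


def rows_for_viewer(rows: list[dict], user_region: str, can_unmask: bool) -> list[dict]:
    """Apply the row filter first, then mask emails unless the viewer may see them."""
    result = []
    for row in visible_rows(rows, user_region):
        shown = dict(row)  # copy so the stored data is never modified
        if not can_unmask:
            shown["email"] = mask_email(shown["email"])
        result.append(shown)
    return result


# Hypothetical sample data.
rows = [
    {"region": "East", "email": "ann@example.com"},
    {"region": "West", "email": "bob@example.com"},
]
east_view = rows_for_viewer(rows, "East", can_unmask=False)
```

Here a viewer in the East region without unmask permission sees one row, with the email shown as `a***@example.com`. The underlying data is unchanged, which is the key property of masking: it changes what is displayed, not what is stored.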
Other things you should think about with security are the migration of code deployments (moving code from Dev to QA to Production); consistent naming conventions; audits, alerts, and monitoring; and source control.
DevOps is really gaining traction in the industry. There is a strong correlation between DevOps adoption and implementing a data governance program. DevOps has a lot of potential in Azure for continuous integration and continuous deployment. I believe DevOps will be a key part of any Azure project moving forward.
This wraps up my mini-series on data governance. A key point to note is that data governance is not a one-time project, but an ongoing process.
If you want to discuss data governance in more detail, or have questions about implementing it in your organization or about Azure in general, we're here to help. Click the link below to learn more about our cloud offerings or contact us to start a conversation today. Our team of experts is here to help you use your data and Azure to take your business from good to great.