Microsoft Purview is changing rapidly in the data governance space. It now pairs data value creation (offense) with the essential defense and response capabilities. This addition helps businesses address the problem that AI outputs are only as good as the quality of the data behind them.
Peter Aiken's new definition of data governance is 'Managing data decisions with guidance'.
Suma Manohar has written a great article about data quality in the era of AI. Microsoft Purview has introduced domains and data products, which add clear business context and terminology mapping. An enhanced search capability that uses Copilot to provide more understanding is available. Copilot can also help by suggesting data quality rules, and these autogenerated rules are context specific.
Data quality rules created manually in Purview should follow the 6 standard data quality metrics. The following rule types are available:
- Freshness – confirms that all values are up to date.
- Duplicate rows – checks rows to find repeated values across two or more columns.
- Empty/blank fields – looks for blank and empty fields in a column where there should be values.
- Unique values – confirms that values in a column are unique.
- Data type match – confirms that values in a column match data type requirements.
- String format match – confirms that text values in a column match a specific format or other requirements.
- Table lookup – confirms that a value in one table can be found in a specific column of another table.
- Custom – create a custom rule with the visual expression builder.
- Regular expressions can be used for pattern matching in the above.
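To make the rule types above concrete, here is a minimal sketch in plain Python of what each check is testing for. It does not call Purview or reproduce how Purview implements its rules; the sample rows, column names, regular expression, and function names are all hypothetical and chosen only for illustration.

```python
import re
from datetime import date

# Hypothetical sample data; the column names are illustrative only.
customers = [
    {"id": "1", "email": "ann@example.com", "country": "IE", "updated": "2024-05-01"},
    {"id": "2", "email": "", "country": "GB", "updated": "2024-05-02"},
    {"id": "2", "email": "bob@example", "country": "XX", "updated": "2023-01-15"},
]
valid_countries = {"IE", "GB", "US"}  # stands in for a column of another table


def empty_blank(rows, col):
    """A row passes when the column holds a non-blank value."""
    return [bool(str(r.get(col) or "").strip()) for r in rows]


def duplicate_rows(rows, cols):
    """A row passes when its combination of values in cols is not repeated."""
    keys = [tuple(r[c] for c in cols) for r in rows]
    return [keys.count(k) == 1 for k in keys]


def unique_values(rows, col):
    """A row passes when its value in a single column appears only once."""
    return duplicate_rows(rows, [col])


def data_type_match(rows, col, caster):
    """A row passes when the value can be converted with caster (e.g. int)."""
    def ok(value):
        try:
            caster(value)
            return True
        except (TypeError, ValueError):
            return False
    return [ok(r[col]) for r in rows]


def string_format_match(rows, col, pattern):
    """A row passes when the whole value matches the regular expression."""
    rx = re.compile(pattern)
    return [rx.fullmatch(str(r[col])) is not None for r in rows]


def table_lookup(rows, col, reference_values):
    """A row passes when the value exists in the reference column."""
    return [r[col] in reference_values for r in rows]


def freshness(rows, col, as_of, max_age_days):
    """A row passes when its ISO date is no older than max_age_days."""
    return [(as_of - date.fromisoformat(r[col])).days <= max_age_days for r in rows]


results = {
    "email not blank": empty_blank(customers, "email"),
    "id unique": unique_values(customers, "id"),
    "id + country not duplicated": duplicate_rows(customers, ["id", "country"]),
    "id is an integer": data_type_match(customers, "id", int),
    "email format": string_format_match(customers, "email", r"[^@\s]+@[^@\s]+\.[a-z]{2,}"),
    "country in lookup": table_lookup(customers, "country", valid_countries),
    "updated within 30 days": freshness(customers, "updated", date(2024, 5, 3), 30),
}

for name, passed in results.items():
    print(f"{name}: {sum(passed)}/{len(passed)} rows passed")
```

Each function returns one pass/fail flag per row, which keeps the checks independent of each other and easy to summarise per rule.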
When working on data quality, there are standard guidelines that can help. I start with the DAMA-DMBOK and then use the Data Management Capability Assessment Model (DCAM).
Scans are run to produce the quality score and trends shown in the data quality dashboard, and scores are also shown on the data product page.
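As a rough illustration of how per-rule results could roll up into a score and a trend, here is a small Python sketch. Purview calculates its own scores; the simple pass-rate and unweighted average used here, along with the function names and sample numbers, are assumptions made only to show the idea.

```python
def rule_score(passed_flags):
    """Percentage of rows that passed a single rule (assumed definition)."""
    if not passed_flags:
        return 0.0
    return 100.0 * sum(passed_flags) / len(passed_flags)


def overall_score(rule_results):
    """Unweighted average of the rule scores (an assumption, not Purview's formula)."""
    scores = {name: rule_score(flags) for name, flags in rule_results.items()}
    overall = sum(scores.values()) / len(scores) if scores else 0.0
    return scores, overall


# Hypothetical results from two scans of the same data asset.
previous_scan = {
    "email not blank": [True, False, False, True],
    "id unique": [True, True, False, False],
}
latest_scan = {
    "email not blank": [True, False, True, True],
    "id unique": [True, True, False, False],
}

_, previous_overall = overall_score(previous_scan)
latest_scores, latest_overall = overall_score(latest_scan)
trend = latest_overall - previous_overall  # positive means quality improved

print(latest_scores)
print(f"overall: {latest_overall:.1f} (change since last scan: {trend:+.1f})")
```

Comparing the overall figure between scans is what gives a trend line; a weighted average would be a reasonable alternative if some rules matter more than others.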
The rollout of the new solution across the regions is shared
here.