Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Thursday 25 April 2024

Data Governance, Data Ethics and Responsible AI video series

I wanted to share some thoughts on three of my favourite topics: Data Governance, Data Ethics and Responsible AI. There are many tools that help frame the subject area from a data management perspective, and there are useful Microsoft tools to help you down the Responsible AI and governance route. There is a wealth of information available, and I wanted each video, in under 5 minutes, to give people useful tips for moving forward in this important space. The result is an easily digestible, time-efficient series in which each episode stands alone but contributes to an overall theme.
  • Data Governance to help govern and manage data to improve trust and data quality
  • Data Ethics to help mitigate issues with data integrity and provenance
  • Responsible AI to look at bias, fairness and efficacy in decisions

Episode 1 Introduction

Episode 2 What is Data Governance

Episode 3 What is Data Ethics

Episode 4 What is Responsible AI

Episode 5 Responsible AI Tools Microsoft Standard v2

Episode 6 Responsible AI Tools Impact Assessment and guide

Episode 7 Responsible AI Tools HAX Toolkit

Episode 8 Responsible AI Tools Maturity Model

Episode 9 The EU Act

Episode 10 UK Government Assurance

Episode 11 Content Safety

Episode 12 Responsible AI Dashboard

Watch this space as the next set of videos will cover how this fits in with data quality and how Microsoft Purview can help with data preparation.

Wednesday 3 April 2024

Microsoft Purview Fabric announcements

A number of announcements were made at the Microsoft Fabric Community Conference, including the new Microsoft Purview for modern data governance. With businesses moving towards federated governance models, managed by line of business to provide more local understanding of increasing volumes of data, Microsoft have launched in Purview the capability for organizations to create subdomains to refine the way the data estate is structured in Fabric. Security has also become easier with the ability to set security groups for default domains.

Microsoft Fabric is now natively integrated with the Microsoft Purview Data Governance solution. There is a reimagined experience for the data estate governance practice, which includes data curation and an important new feature, data quality with insights. The new experience is available in preview from 8 April 2024. It aims to help accelerate measurable business value with key results, to simplify governance, and to improve efficiency through natural language recommendations.

Purview enables business terminology to be linked to:

  • Data Products (a collection of data assets used for a business function) 
  • Business Domains (ownership of Data Products) 
  • Data Quality (assessment of quality) 
  • Data Access, Actions 
  • Data Estate Health (reports and insights)

A really exciting new feature we have all been waiting for is the data quality capabilities. There is now a Data Quality model to set rules top down across business domains, data products and data assets. The model generates data quality scores at the asset, data product or business domain level from the policies on terms or rules. The scores show on the dashboard as red/yellow/green indicators. The 2 capabilities in this data quality model are:

  • Profiling—quick sample set insights 
  • Data quality scans—in-depth scans of full data sets
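To make the idea of scores rolling up from assets to data products to business domains more concrete, here is a minimal sketch in plain Python. It is not how Purview calculates its scores: the simple averaging, the 80/50 thresholds for the red/yellow/green indicator, and all of the names and data are assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical thresholds for the red/yellow/green indicator (assumed, not Purview's).
GREEN_MIN = 80.0
YELLOW_MIN = 50.0


def indicator(score):
    """Map a 0-100 score to a red/yellow/green label."""
    if score >= GREEN_MIN:
        return "green"
    if score >= YELLOW_MIN:
        return "yellow"
    return "red"


def asset_score(rule_results):
    """Score an asset as the percentage of its rules that passed."""
    return 100.0 * sum(rule_results) / len(rule_results)


def roll_up(domain):
    """Average asset scores into data product scores, then into a domain score."""
    product_scores = {}
    for product, assets in domain.items():
        product_scores[product] = mean([asset_score(r) for r in assets.values()])
    return product_scores, mean(list(product_scores.values()))


# Illustrative data only: data products -> assets -> rule pass/fail results.
sales_domain = {
    "Customer 360": {
        "customers_table": [True, True, True, True],
        "contacts_table": [True, True, False, False],
    },
    "Orders": {
        "orders_table": [True, False, False, False],
    },
}

product_scores, domain_score = roll_up(sales_domain)
for name, score in product_scores.items():
    print(f"{name}: {score:.1f} ({indicator(score)})")
print(f"Sales domain: {domain_score:.1f} ({indicator(domain_score)})")
```

The point of the sketch is only that a rule defined once at the top can produce a comparable score at every level below it, which is what makes a single traffic-light view possible.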

It is great to see that Microsoft Purview continues to align to the EDM Council set of 14 rules.

There is now an actions centre showing the current health, summarising actions by role, data product or business domain for governance. This actions centre aims to help improve governance posture for the business.

There is a partnership with Ernst & Young LLP, who will share playbooks and reports for US financial services customers on Azure Marketplace throughout the preview.


In summary, there is a shift away from traditional IT-centric data architecture to federated architectures such as data mesh. The automated way to deal with data quality is a game changer for business.

References

Announcements from the Microsoft Fabric Community Conference

Easily implement data mesh architecture with domains in Fabric

Introducing modern data governance for the era of AI 

The foundation for responsible analytics with Microsoft Purview

Watch: The Unified Data Platform for the Era Of AI | Microsoft Fabric Community Conference Day 1 Keynote

Crash Course in Microsoft Purview (azureedge.net)

Learning

Monday 1 April 2024

Responsible AI dashboard training

There is a new Microsoft Learn course on how to debug an AI model using the Responsible AI dashboard in Azure Machine Learning studio to ensure it performs responsibly and is less harmful. It is important to understand and learn how to use the dashboard to set any projects up for success.

Train a model and debug it with Responsible AI dashboard

The objectives are:

  • Create a responsible AI dashboard.
  • Identify where the model has errors.
  • Discover data over- or under-representation to mitigate biases.
  • Understand what drives a model outcome with explainability and interpretability.
  • Mitigate issues to meet compliance regulation requirements.
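As a feel for the level of Python involved, the objective on over- or under-representation can be sketched with the standard library alone. This is not the Responsible AI dashboard's API or its method; the equal-share baseline, the tolerance value, and the `age_band` data are all assumptions for illustration.

```python
from collections import Counter


def representation_report(records, feature, tolerance=0.5):
    """Flag groups whose share differs from an equal share by more than `tolerance`.

    The equal-share baseline is an assumption; a real project would normally
    compare against the population the model is meant to serve.
    """
    counts = Counter(row[feature] for row in records)
    total = sum(counts.values())
    expected = 1 / len(counts)
    report = {}
    for group, n in counts.items():
        share = n / total
        if share < expected * (1 - tolerance):
            status = "under-represented"
        elif share > expected * (1 + tolerance):
            status = "over-represented"
        else:
            status = "balanced"
        report[group] = (share, status)
    return report


# Illustrative training data only.
records = (
    [{"age_band": "18-30"}] * 70
    + [{"age_band": "31-50"}] * 25
    + [{"age_band": "51+"}] * 5
)

report = representation_report(records, "age_band")
for group, (share, status) in report.items():
    print(f"{group}: {share:.0%} {status}")
```

A check like this only tells you where the data is thin; the dashboard goes further by showing how the model actually performs on those groups.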

You do need to be able to understand beginner-level Python.