Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Saturday 22 January 2022

Data Toboggan this week

Data Toboggan is a cloud-born conference that aims to provide specialist training in Azure Synapse Analytics, in the belief that enabling the slope to advanced analytics is paramount in the current world of data. Like snowfall, it brings the need to adapt rapidly to change.

Some things to consider for the new year to kick start your learning plan.

Our third conference is this week. Please sign up, share and help make this the best conference yet for our amazing speakers, who are giving their time to help us all grow and expand our knowledge.

We look forward to welcoming you to our event.

Sunday 16 January 2022

Data Toboggan Piste Map

The Data Toboggan agenda is out. Take a journey with us down the slope that enables predictive analytics on 29 January. We have lots of amazing speakers for a full 12 hours of free content specialising in Azure Synapse. We look forward to seeing you there.

Register: https://bit.ly/DT22Register

Agenda: https://bit.ly/DT2022Agenda

Wednesday 12 January 2022

Data Governance with Azure Purview

SQLBits (8-12 March 2022) is fast approaching. I am really excited to be part of a session with Erwin de Kreuk and Wolfgang Strasser. The session: Data Governance with Azure Purview - Ask the Experts.

To submit your questions for the session in advance, please complete our Microsoft form.

Sunday 2 January 2022

Creating an AI Ethics panel

A data ethics council helps maintain an organization’s values-based intentions, and increases transparency into how we use Data & AI. This enables a focus on three areas for growth:

  • Establish governance for data ethics & AI and consider its importance for data collection and sharing.
  • Describe how and when fairness happens, and how and which biases have been accounted for.
  • Provide mechanisms for recourse.

Harvard Business Review has an article, "Create an Ethics Committee to Keep Your AI Initiative in Check".

Accenture have a summary page, "Building data and AI ethics Committee in your business".

Accenture also have a full report on building a data ethics committee.

Ethics Frameworks to help implementation

Four frameworks are covered below: the UK government data ethics framework, the Data Ethics Decision Aid (DEDA), the UK Statistics Authority ethics self-assessment tool and the Microsoft Responsible AI Framework.

The government data ethics framework has a self-assessment for the 3 overarching principles.

The self-assessment has 5 specific actions. A score lower than 3 requires review.

The areas covered by each action:

1 define public benefit and user need
  • understand unintended and/or negative consequences
  • human rights considerations
  • justify the benefit
  • make user needs and public benefit transparent
  • check everyone understands user need and how to use the data

2 involve diverse expertise
  • ensure diversity within your team
  • involve external stakeholders
  • effective governance structures with experts
  • transparency

3 comply with the law
  • compliance with GDPR and DPA 2018
  • data protection by design
  • accountability
  • transparency
  • project compliant with the Equality Act 2010
  • ensure effective governance of your data

4 review the quality and limitations of the data
  • data source being used
  • metadata understood
  • processes to maintain integrity
  • is synthetic data appropriate for the project, evidence-based caveats
  • bias in data to train the model
  • determine proportionality
  • data anonymisation
  • robust practices - demonstrated reproducibility, quality of the model
  • make data open and shareable whenever possible
  • think about transparency of sensitive models
  • explainability

5 evaluate and consider wider policy implications
  • repeatability
  • project influences
  • accountability structures
  • skills, training and maintenance for longevity of the project
  • share learnings
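
The self-assessment described above scores five actions and flags any score lower than 3 for review. A minimal sketch of that check in Python follows. The scores are made up for illustration, and the function name and the numeric scale are assumptions rather than part of the framework itself.

```python
# Threshold taken from the text: a score lower than 3 requires review.
REVIEW_THRESHOLD = 3


def actions_needing_review(scores, threshold=REVIEW_THRESHOLD):
    """Return the actions whose score is below the threshold, in input order."""
    return [action for action, score in scores.items() if score < threshold]


# Hypothetical scores for the five specific actions (illustrative only).
example_scores = {
    "define public benefit and user need": 4,
    "involve diverse expertise": 2,
    "comply with the law": 5,
    "review the quality and limitations of the data": 3,
    "evaluate and consider wider policy implications": 1,
}

for action in actions_needing_review(example_scores):
    print(f"Review needed: {action}")
```

A score of exactly 3 is not flagged here, because the text says only scores lower than 3 require review.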

The 3 areas summary

Another model is the Data Ethics Decision Aid (DEDA). DEDA is a toolkit that facilitates initial brainstorming sessions to map ethical issues in data projects, documents the deliberation process and furthers accountability towards the various stakeholders and the public.

The Ethics Self-Assessment Tool from the UK Statistics Authority is another framework. The self-assessment tool provides a timely means to identify ethical issues, shape future discussions and support an accurate and consistent estimation of the “ethical risks” of research proposals.

The Microsoft Responsible AI Framework looks at 6 principles:

  • Fairness - should treat all people fairly
  • Reliability & Safety - should perform reliably and safely
  • Privacy & Security - should be secure and respect privacy
  • Inclusiveness - should empower everyone and engage people
  • Transparency - should be understandable
  • Accountability - people should be accountable for AI systems

Saturday 1 January 2022

Data as an asset

I just read an interesting article about the recipe for success in handling data assets. It describes data as an essential factor for business agility and an enabler of competitive advantage. Data is an asset in its own right, and organizations must change how data is viewed at a strategic level. Gartner and Accenture talk about data as the essential focus and ingredient. These assets become valuable once actionable insight can be derived. The article sets out 15 mantras for implementing data as an asset:

  1. Define your Data Strategy with tangible measurable metrics linked to business outcomes with a data architecture blueprint and executable roadmap.
  2. Disrupt business models with AI
  3. Establish the right Data Culture and Architecture with accountability, data curation and data quality competency, frictionless trusted data supply with embedded data fluency across the business and a data taxonomy and dictionary.
  4. Implement DataOps to infuse life into your data with data acquisition and management connecting data creators with data consumers.
  5. Establish Tech Intensity initiatives for Data-Fluency enablement by setting baselines for data literacy skills, resulting in data fluency.
  6. Establish Data Signals and Patterns Repository
  7. Establish Data Marketplace – for Data sharing and sourcing across ecosystems reviewing the data supply chain and data monetization strategy
  8. Use AI and ML Algorithms
  9. Democratize Data – secure data access, provide the correct type of BI and BI tools, and make data visualization more transparent, intuitive and contextualised.
  10. Data Governance to produce trusted data with data lineage, managed data quality, business metadata and data profiling with risk and privacy policies for compliance.
  11. Establish Data Ethics principles covering things such as transparency, traceability and explainability
  12. Data Observability, being the understanding of the health of the data in the system. The data observability pillars are freshness and velocity, distribution, volume, schema, lineage, and data security & compliance.
  13. Define Data security and compliance controls
  14. Hire the right Data Engineering and AI talent
  15. Establish a Chief Data Officer and office of CDO

The article finishes by stating that data fluency and empowerment will be the determining success factors in a data-literate world.