Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous; the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Sunday, 16 January 2022

Data Toboggan Piste Map

The Data Toboggan agenda is out. Take a journey with us down the slope that enables predictive analytics on 29 January. We have lots of amazing speakers for a full 12 hours of free content specialising in Azure Synapse. We look forward to seeing you there.

Register: https://bit.ly/DT22Register

Agenda: https://bit.ly/DT2022Agenda



Wednesday, 12 January 2022

Data Governance with Azure Purview

SQLBits, 8-12 March 2022, is fast approaching. I am really excited to be part of a session with Erwin de Kreuk and Wolfgang Strasser. The session: Data Governance with Azure Purview - Ask the Experts

To submit your questions for the session in advance, we have a Microsoft form for you to complete.



Sunday, 2 January 2022

Creating an AI Ethics panel

A data ethics council helps maintain an organization’s values-based intentions, and increases transparency into how we use Data & AI. This enables a focus on three areas for growth:

  • Establish governance for data ethics & AI and consider the importance for data collection and sharing.
  • Describe how/when fairness happens and how/what biases have been accounted for.
  • Provide mechanisms for recourse.

Harvard Business Review has an article, Create an Ethics Committee to Keep Your AI Initiative in Check.

Accenture have a summary page, Building data and AI ethics Committee in your business.

Accenture have a full report on building a data ethics committee. 

Ethics Frameworks to help implementation

Four frameworks are covered below: the UK government data ethics framework, the Data Ethics Decision Aid (DEDA), the UK Statistics Authority ethics self-assessment tool and the Microsoft Responsible AI Framework.

The government data ethics framework has a self-assessment for the 3 overarching principles.


The self-assessment has 5 specific actions. A score lower than 3 requires review.


The areas covered under each action:

1 define public benefit and user need
  • understand unintended and/or negative consequences
  • human rights considerations
  • justify the benefit
  • make user needs and public benefit transparent
  • check everyone understands the user need and how to use the data

2 involve diverse expertise
  • ensure diversity within your team
  • involve external stakeholders
  • effective governance structures with experts
  • transparency

3 comply with the law
  • compliance with GDPR and DPA 2018
  • data protection by design
  • accountability
  • transparency
  • project compliant with the Equality Act 2010
  • ensure effective governance of your data

4 review the quality and limitations of the data
  • data source being used
  • metadata understood
  • processes to maintain integrity
  • is synthetic data appropriate for the project; evidence-based caveats
  • bias in data to train the model
  • determine proportionality
  • data anonymisation
  • robust practices - demonstrated reproducibility, quality of the model
  • make data open and shareable whenever possible
  • think about transparency of sensitive models
  • explainability

5 evaluate and consider wider policy implications
  • repeatability
  • project influences
  • accountability structures
  • skills, training and maintenance for longevity of the project
  • share learnings

The 3 areas summary

Another model is the Data Ethics Decision Aid. DEDA is a toolkit that facilitates initial brainstorming sessions to map ethical issues in data projects, documents the deliberation process and furthers accountability towards the various stakeholders and the public.



The Ethics Self-Assessment Tool from the UK Statistics Authority is another framework. The self-assessment tool provides a timely means to identify ethical issues, shape future discussions and support an accurate and consistent estimation of the “ethical risks” of research proposals.

The Microsoft Responsible AI Framework looks at 6 principles:

Fairness - AI systems should treat all people fairly
Reliability & Safety - AI systems should perform reliably and safely
Privacy & Security - AI systems should be secure and respect privacy
Inclusiveness - AI systems should empower everyone and engage people
Transparency - AI systems should be understandable
Accountability - People should be accountable for AI systems

Saturday, 1 January 2022

Data as an asset

I just read an interesting article about the recipe for success in handling data assets. It talks about data as an essential factor for business agility and says that it enables competitive advantage. Data is an asset in its own right and organizations must change how data is viewed at a strategic level. Gartner and Accenture talk about data as the essential focus and ingredient. These assets become valuable once actionable insight can be derived. The article sets out 15 mantras for implementing data as an asset:

  1. Define your Data Strategy with tangible measurable metrics linked to business outcomes with a data architecture blueprint and executable roadmap.
  2. Disrupt business models with AI
  3. Establish the right Data Culture and Architecture with accountability, data curation and data quality competency, frictionless trusted data supply with embedded data fluency across the business and a data taxonomy and dictionary.
  4. Implement DataOps to infuse life into your data with data acquisition and management connecting data creators with data consumers.
  5. Establish Tech Intensity initiatives for Data-Fluency enablement by setting baselines for data literacy skills resulting in data fluency
  6. Establish Data Signals and Patterns Repository
  7. Establish Data Marketplace – for Data sharing and sourcing across ecosystems reviewing the data supply chain and data monetization strategy
  8. Use AI and ML Algorithms
  9. Democratize Data – Secure Data Access and the correct type of BI and BI tools and make data visualization more transparent, intuitive and contextualised.
  10. Data Governance to produce trusted data with data lineage, managed data quality, business metadata and data profiling with risk and privacy policies for compliance.
  11. Establish Data Ethics principles covering things such as transparency, traceability and explainability
  12. Data Observability, being the understanding of the health of the data in the system. The data observability pillars are freshness and velocity, distribution, volume, schema, lineage and data security & compliance.
  13. Define Data security and compliance controls
  14. Hire the right Data Engineering and AI talent
  15. Establish a Chief Data Officer and an office of the CDO

The article finishes by stating that Data Fluency and empowerment will be the determining success factors in a data-literate world.

Wednesday, 22 December 2021

Open Data Campaign

Microsoft talk about closing the data divide and the need for open data, thus helping remove barriers to data innovation. The data divide progress report can be read here.
Microsoft launched five data collaboration principles in connection with this. 

Open: We will work to make data relevant to important social problems as open as possible, including by contributing open data ourselves.

Usable: We will invest in creating new technologies and tools, governance mechanisms, and policies to make data more usable for everyone
 
Empowering: We will help organizations generate value from their data according to their choices and develop their AI talent to use data effectively and independently

Secure: We will employ security controls to ensure data collaboration is operationally secure where it is desired.

Private: We will help organizations protect individuals’ privacy in data sharing collaborations that involve personally identifiable information

Sunday, 19 December 2021

Data Toboggan - our year 2021

We have had a very busy first year and enjoyed it all. Thank you to all who made our conferences possible: organisers, speakers, attendees, our logo designer and those who helped share our event.

We are back on 29 January 2022.

Call for Speakers https://bit.ly/DT22CFS 

Register https://bit.ly/DT22Register






Wednesday, 15 December 2021

Azure Purview Dataset provisioning by data owner for Azure Storage (preview)

 


To enable access policy enforcement for an Azure Storage account, the following PowerShell commands need to be run in the subscription where the Azure Storage account resides. The registration applies to all Azure Storage accounts in that subscription.

# Install the Az module
Install-Module -Name Az -Scope CurrentUser -Repository PSGallery -Force
# Log in to the subscription
Connect-AzAccount -Subscription <SubscriptionID>
# Register the feature
Register-AzProviderFeature -FeatureName AllowPurviewPolicyEnforcement -ProviderNamespace Microsoft.Storage

Note: Only new Storage accounts, created in the subscription after the feature AllowPurviewPolicyEnforcement is registered, will comply with access policies published from Purview.
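
Feature registration is not always immediate, so it can help to check its state before creating new Storage accounts. The following is a minimal sketch rather than part of the original steps. It assumes the Az module is installed and that the session is still connected to the same subscription as in the commands above.

```powershell
# Query the current registration state of the feature in the connected subscription.
# Assumption: Connect-AzAccount has already been run against the target subscription.
Get-AzProviderFeature -FeatureName AllowPurviewPolicyEnforcement -ProviderNamespace Microsoft.Storage
```

The output includes a RegistrationState value. Once it reads Registered, Storage accounts created in the subscription from that point on should be the ones covered by the note above.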