Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Monday, 4 October 2021

Let innovation drive world Azure Synapse Day 2021

Want some more information and background about our conference on 12 November 2021? One of our organisers has written a great post explaining it here.



You can read it here as well

So, while there’s been all this stuff going on, a couple of data friends and I started a conference series and a user group both focused on Azure Synapse Analytics. Because, you know.. you can never have too many online conferences, training sessions, and presentations, right ?

We’ve probably all attended online learning sessions recently, been to meetings online, chatted with friends and family over new and sometimes unfamiliar apps like Teams and Zoom (yes, the nerds have been using them for a while, but collab apps just went *wild* recently).

But we (as in me and the aforementioned data friends) wanted to do something different. Something more…. interactive. More…. conversational. Less “I’ll sign up and download the recordings later, maybe”. And so, somewhat conceitedly, we came up with ‘World Azure Synapse Day 2021’, where we can share, chat, learn, and laugh (maybe) to round out the year. A sort of retrospective on the future, if you like.

Full disclosure: we didn’t come up with the format ourselves. A few months back, a company called DataStax did something similar for the worldwide Apache Cassandra community where they had short talks from the engineering team, the sales people, their marketing. We’re aiming for something waaaayyyy less corporate, and more about you 🙂 We just borrowed the good bits and left out the sales pitch.

So we’re doing three sessions on the day, timed to hopefully coincide with people’s availability in each of APAC, EMEA, and AMER time-zones. We hope it will feel a bit more like a meeting than a presentation, as there’ll be several short talks and interviews in each. The sessions will *not* be recorded (sorry, you’re going to have to be there to see it !) – it’s really important to us that each session flows and everyone can participate, and recording that doesn’t seem fair. We’re also not even publishing an agenda, just a schedule of who will be talking at each session – we want the speakers to speak for themselves in every way possible – so there’s no pressure on anyone to ‘get it right first time’, and each and every contribution is welcome.

If you want to attend (and we hope you do), please sign up to the MeetUp for Data Toboggan here. There’s also a code of conduct under the heading ‘Be Excellent To Each Other’ that we’d ask you to adhere to, but that’s all you have to do. If you want to have your voice heard, just submit some thoughts through Sessionize, and we’ll be in contact. And don’t hesitate to ask any question you like in the MeetUp group – we’ll make sure you get a response 🙂

Hope to see you at Data Toboggan – World Azure Synapse Day 2021 ! Until then, stay safe.

The Data Toboggan Team

Wednesday, 29 September 2021

Azure Purview Generally Available

The digital event "Maximize the Value of Your Data in the Cloud: Achieve Unified Data Governance with Azure Purview", with Rohan Kumar, Corporate Vice President of Azure Data, and Mike Flasko, General Manager of Azure Data Governance Platform, on 28 September 2021 was an exciting event. It enables the new, reimagined, agile data governance world to move forward.

The event explored Azure Purview, a unified data governance solution that gives you a holistic, up-to-date map of your entire data estate. The general availability of Azure Purview was announced, bringing improved automated governance to the fore.


The product launched with an area in preview. 
There is automated data discovery using Purview data scans, supporting hybrid data landscapes with classification and lineage. There are 200+ data classifiers with 35 data sources. The data map graph describes the data assets and relationships across the estate with fine-grained access controls. The lineage feature is really important as it enables root cause analysis and makes data lineage available via visualisation. Being able to search, browse and curate data enables you to gain more understanding of your data.

Business context for data is important, and a business glossary enables relevant business terms to be connected. There is hierarchy support and an integrated approval workflow.

There is a large set of connectors currently available that enable data scanning, and there is a public preview of more data scanning options for Google Cloud, Erwin, Salesforce, IBM DB2 and Cassandra. Data scanning options coming soon are Snowflake, SAP HANA, PostgreSQL, MongoDB and MySQL.

Azure Purview is growing in functionality all the time and Purview Data Insights is in preview. This looks at 
  • data asset distribution 
  • sensitive data
  • data scanning coverage 
  • business glossary utilisation
The insights capability is useful for CDOs to gain a high level picture of the data estate.

A glimpse of the exciting roadmap to come was also shared.  


The Azure Purview features are



There are lots of resources already available

Azure Purview overview 

Introduction to Azure Purview Microsoft Learn to start your learning journey.

Official Purview blog for updates

Billing for Azure Purview will start on 1st November 2021. Pricing 

Announcements are detailed here

Customer Stories shared

Microsoft Customer Story-Danish pump manufacturer develops sustainable water solutions with unified data governance from Azure Purview

Microsoft Customer Story-Heathrow boosts operational efficiency and improves decision making with Azure Purview

Microsoft Customer Story-illimity optimizes data governance and streamlines compliance with Azure Purview

Sunday, 26 September 2021

National AI Strategy

The National AI Strategy was released in September 2021. It is a 10 year plan to transform and reshape our society.

The 3 aims are to

  • Invest and plan for the long-term needs of the AI ecosystem
  • Support the transition to an AI-enabled economy
  • Ensure the UK gets the national and international governance of AI technologies right

The document contains a roadmap and details of the pillars. It has 3 pillars

Pillar 1 Investing in the long-term needs of the AI ecosystem

Pillar 2 Ensuring AI benefits all sectors and regions

Pillar 3 Governing AI effectively



Central Digital and Data Office (CDDO)

The CDDO has been created within the Cabinet Office to consolidate the core policy and strategy responsibilities for data foundations. They will work with partners to improve government’s use and reuse of data to support data-driven innovation across the public sector.

The UK's National AI Strategy



Microsoft Research Summit 2021

 
The Microsoft Research Summit is open to everyone! Running October 19 - 21 with over 150 sessions across 16 tracks, it provides the global research community with an opportunity to learn from experts pushing the frontiers of technology. Register now: https://aka.ms/AAdv93n The event will start in three broadcast regions (China Standard Time, British Summer Time, and Pacific Time). Microsoft say:

For 30 years, our research community at Microsoft has worked across disciplines, institutions, and geographies to envision and realize the promise of new technologies for Microsoft and for society. Today, we’re inviting the global science and technology community to continue this exploration—because ensuring that future advancements benefit everyone is up to all of us.

Join us at the inaugural Microsoft Research Summit, streaming virtually across three time zones. You’ll have the opportunity to hear from science and technology leaders from around the world—people who are driving advances across the sciences and pushing the limits of technology toward achieving a meaningful impact on humanity.

They want to build a place where research considers sustainability, ethics and diversity, and is inclusive of everyone. There are some really interesting topics under discussion.

Friday, 24 September 2021

Data Toboggan World Azure Synapse Day

We are back for another edition of Data Toboggan, but we'd like to do something a bit different this time round.

In the last year, Azure Synapse has celebrated its second birthday, and we've all been busy doing awesome stuff with the capabilities of the platform and loving every minute.

No ? Doesn't sound like your experience ? Or maybe we're bang on the money and it's been awesome ? So, tell us about it !

We want *you* to tell us how Azure Synapse Analytics inspires you, empowers you, and how it accelerates your business analytics. Want to tell us of a not-so-epic experience instead ? Yep, come and do that. Tell us your fails and wins, and what you're hoping for in the coming 12 months of Azure Synapse...

So, this next edition of Data Toboggan won't be a full-day, tech-focussed event.

Instead, we're planning on doing three 1 to 2 hour community sessions where we listen to you tell your story in a more open format.

We're not looking for long sessions - ideally each speaker would get 10 to 15 minutes to share and discuss. We're targeting Friday 12th November as the delivery date, with 3 sessions running at locally convenient times for APAC, EMEA and AMER (times are GMT, because that's where we are, and we hope they translate to 'convenient').

APAC - 08:00 GMT

EMEA - 12:00 GMT

AMER - 17:00 GMT
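
If you want to check what those GMT times mean locally, a minimal sketch is below. The example cities and their fixed UTC offsets for 12 November 2021 are my own assumptions for illustration and are not part of the announcement; swap in your own offset.

```python
from datetime import datetime, timedelta, timezone

# Session start times on 12 November 2021 in GMT, as listed above.
SESSIONS_GMT = {
    "APAC": datetime(2021, 11, 12, 8, 0, tzinfo=timezone.utc),
    "EMEA": datetime(2021, 11, 12, 12, 0, tzinfo=timezone.utc),
    "AMER": datetime(2021, 11, 12, 17, 0, tzinfo=timezone.utc),
}

# Assumed example cities with their fixed UTC offsets on that date
# (illustrative only, not taken from the announcement).
EXAMPLE_OFFSETS = {
    "APAC": ("Sydney", timedelta(hours=11)),
    "EMEA": ("London", timedelta(hours=0)),
    "AMER": ("New York", timedelta(hours=-5)),
}


def local_start(region: str) -> str:
    """Return the session start time for a region in its example city."""
    city, offset = EXAMPLE_OFFSETS[region]
    local = SESSIONS_GMT[region].astimezone(timezone(offset))
    return f"{region}: {local:%H:%M} in {city}"


for region in SESSIONS_GMT:
    print(local_start(region))
```

Fixed offsets keep the sketch dependency-free; for anything beyond a one-off check a proper time zone database would be the safer choice.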

Look out for submission links heading your way soon ! You can submit to talk at any of the sessions - whichever is most convenient for you.

Thanks for being with us this year. We hope you'll enjoy sticking around for a while.

Stay safe,

The Data Toboggan Team. 

 

Thursday, 23 September 2021

Big Data LDN 2021 - Data Governance and the essential CDO

Big Data LDN has been running 22-23 September. There have been some really interesting sessions covering data governance and the CDO role.

Robin Sutara spoke on the analytics paradox - balancing competing demands for agility and governance. Key messages: think big, no single technology is a magic bullet, and learn to fail fast.

Start small with one business problem and build upon that. The data quality will adapt and improve to meet the needs of each problem being addressed. It is people and culture that change a business. We need to think beyond the technology.

 

Great session, with an important thought for the future: stop talking architecture and focus on business value. It goes beyond economics and needs sustainability.

The "Data is a hot mess, so let's cook" session from Ben Schein made a key point: data governance is never done, so always be open to new ideas. Data governance is a team sport. You need to find a balance between innovation and experimentation, data teams and consistency, usability and scale. He finished with some important areas to consider:

  • keep data definitions consistent
  • don't wait for the perfect data set to arrive
  • data quality is essential for good insights
  • let people go shopping for the data they need
  • there is no data magic wand
  • data governance is a team sport

Another very insightful session from Cindi Howson on data-driven culture: is your organisation a laggard or leader, gave perspectives on the biggest challenges to being data driven. 

The biggest challenge reported was culture (67%), followed by talent and people (22%).
 
To disrupt your culture bring in a change agent, identify relevance (WIIFM) and organise for collaboration.

She talked about CDOs having a short tenure in organizations, averaging 2.5 years. There is an interesting article about that, Why Do Chief Data Officers Have Such Short Tenures? by Tom Davenport, Randy Bean, and Josh King.

WIIFM (what's in it for me) matters for communication, incentives, skills, tribes and role models.

To move forward into this new world there needs to be a transition in organizational design, from the traditional BI centre of excellence to an embedded design.



Centralize for economies of scale for common data, infrastructure and specialist talent but decentralize when business domain experts are essential to analytics workflow. Ensure that cross boundary communication is ongoing for best practice, synergies and career management. We are looking at a new hybrid model that is transparent and optimized.

She finished by sharing that there is a Data Chief community (TheDataChief.com). It has podcasts, a blog, roundtables and a community newsletter.



The "What we learned from 400 data leaders in CDO summer school" session by Carruthers and Jackson highlighted:
  • Data strategy is multi-dimensional, not linear
  • Governance is being revised but you need to tell the story. It is still a problem, and its purpose needs communicating through strategy.
  • They had Scott Taylor talk about data storytelling - keep it simple
  • Risk - listen, listen and listen more 
  • Soft skills are important
  • Data Literacy is a spectrum.
  • Nurture your community
  • Play to your strengths and culture is the biggest hurdle
  • Keep policy simple
  • There should be a call to arms on Ethics
  • Methods can be disrupted, innovative and evoke change
  • It is not about technology outcomes

In the summer school they discussed the DIKW Pyramid which represents the relationships between data, information, knowledge and wisdom



Exasol have published the journey to the CDO. 

Tuesday, 14 September 2021

Data Governance Podcast

 

It was amazing to take part in my first podcast about the benefits of Data Governance and how to get started in Coeo Conversations with Justin Langford.  

The data field is such an exciting place to be at the moment. Data Governance is more than just compliance, it is about managing the whole ecosystem. I thought I would run out of things to say talking for 30 minutes in the podcast on governance, however that was not the case. I hope you find the podcast interesting, informative and fun.

Looking forward, there is an exciting digital event coming up: Maximize the Value of Your Data in the Cloud: Achieve unified data governance with Azure Purview. The event is on Tuesday, September 28, 2021 | 9:00 AM-10:00 AM Pacific Time (UTC-7). Register here

Monday, 13 September 2021

Data Platform Virtual Summit 2021 Keynote


The Data Platform Summit has started. An excellent session covering the Azure data stack identifying when to use each tool and the key innovations with SQL. It was nice to see an explanation of when to use what tool. With so many tools now it becomes hard to know which is the best choice.  This useful chart was shown.



Azure SQL Edge



SQL Server 2019

Solves the modern data challenges 
  • Data virtualization and big data clusters
  • Modern platforms with compatibility
  • Built-in machine learning and extensibility
  • Intelligent performance
  • Layers of security and compliance
  • Business critical availability

SQL Linux/Container

Containers are portable and can run anywhere containers are supported. They are lightweight with reduced disk, CPU and memory footprint. They have a consistent image of SQL Server, scripts and tools and are efficient with faster deployment, no patching required and less downtime. 

Azure SQL

Compared with SQL Server, Azure SQL PaaS provides
  • business continuity, high availability, automated backups, long term backup retention , geo-replication
  • scale, advanced security, version-less, built in monitoring and built-in intelligence. 

There are Azure SQL Editions for general purpose, business critical and hyperscale



Azure Arc
Bring Azure data services to on-premises, multi cloud and edge with Azure Arc. Azure Arc enabled SQL Managed Instance has many advantages.




The cloud adds value by providing additional tools such as

  • Azure Defender to protect your data
  • Azure SQL Database Ledger for blockchain
  • Telemetry across all your assets with Azure Monitor SQL Insights

Then to help with migration there are tools available

  • Azure Migrate to discover and assess your SQL Server assets
  • Migrate Inline with Azure Data Studio
  • Migrate online with Azure Database Migration Service or Log Replay Service

A few tools were not mentioned that form part of the data suite such as Azure Cosmos DB. The expansion of tools and options has grown significantly over the last few years so it is always good to assess what business objective you are trying to achieve and select the right tool.
 

Friday, 3 September 2021

Data Strategy: where are we and what is the answer to the ultimate question

Originally published here

Do we know where we are going? Have we asked the right questions? Without a roadmap, we will not arrive at our destination. The first step relies on discovering where we are, what we need to be successful and where we need to go. We need to create a roadmap to enable a path forward. With that roadmap, there is a need to assign owners of tasks throughout the data journey. Data Strategy is a top-down approach closely aligned with business strategy.

Gartner define a 'Data Strategy' as a highly dynamic process employed to support the acquisition, organization, analysis, and delivery of data in support of business objectives. DAMA, by contrast, defines Data Management as the development, execution, and supervision of plans, policies, programs, and practices that deliver, control, protect, and enhance the value of data and information assets throughout their lifecycles.

Many organisations do not have data strategies in place, although they may be working on areas that would sit under that umbrella. Deciding what the core data principles are can help an organisation quickly adapt to a data-centric culture.


As an example a set of data principles could be: 

  • All data is owned, managed, secured, and governed
  • Data is managed throughout its lifecycle
  • Data is available and visible whenever needed
  • Information is an asset
  • Use a data catalogue for visibility
  • Data is fit for purpose and meets the business need
  • There is a single version of the truth
  • Data skills training for people to use data effectively
  • Data ethical standards are followed

 

Before any type of strategy is created, a key business stakeholder must champion the idea and a person must be identified to own the strategy, such as the CDO. The data strategy should be maintained and enacted through the data governance team and other working groups.

Data strategy is a framework that is built around the data to amalgamate the assets to create a source of trusted data to allow process efficiencies, increase confidence in the data and create opportunities for innovation. The Data Management strategy could be aligned with the Data Management Association and the DAMA Body of Knowledge (DMBOK), to enable consistent practices and verifiable decision making.

It is important to have an agile data strategy: create a short-term strategy first, so the business can gain an immediate benefit, then work on a longer-term target strategy once the gap analysis is complete and strategic imperatives are identified. A couple of core areas to also review are data governance and data ethics, alongside data culture and data skills. The technology side for data collection, data storage, data processing and data output may need updating, but if a need exists for technological change, it will be due to the alignment of business and data strategies and identification throughout the process. To enable that agile approach to data strategy, using a Boston matrix with the MoSCoW prioritization technique is very successful.
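
As a rough illustration of the MoSCoW part of that approach, here is a minimal sketch that orders a backlog by MoSCoW category. The initiative names, the value and effort scores, and the tie-breaking rule are all hypothetical; the Boston matrix itself is not modelled here.

```python
from dataclasses import dataclass

# MoSCoW categories in priority order: Must, Should, Could, Won't.
MOSCOW_ORDER = {"Must": 0, "Should": 1, "Could": 2, "Won't": 3}


@dataclass
class Initiative:
    name: str
    moscow: str  # one of the MOSCOW_ORDER keys
    value: int   # assumed score, 1 (low) to 5 (high)
    effort: int  # assumed score, 1 (low) to 5 (high)


def prioritise(items):
    """Sort by MoSCoW category, then highest value, then lowest effort."""
    return sorted(items, key=lambda i: (MOSCOW_ORDER[i.moscow], -i.value, i.effort))


# Hypothetical backlog, for illustration only.
BACKLOG = [
    Initiative("Data catalogue pilot", "Must", value=4, effort=2),
    Initiative("Data skills training", "Should", value=3, effort=3),
    Initiative("Platform re-architecture", "Could", value=5, effort=5),
    Initiative("Data ownership register", "Must", value=5, effort=3),
]

for item in prioritise(BACKLOG):
    print(item.moscow, "-", item.name)
```

The short-term strategy would then be drawn from the top of that list, with the rest feeding the longer-term target.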


DAMA lists deliverables from strategic planning as:

  • Data Management Charter (vision, business objectives, guiding principles, success measures, risks, operating model. A business plan to use the information to create competitive advantage and to support enterprise goals).
  • Data Management Scope Statement (goals and objectives for planning, organisation roles, responsibilities clarified)
  • Data Management Implementation Roadmap (programs, projects and tasks, road map and milestones). Requires a data management program strategy: a plan for maintaining and improving data quality, integrity, access and security, and for mitigating risks.

Taking all of this into account, and using systems thinking to gain that holistic view, there are three areas that should be covered for success: business data strategy, IT data strategy and operational data strategy.

If you haven’t started creating a data strategy or already have one, it is worth reviewing the current state to ensure an agile actionable plan is in place for continuous improvement.

Friday, 20 August 2021

Maximize the value of your data with Azure Purview

There is a digital event coming on 28 September 2021, 9-10am Pacific Time. The event is about achieving unified data governance with Azure Purview.
Join Microsoft Corporate Vice President Rohan Kumar at this free digital event for demos and deep dives.
Register now to:
  • Learn to create a comprehensive, automated map of all your data.
  • See how Azure Purview works with Azure Synapse Analytics, Power BI, and the rest of your data estate to deliver timely, reliable insights.
  • Watch in-depth demos of product features including Azure Purview Data Map and Data Catalog.
  • Ask Azure experts your data governance questions in the live Q&A.


 

Thursday, 19 August 2021

Azure Purview August updates

There have been some exciting changes to Azure Purview announced. These changes relate to charging and permissions. 

Elastic data maps

The data map is the foundation for data discovery. The data map has two components: the throughput created by CRUD operations and the storage of the metadata.

The data map can now grow elastically, starting at one capacity unit. A capacity unit includes a throughput of 25 operations/sec and the metadata storage scales in increments of 2 GB. Purview Data Map can automatically scale up and down within limits. This new charging model makes Purview much more user friendly and less costly to set up and run.
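
To get a feel for the sizing, here is a minimal sketch using only the two figures quoted above (25 operations/sec per capacity unit, 2 GB storage increments). How throughput and storage combine on the bill is not covered in this post, so the two are calculated separately; the function names are my own.

```python
import math

OPS_PER_CAPACITY_UNIT = 25  # operations/sec per capacity unit (figure quoted above)
STORAGE_INCREMENT_GB = 2    # metadata storage step in GB (figure quoted above)


def capacity_units_for_throughput(peak_ops_per_sec: float) -> int:
    """Capacity units needed to cover a peak throughput, with a minimum of one."""
    return max(1, math.ceil(peak_ops_per_sec / OPS_PER_CAPACITY_UNIT))


def storage_increments(metadata_gb: float) -> int:
    """Number of 2 GB storage increments needed, with a minimum of one."""
    return max(1, math.ceil(metadata_gb / STORAGE_INCREMENT_GB))


print(capacity_units_for_throughput(60))  # 60 ops/sec needs 3 capacity units
print(storage_increments(5))              # 5 GB of metadata needs 3 increments
```

Check the pricing page for the actual limits and rates before relying on numbers like these.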

Access Control in Azure Purview

Note: this only applies to Purview accounts created on or after 18 August 2021.

A collection is a tool to group assets, sources, and other artifacts into a hierarchy for discoverability and to manage access control. Collections are used to organise and manage assets.

There are various roles that exist:

  • Collection admins - can edit Purview collections, their details, and add sub collections. They can also add users into other Purview roles on collections where they're admins.
  • Data source admins - can manage data sources and data scans.
  • Data curators - can create, read, modify, and delete catalog data assets and set up relationships between assets.
  • Data readers - can access but not modify the data.
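
The roles above can be pictured as a simple role-to-action mapping. This is only an illustrative sketch: the action labels and the `can` helper are hypothetical names of mine, not the Purview API or its permission model.

```python
from enum import Enum
from typing import Iterable


class Role(Enum):
    COLLECTION_ADMIN = "Collection admin"
    DATA_SOURCE_ADMIN = "Data source admin"
    DATA_CURATOR = "Data curator"
    DATA_READER = "Data reader"


# Hypothetical action labels paraphrasing the role descriptions above.
ROLE_ACTIONS = {
    Role.COLLECTION_ADMIN: {"edit_collection", "add_subcollection", "assign_roles"},
    Role.DATA_SOURCE_ADMIN: {"manage_sources", "manage_scans"},
    Role.DATA_CURATOR: {
        "create_asset", "read_asset", "modify_asset",
        "delete_asset", "set_relationships",
    },
    Role.DATA_READER: {"read_asset"},
}


def can(roles: Iterable[Role], action: str) -> bool:
    """True if any of the user's roles on a collection allows the action."""
    return any(action in ROLE_ACTIONS[role] for role in roles)


print(can([Role.DATA_READER], "read_asset"))    # True
print(can([Role.DATA_READER], "modify_asset"))  # False
```

A user can hold more than one role on a collection, which is why the check takes a list of roles.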

There is a great video to watch from W Strasser explaining this new data plane RBAC (role based access control) catalog permission. Being able to fine tune the access to the collections brings with it great advantages. Currently collection names can't be updated or deleted. 

Monday, 9 August 2021

A Summer Retrospective: a bygone era


A few weeks in rural France is just the place to contemplate life and take a step back into a bygone era. An era where there are no phones, no internet and no television. Life can quite easily pass you by and you could go for weeks not speaking to a soul. The fruit on the trees ripens in the orchard, the birds waiting for the perfect moment to swoop and eat the fruit. The roads are mostly empty with the occasional car or logging lorry passing by. Cycling is heaven with the roads to yourself.

This rural area, 214 million years ago, had all life within 300 miles of Rochechouart wiped out when a meteorite, one of around the 15 largest ever to come crashing down on earth, struck. The geological signs of the crater are still present today. This bygone era was also rife with conflict: from the last battle of Richard the 1st - the Lionheart, who laid siege to the Chateau of Chalus-Chabrol, located at the border between Aquitaine and the French kingdom, to the hideouts in the forest of the Maquis du Limousin, who were one of the largest groups of French resistance fighters in the Second World War. The village of Oradour-sur-Glane remains an empty ruin as a memorial for the massacre of its inhabitants.

Against this backdrop, technology seems a lifetime away. I can't stress enough the tremendous benefits of taking a technological break for your mental health. You can dream and innovate without the interruption of everyday life.

The age of cloud computing, big data and the algorithm requires a 360-degree perspective. A socio-technical perspective is critical. Reflecting on the changes to the earth, made to this unique landscape from space, you realize that data is in the environment. It is not possible to be an expert in all areas as data is the environment. Data is history, is in the maps, is used in conflict resolution and is used for impact analysis. Data is completely inseparable from life and it drives life, not only business. The choice of tools available to help you navigate through data is vast.

The question is can one truly ever master the entirety of life. Data is life, the past, the present and the future. To truly be a master it requires collaboration, communication and control, as data weaves its interconnected complexity throughout life. A holistic view of this diverse scientific area is required to provide a sustainable future. There is no one best practice that can help navigate this web of graph vertices and edges.

To that end I surmise that taking a technological break enables the mind to contemplate and lets blue-sky thinking roam free. Happy Summer break.

Thursday, 5 August 2021

Data Conferences


I have been surprised by the number of data conferences there are and the continuing growth and diversity of the topics and formats. As well as the main and specialist conferences, a few of which are listed below, there are a huge number of training events that take place in the evenings and at weekends, such as SQL Saturdays, Azure Data Community and Data Saturdays. With all these events providing learning opportunities, many of them for free, we are very lucky to have a community that is so willing to share its experience.

| Conference | Date | URL |
| --- | --- | --- |
| Dativerse | 13-Aug-21 | https://datagrillen.com/dativerse/ |
| Data Platform Virtual Summit | 13-18 Sept 2021 | https://dataplatformgeeks.com/dps2021/ |
| Future Data Driven | 29-Sep-21 | https://datadrivencommunity.com/ |
| DataMinds Connect | 11-12 Oct 2021 | https://datamindsconnect.be/ |
| New Stars of Data | 22-Oct-2021 | https://www.newstarsofdata.com/ |
| Data Weekender | 06-Nov-2021 | https://www.dataweekender.com/ |
| PASS Data Community Summit | 8-12 Nov 2021 | https://passdatacommunitysummit.com/ |
| The SQL Server & Azure SQL Conference | 7-9 Dec 2021 | https://www.mssqlconf.com/#!/ |
| Data & AI Summit | 27-30 June 2022 | https://databricks.com/dataaisummit |
| Big Data LDN | 22-23 Sept 2021 | https://bigdataldn.com/ |
| SQLBits | 2022 | https://sqlbits.com/ |
| Azure Cosmos DB Conf | 20-21 April 2021 | https://gotcosmos.com/conf |
| Power BI Summit | 7-11 March 2022 | https://globalpowerbisummit.com/ |
| Data Toboggan | 12-Jun-2021 | http://www.datatoboggan.co.uk/ |
| Microsoft Build 2021 | 25-27 May 2021 | https://mybuild.microsoft.com/home |
| Microsoft Ignite 2021 | 2-4 Nov 2021 | https://myignite.microsoft.com/home |
| Microsoft Inspire 2021 | 14-Jul-2021 | https://myinspire.microsoft.com/home |
| DataMinutes | 22-Jan-2022 | https://datagrillen.com/dataminutes/ |


Monday, 2 August 2021

Responsible Innovation: A Best Practices Toolkit

Responsible innovation is a toolkit that helps developers become good stewards for the future of science and its effect on society.  

There are 3 areas
  • Judgment Call
  • Harms Modelling
  • Community Jury

This toolkit provides a set of practices currently in development, for anticipating and addressing the potential negative impacts of technology on people. This is an early release of this development.

Judgment Call 

Judgment Call is an award-winning game and team-based activity that puts Microsoft’s AI principles of fairness, privacy and security, reliability and safety, transparency, inclusion, and accountability into action. The game cultivates stakeholder empathy through scenario-imagining. Game participants write product reviews from the perspective of a particular stakeholder, describing what kind of impact and harms the technology could produce from their point of view.

To prepare for this game, download the printable Judgment Call game kit.

Harms Modelling 

Harms Modelling is a framework for product teams, grounded in four core pillars of responsible innovation, that examines how people's lives can be negatively impacted by technology: injuries, denial of consequential services, infringement on human rights, and erosion of democratic & societal structures. Similar to Security Threat Modelling, this modelling enables product teams to anticipate potential real-world impacts of technology.



Community Jury

Community Jury is a technique that brings together diverse stakeholders impacted by a technology. It is an adaptation of the citizen jury. The stakeholders are provided an opportunity to learn from experts about a project, deliberate together, and give feedback on use cases and product design. This responsible innovation technique allows project teams to collaborate with researchers to identify stakeholder values, and understand the perceptions and concerns of impacted stakeholders.

These 3 new tools are still under development but quite interesting to look at.

References

Citizens Juries

The Ethics of AI Ethics: An Evaluation of Guidelines

Hagendorff, T. The Ethics of AI Ethics: An Evaluation of Guidelines. Minds & Machines 30, 99–120 (2020) 

Wednesday, 28 July 2021

Data Governance: An Introduction

Initially published on the Coeo blog.  

Data Governance is a core area that businesses need to adopt in the data-driven world. Data has been around since the earliest of times, from the first libraries in the ancient world that started to collect and store information.

The collection of scientific research information, from census information about human populations, weather and spatial data to DNA genetic data, have all been contributing to the need to store data for analysis. The breadth of the information that is available for analysis covers our entire planet and beyond, and the population as well as different species. With our life and environment becoming documented to the finest degree the need for categorisation, data labelling and data management has become engrained into our society. Where research led the way for documentation of classification for data, business is now at a crucial time of growth and expansion to enable innovation.

With all data comes a continual need for its management, and a core starting place is data governance. The DAMA Dictionary of Data Management defines Data Governance as “The exercise of authority, control and shared decision making (planning, monitoring and enforcement) over the management of data assets”.

The goal of data governance is to help an organisation to manage data as an asset efficiently and effectively. It provides the principles, policy, processes, framework, metrics and oversight that are required to drive the most business value. Data governance programs have a goal of creating sustainable data management, good data quality that is measured and defining policies and practices. A much-needed area that needs to be considered is that of culture and embedding that culture of data management into the business.

We start with understanding what data assets a business has from the core known data and dark data; data that is collected but not used. The proliferation of duplicate data around a business is key to document. Often the first thing that comes to mind with data governance these days is compliance with all the data breaches that keep occurring. The areas one thinks of here are:  

  • Policies
  • Transparency
  • Governance
  • Regulations, such as GDPR
  • Standards
  • Rules
  • Law

These require data inventories and audits to understand what personal data your organisation collects, where it is stored, how it is protected and who may have access to it. This is part of the picture that needs to be considered.
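
As an illustration of what one data inventory record could capture, here is a minimal sketch covering the four questions above (what is collected, where it is stored, how it is protected, who has access). The field names, the audit rule and the example dataset are hypothetical and not taken from any standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InventoryEntry:
    dataset: str
    personal_data: List[str] = field(default_factory=list)  # what is collected
    location: str = ""                                       # where it is stored
    protection: List[str] = field(default_factory=list)     # how it is protected
    access: List[str] = field(default_factory=list)         # who may have access


def audit_gaps(entry: InventoryEntry) -> List[str]:
    """List the questions this inventory entry cannot yet answer."""
    gaps = []
    if not entry.location:
        gaps.append("no storage location recorded")
    if entry.personal_data and not entry.protection:
        gaps.append("no protection recorded for personal data")
    if entry.personal_data and not entry.access:
        gaps.append("no access list recorded for personal data")
    return gaps


# Hypothetical example entry.
crm = InventoryEntry(
    dataset="customer_contacts",
    personal_data=["name", "email"],
    location="CRM database",
    protection=["encryption at rest"],
)
print(audit_gaps(crm))
```

Even a simple structure like this makes the gaps visible, which is the point of the audit.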

DAMA-DMBOK is an international guiding framework for the management of data. The framework includes areas such as:

  • Data Strategy – defining, communicating and driving execution
  • Policy – metadata management, access, usage, security, quality
  • Standards and quality – data architecture and data quality standards
  • Oversight/audit/stewardship
  • Compliance
  • Data issue management – compliance, ownership, policy, terminology, data quality, data access
  • Data management improvement projects 
  • Data asset valuation – constantly defining the business value of data assets

Consideration for the allocation of roles and responsibilities within an operating model helps guide the adoption of best practices.

In conclusion, managing data assets within a business requires it to be embedded in the culture of an organisation. Having high quality data leads to better business decisions. Having a core oversight function that is provided by a Chief Data Officer helps with keeping the day-to-day running of data at the forefront of everyone’s minds, and you never know where the next innovation will come from.

More Information