Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Wednesday 29 September 2021

Azure Purview Generally Available

The digital event Maximize the Value of Your Data in the Cloud: Achieve unified data governance with Azure Purview, with Rohan Kumar, Corporate Vice President Azure Data, and Mike Flasko, General Manager of Azure Data Governance Platform, on 28 September 2021 was an exciting event. It enables the newly reimagined agile data governance world to move forward.

The event explored Azure Purview, a unified data governance solution that gives you a holistic, up-to-date map of your entire data estate. The general availability of Azure Purview was announced, bringing improved automated governance to the fore.


The product launched with an area in preview. 
There is automated data discovery using Purview data scan, supporting hybrid data landscapes with classification and lineage. There are 200+ data classifiers with 35 data sources. The data map graph describes the data assets and relationships across the estate with fine-grained access controls. The lineage feature is really important as it enables root cause analysis and makes data lineage available via visualisation. Being able to search, browse and curate data enables you to gain more understanding of your data.
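
To illustrate why lineage helps with root cause analysis, here is a minimal sketch in Python. It does not use the Azure Purview API; the lineage map, the asset names and the `upstream_assets` helper are all hypothetical and exist only to show the idea of walking back from a problem asset to everything that feeds it.

```python
from collections import deque

# Hypothetical lineage map: each asset lists the assets it is derived from.
LINEAGE = {
    "sales_dashboard": ["sales_summary"],
    "sales_summary": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw"],
}


def upstream_assets(lineage, asset):
    """Return every asset that feeds into `asset`, nearest first."""
    seen, order = set(), []
    queue = deque(lineage.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # an asset can be reachable by more than one path
        seen.add(current)
        order.append(current)
        queue.extend(lineage.get(current, []))
    return order


# If the dashboard looks wrong, these are the candidate sources to check.
print(upstream_assets(LINEAGE, "sales_dashboard"))
```

A breadth-first walk lists the closest sources first, which is usually the order you would investigate them in.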

Business context for data is important, and having a business glossary enables relevant business terms to be connected. There is hierarchy support and an integrated approval workflow.

There is a large set of connectors currently available that enable data scanning, and there is a public preview of more data scanning options for Google Cloud, Erwin, Salesforce, IBM DB2 and Cassandra. Data scanning options coming soon are Snowflake, SAP HANA, PostgreSQL, MongoDB and MySQL.

Azure Purview is growing in functionality all the time and Purview Data Insights is in preview. This looks at 
  • data asset distribution 
  • sensitive data
  • data scanning coverage 
  • business glossary utilisation
The insights capability is useful for CDOs to gain a high-level picture of the data estate.

A glimpse of the exciting roadmap to come was also shared.  


The Azure Purview features are



There are lots of resources already available

Azure Purview overview 

Introduction to Azure Purview on Microsoft Learn to start your learning journey.

Official Purview blog for updates

Billing for Azure Purview will start on 1st November 2021. Pricing 

Announcements are detailed here

Customer Stories shared

Microsoft Customer Story-Danish pump manufacturer develops sustainable water solutions with unified data governance from Azure Purview

Microsoft Customer Story-Heathrow boosts operational efficiency and improves decision making with Azure Purview

Microsoft Customer Story-illimity optimizes data governance and streamlines compliance with Azure Purview

Sunday 26 September 2021

National AI Strategy

The National AI Strategy was released in September 2021. It is a 10-year plan to transform and reshape our society.

The 3 aims are to

  • Invest and plan for the long-term needs of the AI ecosystem
  • Support the transition to an AI-enabled economy
  • Ensure the UK gets the national and international governance of AI technologies right

The document contains a roadmap and details of its 3 pillars.

Pillar 1 Investing in the long-term needs of the AI ecosystem

Pillar 2 Ensuring AI benefits all sectors and regions

Pillar 3 Governing AI effectively



Central Digital and Data Office (CDDO)

The CDDO has been created within the Cabinet Office to consolidate the core policy and strategy responsibilities for data foundations. They will work with partners to improve government’s use and reuse of data to support data-driven innovation across the public sector.

The UK's National AI Strategy



Microsoft Research Summit 2021

The Microsoft Research Summit is open to everyone! Running October 19 - 21, with over 150 sessions across 16 tracks, it provides the global research community with an opportunity to learn from experts pushing the frontiers of technology. Register now: https://aka.ms/AAdv93n The event will start in three broadcast regions (China Standard Time, British Summer Time, and Pacific Time). Microsoft say:

For 30 years, our research community at Microsoft has worked across disciplines, institutions, and geographies to envision and realize the promise of new technologies for Microsoft and for society. Today, we’re inviting the global science and technology community to continue this exploration—because ensuring that future advancements benefit everyone is up to all of us.

Join us at the inaugural Microsoft Research Summit, streaming virtually across three time zones. You’ll have the opportunity to hear from science and technology leaders from around the world—people who are driving advances across the sciences and pushing the limits of technology toward achieving a meaningful impact on humanity.

They want to build a place where research considers sustainability, ethics and diversity, and is inclusive of everyone. There are some really interesting topics under discussion.

Friday 24 September 2021

Data Toboggan World Azure Synapse Day

We are back for another edition of Data Toboggan, but we'd like to do something a bit different this time round.

In the last year, Azure Synapse has celebrated its second birthday, and we've all been busy doing awesome stuff with the capabilities of the platform and loving every minute. 

We want *you* to tell us how Azure Synapse Analytics inspires you, empowers you, and how it accelerates your business analytics. Tell us your stories, and what you're hoping for in the coming 12 months of Azure Synapse...

So, this next edition of Data Toboggan won't be a full-day, tech-focussed event.

Instead, we're planning on doing three 1 to 2 hour community sessions where we listen to you tell your story in a more open format.

We're not looking for long sessions - ideally each speaker would get 10 to 15 minutes to share and discuss. We're targeting Friday 12th November as the delivery date, with 3 sessions running at locally convenient times for APAC, EMEA and AMER (times are GMT, because that's where we are, and we hope they translate to 'convenient').

APAC - 08:00 GMT

EMEA - 12:00 GMT

AMER - 17:00 GMT

Look out for submission links heading your way soon! You can submit to talk at any of the sessions - whichever is most convenient for you.

Thanks for being with us this year. We hope you'll enjoy sticking around for a while.

Stay safe,

The Data Toboggan Team. 

 

Thursday 23 September 2021

Big Data LDN 2021 - Data Governance and the essential CDO

Big Data LDN has been running 22-23 September. There have been some really interesting sessions covering data governance and the CDO role.

Robin Sutara spoke on the analytics paradox - balancing competing demands for agility and governance. Her advice: think big, remember that one technology is not a magic bullet, and learn to fail fast.

Start small with one business problem and build upon that. The data quality will adapt and improve to meet the needs of each problem being addressed. It is people and culture that change a business. We need to think beyond the technology.

 

A great session, with an important thought for the future: stop talking architecture and focus on business value. It goes beyond economics and needs sustainability.

The "Data is a hot mess, so let's cook" session from Ben Schein made a key point: data governance is never done, so always be open to new ideas. Data governance is a team sport. You need to find a balance between innovation and experimentation, data teams and consistency, usability and scale. He finished with some important areas to consider

  • keep data definitions consistent
  • don't wait for the perfect data set to arrive
  • data quality is essential for good insights
  • let people go shopping for the data they need
  • there is no data magic wand
  • data governance is a team sport

Another very insightful session, from Cindi Howson on data-driven culture: is your organisation a laggard or leader, gave perspectives on the biggest challenges to being data driven.

The biggest challenge to being data driven is culture (67%), followed by talent and people (22%).
 
To disrupt your culture bring in a change agent, identify relevance (WIIFM) and organise for collaboration.

She talked about CDOs having a short tenure in organizations, averaging 2.5 years. There is an interesting article about that, Why Do Chief Data Officers Have Such Short Tenures? by Tom Davenport, Randy Bean, and Josh King.

Think about the WIIFM (what's in it for me) for communication, incentives, skills, and tribes and role models.

To move forward into this new world there needs to be a transition in organizational design, from the traditional BI centre of excellence to an embedded design.



Centralize for economies of scale for common data, infrastructure and specialist talent but decentralize when business domain experts are essential to analytics workflow. Ensure that cross boundary communication is ongoing for best practice, synergies and career management. We are looking at a new hybrid model that is transparent and optimized.

She finished by sharing that there is a Data Chief Community (TheDataChief.com). It has podcasts, a blog, roundtables and a community newsletter.



The "What we learned from 400 data leaders in CDO summer school" session by Carruthers and Jackson highlighted
  • Data strategy is multidimensional, not linear
  • Governance is being revised but you need to tell the story. It is still a problem but it needs communicating the purpose through strategy.
  • They had Scott Taylor talk about data storytelling - keep it simple
  • Risk - listen, listen and listen more 
  • Soft skills are important
  • Data Literacy is a spectrum.
  • Nurture your community
  • Play to your strengths and culture is the biggest hurdle
  • Keep policy simple
  • There should be a call to arms on Ethics
  • Methods can be disrupted, innovative and evoke change
  • It is not about technology outcomes

In the summer school they discussed the DIKW Pyramid which represents the relationships between data, information, knowledge and wisdom



Exasol have published the journey to the CDO.

Tuesday 14 September 2021

Data Governance Podcast

 

It was amazing to take part in my first podcast about the benefits of Data Governance and how to get started in Coeo Conversations with Justin Langford.  

The data field is such an exciting place to be at the moment. Data Governance is more than just compliance; it is about managing the whole ecosystem. I thought I would run out of things to say talking for 30 minutes in the podcast on governance; however, that was not the case. I hope you find the podcast interesting, informative and fun.

Looking forward, there is an exciting digital event coming up: Maximize the Value of Your Data in the Cloud: Achieve unified data governance with Azure Purview. The event is Tuesday, September 28, 2021 | 9:00 AM-10:00 AM Pacific Time (UTC-7). Register here

Monday 13 September 2021

Data Platform Virtual Summit 2021 Keynote


The Data Platform Summit has started. The keynote was an excellent session covering the Azure data stack, identifying when to use each tool and the key innovations with SQL. With so many tools now, it becomes hard to know which is the best choice, so it was nice to see an explanation of when to use what tool. This useful chart was shown.



Azure SQL Edge



SQL Server 2019

Solves the modern data challenges
  • Data virtualization and big data clusters
  • Modern platforms with compatibility
  • Built-in machine learning and extensibility
  • Intelligent performance
  • Layers of security and compliance
  • Business critical availability

SQL Linux/Container

Containers are portable and can run anywhere containers are supported. They are lightweight with reduced disk, CPU and memory footprint. They have a consistent image of SQL Server, scripts and tools and are efficient with faster deployment, no patching required and less downtime. 

Azure SQL

Compared with SQL Server, Azure SQL PaaS offers
  • business continuity, high availability, automated backups, long-term backup retention, geo-replication
  • scale, advanced security, version-less, built-in monitoring and built-in intelligence.

There are Azure SQL Editions for general purpose, business critical and hyperscale



Azure Arc
Bring Azure data services to on-premises, multi-cloud and edge with Azure Arc. Azure Arc-enabled SQL Managed Instance has many advantages.




The cloud adds value by providing additional tools such as

  • Azure Defender to protect your data
  • Azure SQL Database Ledger for blockchain
  • Telemetry across all your assets with Azure Monitor SQL Insights

Then to help with migration there are tools available

  • Azure Migrate to discover and assess your SQL Server assets
  • Migrate offline with Azure Data Studio
  • Migrate online with Azure Database Migration Service or Log Replay Service

A few tools were not mentioned that form part of the data suite such as Azure Cosmos DB. The expansion of tools and options has grown significantly over the last few years so it is always good to assess what business objective you are trying to achieve and select the right tool.
 

Friday 3 September 2021

Data Strategy: where are we and what is the answer to the ultimate question

Originally published here

Do we know where we are going? Have we asked the right questions? Without a roadmap, we will not arrive at our destination. The first step relies on discovering where we are, what we need to be successful and where we need to go. We need to create a roadmap to enable a path forward. With that roadmap, there is a need to assign owners of tasks throughout the data journey. Data Strategy is a top-down approach closely aligned with business strategy.

Gartner define a 'Data Strategy' as a highly dynamic process employed to support the acquisition, organization, analysis, and delivery of data in support of business objectives. DAMA, by contrast, defines Data Management as the development, execution, and supervision of plans, policies, programs, and practices that deliver, control, protect, and enhance the value of data and information assets throughout their lifecycles.

Many organisations do not have data strategies in place, although they may be working on areas that would sit under that umbrella. Deciding what the core data principles are can help an organisation quickly adapt to the data-centric culture.


As an example, a set of data principles could be:

  • All data is owned, managed, secured, and governed
  • Data is managed throughout its lifecycle
  • Data is available and visible whenever needed
  • Information is an asset
  • Use a data catalogue for visibility
  • Data is fit for purpose and meets the business need
  • There is a single version of the truth
  • Data skills training for people to use data effectively
  • Data ethical standards are followed

 

Before any type of strategy is created, a key business stakeholder must champion the idea and a person, such as the CDO, must be identified to own the strategy. The data strategy should be maintained and enacted through the data governance team and other working groups.

Data strategy is a framework built around the data that amalgamates the assets into a source of trusted data, allowing process efficiencies, increasing confidence in the data and creating opportunities for innovation. The Data Management strategy could be aligned with the Data Management Association and the DAMA Body of Knowledge (DMBOK), to enable consistent practices and verifiable decision making.

It is important to have an agile data strategy: create a short-term strategy so the business can gain immediate benefit, then work on a longer-term target strategy once the gap analysis is complete and strategic imperatives are identified. A couple of core areas to also review are data governance and data ethics, alongside data culture and data skills. The technology side for data collection, data storage, data processing and data output may need updating, but any need for technological change will come from the alignment of business and data strategies and be identified throughout the process. To enable that agile approach to data strategy, using a Boston matrix with the MoSCoW prioritization technique is very successful.
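
As a rough illustration of MoSCoW prioritization, here is a minimal Python sketch. It is not taken from any specific framework: the initiative names and the value and effort scores are hypothetical, and using value and effort as tie-breakers within each category is an assumption standing in for whatever axes you plot on your own matrix.

```python
from dataclasses import dataclass

# MoSCoW categories in priority order (Must, Should, Could, Won't).
MOSCOW_ORDER = {"Must": 0, "Should": 1, "Could": 2, "Won't": 3}


@dataclass
class Initiative:
    name: str    # hypothetical initiative name
    moscow: str  # one of the MOSCOW_ORDER keys
    value: int   # assumed business value score, 1 (low) to 5 (high)
    effort: int  # assumed delivery effort score, 1 (low) to 5 (high)


def prioritise(initiatives):
    """Order by MoSCoW category, then highest value, then lowest effort."""
    return sorted(
        initiatives,
        key=lambda i: (MOSCOW_ORDER[i.moscow], -i.value, i.effort),
    )


backlog = [
    Initiative("Data catalogue pilot", "Should", value=4, effort=2),
    Initiative("Assign data owners", "Must", value=5, effort=2),
    Initiative("Replace reporting platform", "Could", value=3, effort=5),
    Initiative("Data literacy training", "Must", value=4, effort=3),
]

for item in prioritise(backlog):
    print(f"{item.moscow:<7} {item.name}")
```

The short-term strategy would then be drawn from the top of this list, with the "Could" and "Won't" items parked for the longer-term target strategy.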


DAMA lists deliverables from strategic planning as:

  • Data Management Charter (vision, business objectives, guiding principles, success measures, risks, operating model. A business plan to use the information to create competitive advantage and to support enterprise goals).
  • Data Management Scope Statement (goals and objectives for planning, organisation roles, responsibilities clarified)
  • Data Management Implementation Roadmap (programs, projects and tasks, road map and milestones). Requires a data management program strategy: a plan for maintaining and improving data quality, integrity, access and security while mitigating risks.

Taking all of this into account, and using systems thinking to gain that holistic view, there are three areas that should be covered for success: business data strategy, IT data strategy and operational data strategy.

Whether you haven’t started creating a data strategy or already have one, it is worth reviewing the current state to ensure an agile, actionable plan is in place for continuous improvement.