Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Wednesday 28 December 2022

A Data Governance Story Part 2

 The second part of the governance story.










Data Catalogues are the foundation stones of data governance  
Data landscape knowledge is a wonderful thing. The data map enables the organisation of data sources by business flow within a security boundary 
Data passed through many different areas being transformed by the rich data he found along the way 
Being able to find and discover new data with ease is important for bringing a data culture to an organisation 
Data provenance can be such an insightful thing 
Automated data classification in tools such as #MicrosoftPurview helps speed up insight into data

Managing data sharing between organisations 
A consistent planned route on a map with accurate place location, validated using a compass. The uniqueness of natural geology in a national park complete with a photo & passed the finish line on time 
Data policies for access, security and data residency, governing the use of data amongst groups and locations 
Readiness assessments are a great place to start 
Help a business obtain a standard set of commonly used terms across the teams
Data Ethics is a key component of governing data 

Metadata comes in many forms  
Establishing a data governance operating model for a village requires all the interconnected parts to come together as a whole for the best outcome 
Data Strategy sets the path to success 

Wednesday 14 December 2022

Azure Synapse MVP series video

In this month's Azure Synapse Analytics MVP series video, Ryan Majidimehr was joined by two MVPs, myself and Andy Cutler, to talk about the upcoming Azure Synapse conference called Data Toboggan. https://msft.it/6019eRKad



Friday 9 December 2022

A Data Governance Story

 A tale for December. The business and the frameworks.


Time for some cheer 
Datum and Data on a journey. The Data Owner life 
Data working with other stewards
Business is thriving with well governed data 



Data Management Body of Knowledge (DAMA-DMBOK2) shares the principles of effective data management
The Cloud Data Management Capabilities Framework (CDMC) provides an industry best practice for data management and analytics
CMMI’s Data Management Maturity (DMM) Model covers best practices for supporting the implementation of processes in 5 categories
The Data Management Capability Assessment Model (DCAM) can be used for new data management programs or to benchmark the progress of existing programs
The Control of Data Expediently (CODEX) framework allows a bird's-eye view of the entire system to be taken, to help combine all the elements and environment
Data having fun juggling and working with different frameworks to cover all elements

Catch the next instalment later.


Monday 21 November 2022

Alpine Coaster 2022 Schedule

 Our final event of the year. The Alpine Coaster event schedule. To find out more go to the website.



Thursday 17 November 2022

Day 2 PASS Data Community Keynote

The keynote is about Doing More with Less: The Challenges Ahead for Every Data Professional. These are challenging times for every business everywhere. So how can you ensure you get the most from your investment in IT, to meet your needs now, while also preparing you for an uncertain future? Join Jakub Lamik, Steve Jones and Kathi Kellenberger in this keynote which looks at how the world is changing for data professionals, and the areas to focus on which will bring the best return on your investment in the long term.

Speakers

Steve Jones | Advocate

Jakub Lamik | CEO

Kathi Kellenberger | Customer Success Engineer

David Bick | Head of Product Marketing

Arneh Eskandari | Solutions Engineering Manager

There were 123 sessions yesterday. Make sure you take advantage of the networking opportunities.

16.10 Steve Jones opens the keynote talking about Kalen the technical people he had met. An opportunity to learn and be inspired. 

Redgate are expanding across multiple databases with their tools. Redgate live the DevOps way, building a great culture internally and staying community centred. SQL Saturday started to bring it to the community. Redgate gave it to PASS. Redgate then donated PASS away after buying it last year. 

The CEO of Redgate is on the stage. More than a conference: a homecoming. Why is it important to invest in the community? Database professionals answer important questions about our world.

Every business is a software business - Satya Nadella

Software means data really, how data is ubiquitous and how it enables transformation.  Change is constant.

How has the database landscape changed?

Open source is increasingly important, all sectors and all segments. A powerful tool for community.

Cross database estate is on the rise.  - From the state of the data estate report.

Fragmentation is accelerating. Developers are influencing the databases they use.

PostgreSQL and MySQL are developing the most. SQL Server in the top 6.

Multi database environments are here to stay and complexity is growing - Steve Jones

Micro Services enable agility between software and dev teams - further drives fragmentation

Complexity is mirrored in the environment, and automation is required with a standard set of tools and approach. This helps with the DevOps environment.

David Bick is on the Stage

Solving SQL Server problems alone is not enough for our customers. PostgreSQL is growing a lot. 

Flyway is a database schema migration tool that gives a standard approach across more than 25 databases, from on prem to cloud. Redgate bought and support the open source project.

It gives richer automation. Design principles: you should be able to work the way you want to, with flexibility. Tools and approach need to be the same, but the changes need to be database specific. 

Flyway Enterprise is released for SQL Server and Oracle. This will unlock DevOps for developers.

Migration to cloud is expanding and Redgate can deploy in hours. Cloud requires collaboration. No more silos. DBAs must be database agnostic and support hybrid database estates. 

Arneh is on the Stage. Security and monitoring are centralised, but individual business units choose different platforms. 

Scalability is being held back. Domain expertise is needed across the various cloud technologies. There are not enough dashboards across all of these. Costs need to be kept as low as possible. 

A typical financial set up has 1000+ databases, split 50/50 between on prem and the cloud, in the UK and US. They need granular monitoring as there is no holistic monitor. 

About solutions, community and support. 

How to help you grow. There are lots of new speakers at Summit this year. How many children want to be a DBA when they grow up? Simple Talk is a great launching point for your career. Louis Davidson is the new editor of Simple Talk. 

Adoption, results and advocacy in customer success. They are the face of the customer internally and help customers get value out of their products. Customer success is who Redgate are. There is also further ongoing professional development. There are free industry learning classes which were created in the pandemic.

The debut Redgate 100 most influential in the database community 2022 was shared.

Day 1 Keynote PASS Data Community Summit

The day 1 keynote of the first in person PASS Data Community Summit was packed with exciting data platform innovations across SQL Server and the Microsoft Intelligent Data Platform.

The Keynote: Transform your Data Estate with Microsoft's Intelligent Data Platform.







There are many data challenges today within operational databases, analytics and intelligence, and data governance. 
 


Data officers are a critical role and it is a challenge to use the data right. A lot of governance tools are not fully integrated. 

Analytics is a core part and must be front and centre, not a back end tool. Data Governance is more than creating a catalogue: it is hybrid and multi cloud, and the volume of data is increasing. Data Governance needs to be deeply embedded.


The big announcement: SQL Server 2022 is generally available


SQL Server 2022 comes with embedded data governance. All the data management is through Microsoft Purview, and lineage tracking is very important. 

There is a new SQL Server pay-as-you-go licensing model 


The announcements

SQL Server 2022 - GA (Generally Available)
New SQL Server pay-as-you-go licensing model
Link feature for Azure SQL Managed Instance
Backup portability with SQL Server 2022
SQL + Apps Migration Factory offer
Azure Cosmos DB for PostgreSQL - GA
Azure Data Factory SAP Change Data Capture Connector - GA
Azure Synapse Mapping Data Flow for M365 Graph - Public Preview
Azure Synapse Link for SQL











The keynote then switched to data governance



Improved root cause analysis and traceability with SQL Dynamic lineage is GA

PASS in Seattle

My visit to PASS Data Community Summit in Seattle involved my first flight out of the UK since Covid, 3 speaking opportunities and a visit to the Microsoft Reactor in Redmond to speak with the product teams. It is nice to be out travelling to conferences again and networking with the data community. There is so much added value to be had.

 



Monday 14 November 2022

PASS Data Community Summit Sessions

I have 2 speaking sessions at PASS Data Community Summit on Wednesday 16 November. The live Q&A from my on demand session on: 

The adoption of Data Governance is often marred by thoughts of control and security. Data Governance is about much more in the new world. It is about driving business transformation, knowing what value your data has, and how to remove those data silos, bringing with it a new data culture. In this session, we will start to explore what the new world of Data Governance is and how we help businesses understand the benefits, why it is important, and how to get started in simple agile steps. This session will help you understand why Azure Purview brings exciting features that slot into the world enabling data governance with ease. I share more about this in a blog post building data governance into everyday processes







A panel session exploring the role of the database in today’s digital transformation and modernization initiatives. What are the experiences of transformation initiatives and why the database has to be included in these innovation drives for the benefit of increased speed, efficiency and risk management. I share more about this in a blog post The two sides of innovation: Data and DevOps











Thursday 10 November 2022

Data Toboggan - Alpine Coaster

We have a mini conference to close the year out called Alpine Coaster on 25 November 2022.

How did Alpine Coaster get its Name

The longest toboggan run in Switzerland is the Pradaschier Rodelbahn. It has 31 curves and a difference in elevation of 480 m. The toboggan run coasts its way down to the valley with twists and turns. It is 3,060 m long. It is the most extensive alpine coaster run in Switzerland. It can move up to 40 kilometers/hour.

It takes between 7-10 minutes to ride Switzerland's longest toboggan run, the Pradaschier Rodelbahn.

Alpine Coaster Challenge

Create a session between 7 and 10 minutes long: bite-size sessions for hungry data professionals. This will hopefully leave everyone excited for our full conference on 28 January 2023.

Alpine Coaster Time Zones

Alpine Coaster will have 3 open sessions across 3 time zones (APAC, EMEA, AMER).
Each session lasts about 60-90 minutes. This is a Friday event.

What type of Content is Expected

Share your tales of Azure Synapse Analytics in the real-world.  Deployments, proof-of-concepts, solved issues. 

This event will not be recorded. Use this to your advantage to be creative without pressure.

Register Now

Register for Alpine Coaster APAC Edition 7:30 AM GMT Friday 25 November 

Register for Alpine Coaster EMEA Edition 12:30 PM GMT Friday 25 November 

Register for Alpine Coaster AMER Edition 5:00 PM GMT Friday 25 November
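
For anyone working out what the three GMT start times mean locally, here is a minimal sketch using Python's standard `zoneinfo` module. The session times come from the list above; the one city per region is my own assumption, picked purely for illustration, so swap in your own time zone name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Session start times as listed above, all in GMT on Friday 25 November 2022.
SESSIONS_GMT = {
    "APAC": datetime(2022, 11, 25, 7, 30, tzinfo=timezone.utc),
    "EMEA": datetime(2022, 11, 25, 12, 30, tzinfo=timezone.utc),
    "AMER": datetime(2022, 11, 25, 17, 0, tzinfo=timezone.utc),
}

# Assumption: one representative city per region, for illustration only.
EXAMPLE_ZONES = {
    "APAC": "Australia/Sydney",
    "EMEA": "Europe/Zurich",
    "AMER": "America/New_York",
}


def local_start(region: str) -> datetime:
    """Convert a region's GMT start time to the example local time zone."""
    return SESSIONS_GMT[region].astimezone(ZoneInfo(EXAMPLE_ZONES[region]))


for region in SESSIONS_GMT:
    print(region, local_start(region).strftime("%a %d %b %Y %H:%M %Z"))
```

The conversion relies on the time zone database available to Python, so daylight saving rules for the date are applied automatically.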













Tuesday 1 November 2022

Purview changes


There has been a whole raft of changes to Microsoft Purview of late. The announcements can be found on the Microsoft Purview blog and on the Security, Compliance, and Identity Blog.

Report Manual Data lineage with few clicks in Microsoft Purview

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/report-manual-data-lineage-with-few-clicks-in-microsoft-purview/ba-p/3655228

Now in Public Preview: Microsoft Purview workflows HTTP connector 

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/now-in-public-preview-microsoft-purview-workflows-http-connector/ba-p/3655281

Best practices for Purview and a federated way of working

https://piethein.medium.com/best-practices-for-purview-and-a-federated-way-of-working-7a146f10b3ac

Catalog Adoption: Discover more with Data estate insights in Microsoft Purview

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/catalog-adoption-discover-more-with-data-estate-insights-in/ba-p/3656606

New machine learning classifiers in Microsoft Purview Governance

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/new-machine-learning-classifiers-in-microsoft-purview-governance/ba-p/3663629

Data curation: Discover more with data estate insights in Microsoft Purview

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/data-curation-discover-more-with-data-estate-insights-in/ba-p/3662971

Monday 17 October 2022

Microsoft Purview Ignite 2022 Announcements

There were several announcements at Microsoft Ignite about Microsoft Purview. 

  • ML Based Classification
  • Manual Lineage
  • Dynamic Lineage for Azure SQL Database
  • Metamodels
  • Self-service access for Azure SQL Database

The details are







Saturday 15 October 2022

Business Context for your technical data

To enable data to be managed well it is important to have a business lens applied to the data. To that end Microsoft have added a new capability to enrich the Data Map with business and governance context to make it relevant for the Data Stewards, Chief Data Officers, Data Analysts, and Data Scientists. This new feature is called Purview Metamodel. Its three elements are organisation, business processes and data products. It shows how the data is used in business activities, enables the organisational hierarchy to be defined, and defines the data used across the business departments and business processes. This is all placed in the hands of the Data Steward.




To further enrich and augment the Purview Data Map, it provides end-to-end data lineage, which Data Stewards can now annotate with manual data lineage so that their data movement is shown.

Further Reading

Add business context to your hybrid data estate with Microsoft Purview

Thursday 13 October 2022

Microsoft Ignite 2022


The Microsoft Ignite 2022 keynote on 12 October shared many innovations and there were lots of announcements. The Microsoft Book of News, October 12 - 14, 2022, details these: https://news.microsoft.com/ignite-2022-book-of-news/

Satya talked about the digital imperative and the world's computer - Azure. The main 5 themes are to do more with less.









The technology world is changing rapidly and Satya mentioned that Gartner predicts that by 2025, 70% of applications will be made by no code/low code tools, up from 25% in 2020. This enables hand drawn forms, for example, to be converted into apps with AI.

Microsoft and Databricks deepen partnership for modern, cloud-native analytics

https://techcommunity.microsoft.com/t5/azure-data-blog/microsoft-and-databricks-deepen-partnership-for-modern-cloud/ba-p/3640280  

Microsoft and Databricks have partnered to build a foundation in the Microsoft Intelligent Data Platform by integrating their hallmark capabilities to create an integrated solution for our customers.



Distributed PostgreSQL comes to Azure Cosmos DB

https://devblogs.microsoft.com/cosmosdb/distributed-postgresql-comes-to-azure-cosmos-db/

Azure Cosmos DB for PostgreSQL is a new Generally Available service for building cloud-native relational applications. Azure now offers its own single database service that supports both relational and NoSQL workloads. You can build cloud-native applications for relational and non-relational data using Cosmos DB.











Introducing the Microsoft Intelligent Data Platform Partner Ecosystem

https://techcommunity.microsoft.com/t5/azure-data-blog/introducing-the-microsoft-intelligent-data-platform-partner/ba-p/3640279

There was a launch of a powerful new Partner Ecosystem for the Microsoft Intelligent Data Platform, to deliver category-leading and cloud-native data and AI solutions that are integrated with the platform and complement its capabilities to address diverse scenarios.













Data Governance

Data Governance was mentioned at every single opportunity in many sessions. Data Governance looks at security and compliance, data management and responsible democratisation. The announcements were Microsoft Purview Business workflows (GA), Business metamodel (Preview), and improved root cause analysis and traceability with SQL Dynamic lineage.













The Book of News shares more details:

  • Improved root cause analysis and traceability with SQL Dynamic lineage (now generally available) and fine-grained lineage (in preview) on Power BI datasets. You can do thorough root cause analysis from a single location in Microsoft Purview.
  • Metamodels that will enable customers to define organization, departments, data domains and business processes on their technical data. This feature is in preview.
  • Machine learning-based classifications that will make detection of human names and addresses simple and scalable in user data. This feature is in preview.
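
To give a feel for why lineage helps with root cause analysis, here is a minimal sketch that walks upstream from a report to every asset feeding it. The asset names and the dictionary layout are hypothetical and made up for illustration; this is not how Microsoft Purview stores or exposes lineage.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets it reads from.
UPSTREAM = {
    "powerbi.sales_report": ["sql.sales_summary"],
    "sql.sales_summary": ["sql.orders", "sql.customers"],
    "sql.orders": ["adls.raw_orders"],
    "sql.customers": [],
    "adls.raw_orders": [],
}


def upstream_assets(asset: str) -> list[str]:
    """Return every asset that feeds `asset`, nearest first (breadth-first)."""
    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for parent in UPSTREAM.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


print(upstream_assets("powerbi.sales_report"))
```

If a number looks wrong in the report, the returned list is the set of places to check, starting with the closest source.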

These features will help with big data management, adding context on top of data, improved classification with AI, manual process lineage and scorecard insights. You can read about it below.

Add business context to your hybrid data estate with Microsoft Purview

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/add-business-context-to-your-hybrid-data-estate-with-microsoft/ba-p/3651989

Customize retention and deletion to help meet your specific business requirements

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/customize-retention-and-deletion-to-help-meet-your-specific/ba-p/3613111

Read the article to learn more about the announcements in the Data Lifecycle and Records Management space that help organizations manage the lifecycle of data.

What's New in Microsoft Purview Compliance Manager

https://techcommunity.microsoft.com/t5/security-compliance-and-identity/what-s-new-in-microsoft-purview-compliance-manager/ba-p/3643375

Compliance manager helps

  • Eliminating blind spots with the right set of security, compliance, and privacy controls
  • Safeguarding critical data from external and internal threats
  • Identifying risks and addressing regulatory compliance requirements

Additional automated controls for Microsoft Priva and App Governance have been announced.

Power BI

Power BI updates: Do more with enterprise self-service business intelligence https://powerbi.microsoft.com/en-us/blog/microsoft-ignite-2022-do-more-with-enterprise-self-service-business-intelligence/ 

with video summary https://www.youtube.com/watch?v=PxGVcFb-zv0&t=28s









Tuesday 13 September 2022

PASS Summit 2022 session schedule

 I am pleased to share my session schedule 










Business Benefits of Good Governance

Summit - Day 1 09:30 AM - 10:00 AM PST

Category: Live  Q&A

Theme: Ahead of the curve


Transformation and Innovation: Why the Database Must be Included

with Steve Jones and Joshua Higginbotham

Summit - Day 1 02:30 PM-03:45 PM PST

Category: Panel

Theme: Ahead of the curve


Data Governance with Azure Purview - Ask the Experts (now Microsoft Purview)

With Erwin de Kreuk and Wolfgang Strasser

Summit - Day 3 09:30 AM - 10:45 AM PST

Category: Panel

Theme: Revolutionary

Monday 12 September 2022

Access control in the Microsoft Purview governance portal

Microsoft Purview has a set of predefined roles with differing levels of access defined here. There is always a lot of reading on these pages but further down I spotted a useful chart. Diagrams help show the access clearly.



Thursday 8 September 2022

Metadata Management

Metadata is an important facet of data entities and sits within the data governance space. Metadata is data about data. There are various categories of metadata.  

  • Business metadata describes all aspects used for governance, finding & understanding data.
  • Technical metadata describes the structural aspects of data at design time. 
  • Operational metadata describes processing aspects of data at run time.
  • Social metadata describes the user perspective of the data by its consumers.

Often metadata connects business domains, processes, technology and data. A good place to start is at the technical layer and a tool such as Microsoft Purview can be useful to make this scalable. The end goal is understandability of the data leading to good data quality.
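
As a rough illustration of the four categories, here is a minimal sketch of a data entity that carries metadata per category and reports which categories are still empty. The class names, the table and the values are all hypothetical; this is not a Microsoft Purview data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataCategory(Enum):
    BUSINESS = "business"        # governance, finding and understanding data
    TECHNICAL = "technical"      # structural aspects at design time
    OPERATIONAL = "operational"  # processing aspects at run time
    SOCIAL = "social"            # the consumers' perspective of the data


@dataclass
class DataEntity:
    name: str
    metadata: dict[MetadataCategory, dict[str, str]] = field(default_factory=dict)

    def annotate(self, category: MetadataCategory, key: str, value: str) -> None:
        """Record one piece of metadata under the given category."""
        self.metadata.setdefault(category, {})[key] = value

    def missing_categories(self) -> list[MetadataCategory]:
        """List the categories that have no metadata yet."""
        return [c for c in MetadataCategory if c not in self.metadata]


# Hypothetical table, starting at the technical layer as suggested above.
orders = DataEntity("sales.orders")
orders.annotate(MetadataCategory.TECHNICAL, "column:order_id", "int, primary key")
orders.annotate(MetadataCategory.BUSINESS, "owner", "Sales Operations")
print([c.value for c in orders.missing_categories()])
```

A gap report like this is one simple way to see where understandability of the data still needs work.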



Saturday 3 September 2022

Purview GitHub Resources

There is a set of solution accelerators on GitHub for Microsoft Purview. These are:

microsoft/Purview-ADB-Lineage-Solution-Accelerator

A connector to ingest Azure Databricks lineage into Microsoft Purview




microsoft/Purview-Machine-Learning-Lineage-Solution-Accelerator

Solution accelerator to help build Machine Learning Lineage




 

microsoft/Purview-Custom-Connector-Solution-Accelerator

Solution Accelerator to help build Purview custom connectors




microsoft/Purview-Custom-Types-Tool-Solution-Accelerator

Solution accelerator for creating custom type definitions in Microsoft Purview.



 


Monday 15 August 2022

DAMA-DMBOK2, DCAM and TOGAF methodologies

 














I came across this article giving a comparison between what is included in the DAMA-DMBOK2, DCAM and TOGAF methodologies. I mentioned the core framework elements here. The data management models most used by the industry are DAMA-DMBOK2 by DAMA International and DCAM® 2.2 by the EDM Council.

No one model covers all areas and no two companies are the same, so it is very common that different bits are used as and when required. It is worth reading the discussion in the blog. 

Wednesday 10 August 2022

2022 Data Platform Microsoft MVP Award

 


Excited to receive my 5th MVP award kit containing my 5th disk to go on my MVP Crystal Award.

I am also looking forward to receiving my 5 year milestone disk as an MVP in a few weeks. Working with community events enabling others to get access to free training and helping others learn is so important. Here is to the next year and seeing what exciting things we can make happen.

Friday 29 July 2022

Microsoft Purview Data Sharing

Microsoft Purview Data Sharing allows in-place data sharing for Azure Data Lake Storage (ADLS Gen2) and Blob Storage. This solves the problem of data proliferation, where you can often end up with multiple copies of the same data, resulting in higher storage costs and tracking issues from data movement. It also reduces the need for more ETL packages to manage and reduces the time it takes to get access to the data. Managing the data movement can be time consuming so this data sharing feature will help reduce this.

This new feature will enable better data management and controlled access to the data with automated data sharing. The feature works as below:








 A few things to note   

  • the data provider pays for data storage and their own data access  
  • the data consumer pays for their own data access transactions.

 Access can be changed 

  • the data provider can revoke access to the share or set a share expiration time for time-bound access to data. 
  • the data consumer can also terminate access to the share at any time.
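
The access rules above can be sketched as a small toy model: a share stays active until the provider revokes it, the consumer terminates it, or an optional expiry time passes. The class, field names and organisation names are hypothetical and only illustrate the rules; this is not the Microsoft Purview API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Share:
    """Toy model of an in-place share (hypothetical, not the Purview API)."""
    provider: str
    consumer: str
    expires_at: Optional[datetime] = None  # provider may set time-bound access
    revoked_by_provider: bool = False
    terminated_by_consumer: bool = False

    def revoke(self) -> None:
        """The data provider revokes access to the share."""
        self.revoked_by_provider = True

    def terminate(self) -> None:
        """The data consumer ends their own access to the share."""
        self.terminated_by_consumer = True

    def is_active(self, now: datetime) -> bool:
        if self.revoked_by_provider or self.terminated_by_consumer:
            return False
        return self.expires_at is None or now < self.expires_at


share = Share("contoso", "fabrikam",
              expires_at=datetime(2022, 8, 31, tzinfo=timezone.utc))
print(share.is_active(datetime(2022, 8, 1, tzinfo=timezone.utc)))  # before expiry
print(share.is_active(datetime(2022, 9, 1, tzinfo=timezone.utc)))  # after expiry
share.revoke()
print(share.is_active(datetime(2022, 8, 1, tzinfo=timezone.utc)))  # after revoke
```

Either party can end the share on their own, which is why the two flags are kept separate.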
More details can be found here

Watch demo video

Read How to share data

Read How to receive share

Thursday 28 July 2022

Azure Synapse Community Resources

Microsoft have released a collection of community resources to help accelerate learning and enable interaction with other members of the community. Resource guide here

In addition to the Microsoft resources are those created by presenters at the Data Toboggan set of Azure Synapse conferences. Follow, share and like us. 

Social Media

Twitter: https://twitter.com/datatoboggan  @datatoboggan

LinkedIn: https://www.linkedin.com/company/data-toboggan

Conference Session recordings

YouTube: https://www.youtube.com/c/DataToboggan 




Download Microsoft guide here







Community Call to Action

 Follow us on Twitter: @Azure_Synapse

 Read and comment on our blog: https://aka.ms/SynapseBlog

 Check out our monthly updates blog: https://aka.ms/SynapseMonthlyUpdate

 Subscribe to our YouTube channel: https://aka.ms/SynapseYouTube

 Share and vote for ideas to improve Azure Synapse: https://aka.ms/SynapseIdeas

 Join the Azure Synapse Influencers program: https://aka.ms/SynapseInfluencers

Product 

 Product page: https://aka.ms/Synapse

 Documentation: https://aka.ms/SynapseDocs

 Reference architectures: https://aka.ms/SynapseArchitectures

 Security whitepaper: http://aka.ms/SynapseSecurity

Technical Discussions and Q&A

 Microsoft Q&A: https://aka.ms/SynapseQuestions

 StackOverflow: https://aka.ms/SynapseStackOverflow

Quickstart

 Get started in 60 minutes: https://aka.ms/SynapseGetStarted

 Azure Synapse Analytics Toolkit: https://aka.ms/SynapseToolkit

Free Learning

 MS Learn – Learning paths: https://aka.ms/SynapseLearningPaths

 Synapse Practitioner : https://aka.ms/SynapsePractitioner

 Microsoft Virtual Training Days: https://aka.ms/SynapseMVTD

 30-Day Cloud Skills Challenge: https://aka.ms/SynapseSkillsChallenge

Samples & Accelerators

 Samples on GitHub: https://aka.ms/SynapseSamples

 Accelerator – End-to-End Analytics: https://aka.ms/azsynapsee2e-git