Monday, 29 November 2021

PASS Summit Keynote: Unified Data Governance with Azure Purview

Raghu Ramakrishnan, CTO for Data and Technical Fellow at Microsoft, spoke at PASS Community Summit in November and explained the next part of the vision for data governance: policy. Microsoft are seeing data governance as the emerging data pillar, giving three pillars: operational databases, a unified analytics platform, and unified automated data governance. The unified part is the important element going forward: a single pane to extend governance across the entire data estate. Automated data classification removes the PII headache of missing personal data and pushes control up the stack to knowledge workers. Microsoft intend to have dynamic data provenance that is fully integrated with the six responsible AI principles. Azure Purview will operate a central RBAC control and is the future governing permission state for SQL Server, with full propagation. With AI integrated, the policy feature will be human readable.



Data governance is increasingly interdisciplinary, and the discovery of data is core to a business. Questions often asked: What data do we have? Where did the data originate? Can I trust the data?

Compliance has been a major area of data governance. Questions often asked here:
What’s my exposure to risk? Is my data usage compliant? How do I control access and use the data? What is required by regulation X?

Raghu talked about the data governance journey through the lens of GDPR compliance. Their approach was to create a 'Data Map' of all data across Microsoft and use that map to support GDPR compliance. Data discovery covers search and discover, the information supply chain, stewards/curators and the business glossary. Data use governance covers policy authoring and management, reporting, access and governance enforcement, and industry compliance. These two areas are built on an intelligent data inventory: a data map with automated structure and lineage collection, automated and custom classification, and publication/subscription APIs.


Purview data catalog is a self service tool filled with details from knowledge workers. Areas include:
  • self-service search and browse
  • curated and standardized business glossaries
  • interactive lineage visualization
  • simplified data curation and stewardship
The data estate insights currently show:
  • data asset distribution
  • business glossary
  • data classification and labelling
  • data location and movement (in progress)
The Microsoft vision is that data in the Microsoft cloud is always governed and that, beyond Azure, Purview offers a single pane to extend governance across the entire data estate.

Still looking through the lens of GDPR compliance, data classification is an important feature.
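
To make the idea of automated classification concrete, here is a minimal sketch in Python. It is not how Purview implements classification; the rule names, patterns, threshold and function names are all assumptions made for illustration. It samples values from each column and labels the column when enough of the values match a pattern.

```python
import re
from typing import Dict, List

# Hypothetical rule set (label -> pattern). These patterns are illustrative
# assumptions only; real classifiers are far more thorough.
RULES = {
    "Email Address": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "IPv4 Address": re.compile(r"^(\d{1,3}\.){3}\d{1,3}$"),
}


def classify_column(samples: List[str], threshold: float = 0.6) -> List[str]:
    """Return labels whose pattern matches at least `threshold` of the non-empty samples."""
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return []
    labels = []
    for label, pattern in RULES.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= threshold:
            labels.append(label)
    return labels


def classify_table(columns: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Classify every column of a table given sampled values per column."""
    return {name: classify_column(vals) for name, vals in columns.items()}
```

Calling `classify_table` with a dictionary of column names to sampled values returns the labels found per column. The threshold allows for a few dirty values in a column that otherwise holds personal data.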





Dynamic lineage deep dive


He talked about the increased efficiency of extracting dynamic SQL provenance and the six responsible AI areas: fairness, inclusiveness, reliability and safety, transparency, privacy and security, and accountability. He also talked about responsible AI and provenance in machine learning (ML) training and audit, with the provenance of ML models as a requirement. There are a number of challenges to address to enable this.

Centralized data access control

Proactive governance controls look at things like


Policy enforcement inside data services - access control was explained (in the early stages so may change)


In the future there was mention of an ABAC policy language, where ABAC = RBAC + Conditions: a human-readable policy language for business users such as data officers or data owners. A policy statement can be represented as a tuple of {Effect, Action, Data resource, Subject, Condition}. The propagation of Purview policies to data repositories is asynchronous by design, with Purview as the single source of truth. SQL pulls updates asynchronously, so updates, like AAD logins, are not immediately visible locally.
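
The policy tuple and the pull model described above might be sketched as follows. This is a minimal illustration in Python, not the Purview policy language or API: the class names, the prefix matching on the resource, and the "deny overrides allow" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical names throughout; nothing here is a real Purview or SQL Server API.
Attributes = Dict[str, str]


@dataclass(frozen=True)
class PolicyStatement:
    effect: str          # "Allow" or "Deny"
    action: str          # e.g. "Read"
    data_resource: str   # resource path prefix the statement covers
    subject: str         # role or group (the RBAC part)
    condition: Callable[[Attributes], bool]  # attribute check (the "+ Conditions" part)


def is_allowed(statements: List[PolicyStatement], subject_roles: Set[str],
               action: str, resource: str, attributes: Attributes) -> bool:
    """Deny by default; an explicit Deny overrides any Allow (an assumed rule)."""
    allowed = False
    for s in statements:
        matches = (
            s.subject in subject_roles
            and s.action == action
            and resource.startswith(s.data_resource)
            and s.condition(attributes)
        )
        if not matches:
            continue
        if s.effect == "Deny":
            return False
        allowed = True
    return allowed


class LocalPolicyCache:
    """A data service's local copy of policies, refreshed by pulling from the central store."""

    def __init__(self, central_store: List[PolicyStatement]):
        self._central = central_store          # stands in for the single source of truth
        self._local: List[PolicyStatement] = []

    def pull(self) -> None:
        # In a real service this would run on a schedule, hence the delay in visibility.
        self._local = list(self._central)

    def check(self, subject_roles: Set[str], action: str,
              resource: str, attributes: Attributes) -> bool:
        return is_allowed(self._local, subject_roles, action, resource, attributes)
```

A statement added to the central list only affects `check` after the next `pull`, which mirrors the point that updates are not immediately visible locally.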

The summary of the presentation was that Purview is creating a new data pillar of unified governance across the entire data estate. It is deeply integrated with SQL Server, extending its governance capabilities significantly.

Sunday, 28 November 2021

Data Toboggan 2022



We are pleased to announce that Data Toboggan 2022 is back on 29 January 2022, 7:45 AM to 7:59 PM GMT.

Call for Speakers https://bit.ly/DT22CFS 

Register https://bit.ly/DT22Register

Details

Join us for our THIRD all-day event specializing in Azure Synapse Analytics!

Azure Synapse Analytics is a practically limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. Let's spend a day exploring and showcasing these capabilities.

It is this analytical power that will help enable any organisation to transform from being reactive into being truly proactive, generating actionable insights that enable both business flow and timely decision-support.

This is a virtual event and free to attend.

As it's our third event, there will be 3 session types:

  • Standard Sessions : 45 minutes long.
  • Live Short Talks : 5 to 10 minutes.
  • Recorded Short Talks : 5 to 10 minutes.

We'll have the most awesome content we can find, with a wide range of speakers and experience. Check out our CFS page (https://sessionize.com/data-toboggan-2022/) if you're thinking of submitting!

Friday, 26 November 2021

Event Synopsis for Azure World Synapse Day

We were really excited to have run a new type of event, Azure World Synapse Day. The event crossed three time zones: APAC, EMEA and AMER.

The aim of the unconference was to have a lighter format that allowed people to share their personal stories, to share things about Azure Synapse technology, provide demos etc. Our interpretation of the unconference was 

Our session planning run list looked like

Followed by an intermission

Followed by an intermission

We had fun, we learnt a lot about running this type of event and wanted to thank our amazing speakers and attendees for sharing the event with us. 

We will be running the same type of event next year. It will be coming further under the Data Toboggan brand as Data Toboggan - Alpine Coaster, coasting through those shorter sessions.



Tuesday, 16 November 2021

T-SQL Tuesday #144 – Data Governance reimagination - Wrap up

This month’s T-SQL Tuesday attracted some great responses! Thank you to everyone who participated!

My invitation for this month’s #tsql2sday was threefold, asking you to share your experiences of data governance:

  • The current cost of data governance versus its benefits
  • The amazing things data governance has enabled you to achieve or will enable you to achieve in the future
  • The potential uses for Azure Purview within your estates and the automated deployment options for that

Rob Farley published a post in reply

http://blogs.lobsterpot.com.au/2021/11/09/being-sure-of-your-data/

Rob raises some key points:

  • But the checks that we do are more about things that the database can allow, but are business scenarios that should never happen.
  • You need to discover which situations cause people not to trust the data.
  • Data quality can lead to the trust, but only when it has been demonstrated repeatedly over time. Trust must be earned

Deborah Melkin published a post in reply

https://debthedba.wordpress.com/2021/11/09/t-sql-tuesday-144-data-governance/

Deborah Melkin talks about the switch to implement data governance.

  • It is about understanding your data from both the micro and macro level
  • It’s understanding where our data lives (data assets) and how data flows through data sources (data lineage) as well as how it’s consumed and used (data catalogs and data profiling). More importantly, this is knowledge that can be shared to make data even more valuable.
  • When you start expanding the number of databases and the complexity of how your systems work, the job of governance becomes a lot harder
  • Getting started with data governance seems like a very daunting task.

Data governance is a broad topic with many different areas, which can be seen from the replies. There is plenty for us to get started with and I'm looking forward to using Azure Purview to help with this.

Thank you for taking the time to write such insightful posts. That is the wrap up. If I’ve missed anyone please let me know and I’ll update the post.

Thursday, 11 November 2021

Drive a data culture to power a new class of data first applications

The PASS Data Community summit session keynote contained a section presented by Arun Ulag, Corporate Vice President of the Intelligence Platform at Microsoft.

From data to intelligence for everyone and for every decision at any scale. He talked about data integration, analytics and business intelligence. The 3 messages were:

Empower every individual with AI capabilities such as the automatic report insights in Power BI with descriptive and diagnostic insights and insights on the move. 


Empower every team with Power BI and Teams: your data is where you chat. Power BI have goals that include being driven by data, built for teams, AI powered and automated action.

The third message was empower every organisation with a complete analytics fabric, Power BI + Synapse.

The public previews of Hybrid Tables and automatic aggregations were announced.

Observation data is the fastest growing data segment, with 175 ZB of data expected by 2025 and 50 billion connected devices by 2030.

Also Azure Synapse Data Explorer was announced as now in public preview.

Wednesday, 10 November 2021

Bridge to a new universe: the end-to-end Azure Data Platform

 

It was exciting to watch the Day 1 keynote, Bridge to a new universe: the end-to-end Azure Data Platform, delivered by Rohan Kumar and many other people.

A journey to a new universe is just waiting to inspire innovation, to tap into limitless possibilities and potential. It covered how to shape your data so you can harness its power to find a new galaxy of insights, answers, and predictions.  Some amazing slides and discussion to set you on a new path. 

Three universes were bridged together: unmatched analytics and insights, limitless cloud data services and unified data governance.

Rohan shared a great quote: "If you want to go fast, go alone. If you want to go far, go together." The SQL Server community has always worked together to achieve some amazing goals.
Three main data communities were discussed: SQLSaturday, Data Saturdays and the Azure Data Community.
SQL Server 2022 preview brings with it many new features.

There are to be two interlinking services: Azure Synapse Link and Azure Purview.

More details were discussed in the keynote but I will share those separately. As ever, an inspiring future for us in the data community.

Read More

Microsoft Ignite book of news

 Announcing SQL Server 2022 preview: Azure-enabled with continued performance and security innovation

Data Toboggan Azure World Synapse Day: Speakers



Data Toboggan have an amazing line up of speakers. We would love you to join us and support our amazing speakers who have given up their time to speak. 

Register now https://bit.ly/RegisterDTWSD21 

Speakers

Lakehouse in a nutshell: Serverless SQL pool + Aggs + PowerBI - Armando Lacerda

DW Automation for EDW in Synapse - Demo Only - Bob Duffy

Manage Packages on Synapse Spark - Dustin Vannoy

Migrating a Data Warehouse to Synapse Analytics - Andy Cutler

Patterns with Synapse Notebooks - Damien O'Connor

From Housekeeping to Data Engineer - My journey to find my passion - Jean Joseph

Secrets of SQL Dedicated Pool - Dennes Torres

Spreading the word about Azure Synapse Analytics - Sidney Cirqueira

Synapse and Power BI - Intro to a great data mix - Gaston Cruz

dbt & Synapse: have you seen SQL do this before? - Anders Swanson

Distributed Data in Dedicated SQL Pool - Rob Farley

Tuesday, 9 November 2021

Data Toboggan extravaganza

There is nothing so exciting as a surprise. Data Toboggan is trying something different. Take an international journey with us through three time zones:

APAC - 08:00 - 09:00 GMT

EMEA - 12:00 - 13:20 GMT

AMER - 17:00 - 18:20 GMT

Bite-size sessions to share how Azure Synapse Analytics inspires you, empowers you, and accelerates your business analytics. Register now: https://bit.ly/RegisterDTWSD21

The Session Titles

The Abstracts

The Speakers

Jean Joseph, Dennes Torres, Anders Swanson, Rob Farley, Bob Duffy, Andy Cutler, Damien O'Connor, Sidney Cirqueira, Armando Lacerda, Gaston Cruz, Dustin Vannoy



Tuesday, 2 November 2021

Ignite Innovate Anywhere From Multicloud to Edge

In "Innovate Anywhere From Multicloud to Edge", Scott Guthrie shared a raft of technical updates. The themes:
  • Hybrid and multicloud
  • End-to-end data platform
  • Cloud native development
  • Developer velocity

Hybrid and multicloud updates announced:
  • Deeper integration with VMware vSphere and Azure Stack HCI
  • Azure Virtual Desktop on Azure Stack HCI
  • Azure Arc enabled data services updates
  • Extension of Microsoft Defender to AWS
A data platform with end to end capability


SQL Server 2022 was announced




Azure Synapse Analytics announcements
  • Azure Data Explorer
  • Synapse Link SQL Server 2022
  • Synapse Link Dataverse (GA)





Great to see Azure Purview and data governance mentioned everywhere. 

Microsoft Ignite November 2021

Microsoft Ignite was opened by Satya Nadella on 2 November 2021. An inspiring session. 

The headline for the opening was that our economy and society are undergoing a sea change of digitization. Satya talked about emerging technology trends and innovations across the Microsoft Cloud that will transform every business and industry going forward.

We are at a moment of real structural change. The case for digital transformation has never been so urgent. What will happen, and what we need to do to support our business, is core as we transition from the mobile and cloud era to ubiquitous computing and ambient intelligence.

There are four key trends that he mentioned:

  1. Hybrid work - when and where we work
  2. The trend for a hyper connected business with omnichannel reach with freely flowing data and intelligence
  3. Every business is a digital business - multi cloud multi edge
  4. The need to protect everything end to end, with security being the biggest risk

There is also a need for business to meet sustainability goals and track our own carbon footprint. 

Microsoft Loop, a new collaborative canvas, was announced.

There were some other great transformational announcements. 




Azure Arc-enabled machine learning – inferencing: Customers can now build, train and deploy machine learning models in on-premises, multi-cloud and edge computing environments


A new Cognitive Service, 'Azure OpenAI Service', was announced. Azure OpenAI Service is a new Azure Cognitive Service that provides customers access to OpenAI’s GPT-3 models with enterprise capabilities such as security, compliance and scale. He also talked about the breakthroughs in natural language processing.


Teams Connect is the central place for future collaboration.

A new platform layer, the metaverse, brings together many tools. The Metaverse solutions


Mesh for Microsoft Teams was announced, bringing an immersive experience to Microsoft Teams using Mesh. Mesh for Microsoft Teams will enable new experiences with personalized avatars and immersive spaces where users can connect with presence and have shared immersive experiences.

Read the Microsoft Ignite Book of News: http://aka.ms/ignite-book-of-news for exciting news and updates. A major theme is about inclusivity and accessibility.

T-SQL Tuesday #144 – Data Governance reimagination


Data governance is a topic that has raised its head again in the last year, with the introduction of a GA service called Azure Purview. Data governance is not a new topic in the realms of data management. I think over the last few years data governance has had a focus on meeting data protection law, government legislation and formalised control and standards. Then, with data sovereignty issues being everyday considerations within the global marketplace, storage locations in the cloud and the likes of the General Data Protection Regulation, data governance has been very focused on meeting legislative requirements.

There has been a substantive cost involved in setting up data governance within an organisation to avoid those heavy fines if a data breach were to occur. I think because of this many organisations compartmentalise personal data systems.

A change in perspective, a reimagination, of data governance is occurring. Data governance is really about ‘data erudition’: showing an interest in learning about the data we have, improving the quality and creating a more productive and trusted data asset. Starting small and incrementing data change in a way that matches the business need is a way to gain targeted business value. Azure Purview provides a great opportunity for us to start at the beginning and create a data catalog, data inventory, data dictionary and much more in an automated way. Trusting the data quality, knowing your risks and knowing what data you have sets your business up for success.

My invitation to you for this month’s #tsql2sday is…

I want to invite you to share your experiences on data governance

  • The current cost of data governance versus its benefits
  • The amazing things data governance has enabled you to achieve or will enable you to achieve in the future
  • The potential uses for Azure Purview within your estates and the automated deployment options for that

*** The Rules ***

  • Your post must be published on Tuesday 9th November 2021. This counts as long as it’s still Tuesday anywhere in the world.
  • Include the T-SQL Tuesday Logo and make it link to this invitation post.
  • Please send me a tweet @victoria_holt or leave a comment on this post with a link to your own so I know where to find it.
  • Please tweet about your post using the #tsql2sday hashtag.

Monday, 1 November 2021

Data Erudition

I have been thinking about data governance, Azure Purview and data research: how these areas link together and enable innovation, the democratisation of data and thinking holistically about data as a strategic asset. A combination of the above in my diagram covers my thoughts.