Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Tuesday, 1 December 2020

Christmas 2020 Data pictures: Data Catalogue

A picture tells a thousand words. Just a bit of fun collating some key points about Data Catalogues.




Saturday, 14 November 2020

PASS Summit Day 2 SQL Server Evolution

The day two keynote was delivered by Hanuma Kodavalla, a Microsoft Technical Fellow.

He started with an interesting quote from 'Adventures of a Mathematician' by Stanislaw Ulam:

"It is still an unending source of surprise for me to see how a few scribbles on a blackboard or on a sheet of paper could change the course of human affairs."

He then shared the papers that started it all and noted that this year is the 50th anniversary of Codd's paper.

I was interested to learn that the Microsoft Database Research Group is an extension of the SQL product group. 

He mentioned a paper I read a long time ago, "One size fits all": an idea whose time has come and gone, by M. Stonebraker and U. Çetintemel (2005). Its abstract reads: The last 25 years of commercial DBMS development can be summed up in a single phrase: "one size fits all". This phrase refers to the fact that the traditional DBMS architecture (originally designed and optimized for business data processing) has been used to support many data-centric applications with widely varying characteristics and requirements. In this paper, we argue that this concept is no longer applicable to the database market, and that the commercial world will fracture into a collection of independent database engines, some of which may be unified by a common front-end parser.

He went through all the previous versions of SQL Server and their key features, ending with SQL Server 2019.







Another key paper was mentioned.





He then moved on to discuss newer product features: Azure SQL serverless and Azure Defender.

 


SQL Server security developments include Always Encrypted and secure enclaves.




















He then mentioned the new ledger-enabled tables that will be coming soon.
















This session was a journey through history, showing how the field keeps striving forwards to realize Codd's and Gray's vision. It will be exciting to see what comes next. A great session for an industry person who is a database researcher.







Thursday, 12 November 2020

PASS Virtual Summit 2020 Keynote Day 1

The summit, 10-13 November 2020, is being live streamed through PASSTV.

This year’s first keynote is entitled 'Bringing the future into focus, the end to end Azure Data platform'.

Digital transformation brings change, and new technologies can be challenging to learn. For businesses, economies of scale can create efficiencies. The data platform has many elements: BI, analytics and AI, hybrid data management, relational databases, Edge and IoT, NoSQL databases, Azure Open Source Database Services (OSS databases) and more. New technologies can help save time. It is good news that the DBA’s key skills are transferable.



Azure SQL services consist of SQL Server on Azure Virtual Machines (for lift and shift and OS-level access), Azure SQL Managed Instance (for modernizing existing apps) and Azure SQL Database (for building cloud apps). There is also Hyperscale for specific use cases. SQL Server workloads were said to run best on Azure, which also provides patching regimes.

It is good to see that Azure helps customers move to the cloud at their own pace, which will enable them to pivot their company to something new with less risk. Azure SQL serverless means paying only for what is required. Azure SQL Edge is an interesting development and is now generally available.


 

With the complex network of applications and systems in the modern data landscape, it can be difficult to connect the data to gain value. Data virtualization is an important change.

Azure Arc


Azure Arc is, I think, a real game changer to help with diverse database locations. It is a set of technologies that extends Azure management and native data services outside of Azure infrastructure to run across your environment. Even if you can’t migrate to the cloud due to data sovereignty, latency or regulatory requirements, you can still get the efficiency and agility the cloud offers with Azure Arc.

It is a versionless, evergreen SQL that ensures you are always current. It provides cloud elasticity on premises, which allows you to optimize the performance of your workloads and dynamically scale up and down without application downtime. Azure Arc offers unified management, which allows you to see your data services running on premises alongside those running on Azure through a single pane of glass and manage them using familiar tools like the Azure portal, Azure Data Studio and the Azure CLI.

Azure Arc enabled SQL Managed Instance and PostgreSQL Hyperscale are in Public Preview.

It is possible to use features such as vulnerability assessment and Advanced Threat Protection with Azure Defender for SQL, using the same rules and machine learning algorithms.

Public previews were announced for:

  • Azure Cosmos DB – Serverless for all APIs
  • Azure Database for PostgreSQL – Flexible Server
  • Azure Database for MySQL – Flexible Server
  • Azure Cache for Redis – Enterprise

Cloud Scale analytics on Azure with Azure Synapse Analytics and Azure Databricks

 

A public preview of a new guided UI for machine learning models was announced. To complete the stack, Azure Synapse analyses the data and Power BI reports on it.

Thursday, 1 October 2020

SQLBits 2020 Keynote from Edge to Cloud

The keynote for SQLBits 2020 was delivered by Rohan Kumar on 'Digital Transformation from Edge to Cloud with Azure Data'. It celebrated the tools used to achieve digital transformation at breakneck speed, within months, and shared the innovations just launching. The Azure Data Strategy includes various tools such as cloud databases, Azure Synapse Analytics and Power BI. These tools enable innovation anywhere: on premises, at the edge and in the cloud.

The Azure Data Strategy includes many tools, incorporating SQL Server.

Azure SQL Edge delivers intelligence to the edge. Azure SQL Edge is generally available.


Azure Synapse offers a new class of analytics. There were two announcements: the Public Preview of Azure Synapse Link for Azure Cosmos DB (Synapse SQL Serverless) and the Private Preview of Power BI performance accelerator for Azure Synapse Analytics.





















Also announced was the Public preview of Power BI app for Teams.




















A keynote full of ideas for innovation, how to drive a data strategy forward and help the environment in which we live today.

Wednesday, 9 September 2020

UK National Data Strategy

The UK National Data Strategy has been published: https://www.gov.uk/government/publications/uk-national-data-strategy/national-data-strategy . There are four interconnected pillars listed:

  • data foundation
  • data skills
  • data availability
  • responsible data

 

The National Data Strategy is an ambitious, pro-growth strategy for building a world-leading data economy. https://www.gov.uk/guidance/national-data-strategy

A summary of the evidence reviewed and the evidence gaps can be found here:

https://www.gov.uk/government/publications/uk-national-data-strategy/call-for-evidence-and-roundtable-engagement-summaries

The missions are:

  • Unlocking the value of data across the economy
  • Securing a pro-growth and trusted data regime
  • Transforming government’s use of data to drive efficiency and improve public services
  • Ensuring the security and resilience of the infrastructure on which data relies
  • Championing the international flow of data

It builds upon initiatives such as the Industrial Strategy, the AI Review, the AI Sector Deal and the Research and Development Roadmap – setting out a framework for how we approach and invest in data to strengthen our economy and create big opportunities for us in the future. 

Sunday, 6 September 2020

Microsoft Ignite 2020 - digital event


It is exciting to see Microsoft Ignite run as a digital event experience on September 22-24, 2020. Despite this being a strange year the 

The sessions catalog is here. The catalog contains some amazing learning experiences. The session Building Digital Resilience with Satya Nadella will be an interesting one to watch. I find Satya Nadella such an inspirational speaker.

Have you completed your event list?

🗹 Register 🗹 Download Digital Swag 🗹 Schedule sessions 🗹 Schedule Fun & Wellness breaks 🗹 Get favorite comfy outfit

Thursday, 3 September 2020

Spark + AI Summit Europe has evolved

Spark + AI Summit Europe is expanding and getting a new name: Data + AI Summit Europe. November 2020 will see the launch of the inaugural Data + AI Summit Europe, officially expanding Spark + AI Summit content and community to include all things data, with a focus on the best open source technologies for building enterprise data applications!
https://databricks.com/blog/2020/09/02/spark-ai-summit-europe-is-expanding.html . You can also access all of the videos and slides from the 2020 virtual conference on demand.

Monday, 31 August 2020

Data Governance Roles

To enable data governance programs to be successful, it is important to establish the key roles and define the responsibilities within them.


Chief Data Officer - a corporate officer responsible for enterprise-wide governance and utilization of information as an asset, via data processing, analysis, data mining, information trading and other means (Wikipedia).


Data Stewards - this label describes accountability and responsibility for data and for the processes that control the use of data assets. There are varying types of stewards:

Enterprise data stewards - oversight of the data domain across business functions

Business data stewards - those who are subject matter experts

Technical data stewards - database administrators, BI specialists, data quality administrators

Every data set needs a data owner: a person responsible for decisions regarding the data. They are normally a business data steward.

Often there is also a data governance steering committee to manage progress and drive innovation.

Saturday, 22 August 2020

Data Governance

Data Governance plays a key role in ensuring data is managed. Data Governance, as defined by DAMA, is

"the exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets."

The successful management of data requires a program that includes:


Strategy

Policy

Standards and quality

Oversight

Compliance

Issue management

Data management projects for improvement

Data Asset Valuation






The goals of data governance are to bring about a sustainable program of work that is embedded in the day-to-day management of data. The program should be measured so that improvements can be demonstrated and a positive financial impact shown.

Azure has Governance features and services to explore. 






Wednesday, 12 August 2020

SQLBits 2020 goes Virtual

Very excited to hear about the change from a face-to-face event to a virtual event. I don't think a year would be complete without SQLBits. With so many virtual tech conferences this year there is so much choice. My number 1 choice is always SQLBits. I have attended and helped at every event since inception. Further information and how to book your tickets: http://sqlbits.com . The event is running 29th September - 3rd October 2020.




Sunday, 26 July 2020

AI Barometer

 

The CDEI has published its AI Barometer, a major analysis of the most pressing opportunities, risks, and governance challenges associated with AI and data use in the UK, initially across five sectors (Criminal Justice, Financial Services, Health & Social Care, Digital & Social Media and Energy & Utilities).

The key findings:

  • The AI Barometer highlights the potential for AI and data-driven technology to address society’s greatest challenges
  • Some opportunities are easier to realise than others
  • ‘Harder to achieve’ innovations, in contrast, involve the use of AI and data in high stakes domains that often require difficult trade-offs
  • Several barriers stand in the way of addressing risks and maximising the benefits of AI and data
  • Three types of barrier merit close attention: low data quality and availability; a lack of coordinated policy and practice; and a lack of transparency around AI and data use.


Thursday, 23 July 2020

Azure Data Catalog

What is a data catalog? Gartner defines it as:

“A data catalog maintains an inventory of data assets through the discovery, description, and organization of datasets. The catalog provides context to enable data analysts, data scientists, data stewards, and other data consumers to find and understand a relevant dataset for the purpose of extracting business value.”

Data Catalogs are the New Black in Data Management and Analytics (Gartner, 2018)

A data catalog is important for recording the critical data assets that bring value. It becomes a library full of core information about your data sources. It can contain a data dictionary and can provide basic statistics about the data. Being able to explore the data in this way is a really useful feature.

Azure Data Catalog documentation is here and the service here

  • Users can discover the data sources they need and understand the data sources they find. At the same time, Data Catalog helps organizations get more value from their existing investments.
  • Data catalogs are inventories of the data in the organization
  • Data catalogs are a standard for metadata management in the age of big data and advanced analytics
  • Adding tags to data sets enables a business glossary of terms to be applied to the data
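
To make the idea of a data dictionary, tags and basic statistics concrete, here is a minimal sketch in Python. It is only an illustration of the concept and is not the Azure Data Catalog API; the names CatalogEntry, profile and search, and the sales_orders example data, are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import mean


@dataclass
class CatalogEntry:
    """One registered data source: descriptive metadata, not the data itself."""
    name: str
    description: str
    data_dictionary: dict[str, str] = field(default_factory=dict)  # column -> meaning
    tags: set[str] = field(default_factory=set)                    # business glossary terms
    stats: dict[str, dict[str, float]] = field(default_factory=dict)


def profile(entry: CatalogEntry, rows: list[dict]) -> None:
    """Store basic statistics for each documented column that holds numbers."""
    for column in entry.data_dictionary:
        values = [r[column] for r in rows if isinstance(r.get(column), (int, float))]
        if values:
            entry.stats[column] = {
                "count": len(values),
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
            }


def search(catalog: list[CatalogEntry], term: str) -> list[str]:
    """Return names of entries whose description contains the term or whose tags match it."""
    term = term.lower()
    return [
        e.name for e in catalog
        if term in e.description.lower() or term in {t.lower() for t in e.tags}
    ]


# Hypothetical example data source and sample rows.
sales = CatalogEntry(
    name="sales_orders",
    description="Order lines from the web shop",
    data_dictionary={"order_id": "Unique order number", "amount": "Order value in GBP"},
    tags={"Sales", "Finance"},
)
profile(sales, [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 30.0}])
catalog = [sales]
```

Searching the catalog for "finance" finds the sales_orders entry through its tag, which is the discovery step described above; a real catalog service would add things like access control and automated source registration on top.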


    


Thursday, 2 July 2020

Microsoft MVP for a third year

During such a life-changing time, I am over the moon with joy to be honoured with my third Microsoft Most Valuable Professional (MVP) award, in recognition of exceptional technical community leadership for Data Platform for 2020-2021. How amazing to receive this. Thank you @MVPAward #MVPBuzz. There are so many amazing MVPs who help the community improve and grow their data platform knowledge. I am passionate about sharing the knowledge and experience I have gained.

This has been a strange period, with many events cancelled or turned into virtual events. As organizers of Data Relay, we cancelled our event this year to make way for SQLBits to move into the only viable slot. As things are still ongoing, I think it is unlikely any in-person conferences will go ahead this year. However, there are some amazing free online events to attend. My thoughts have been on helping and supporting the local community as well as the data community.

Friday, 19 June 2020

Introduction of Power BI

More and more people are using Power BI in their everyday roles. Understanding the different deployment types and basics of the product helps with the adoption of self service data and digital migration. Not every organization has the appetite or capability to use Power BI in the cloud.  The cloud and on premises capabilities do have some differences but it is a great place to engage with dashboards and paginated reports.  The presentations I have given covered these areas.  






Tuesday, 9 June 2020

STEM Role Models - 1 Million Women In STEM

This is women in STEM. They are celebrating 1000 real role models who are kicking ass, smashing stereotypes & breaking barriers. This is an amazing chance to help share women STEM role models for future generations. Read more and join us at 1mwis.com


#womenintech #womeninmaths #WomenInScience #womeninengineering #AcademicTwitter #WomenWhoCode

Wednesday, 3 June 2020

Planning a Power BI Enterprise Deployment

There is an updated version of the Planning a Power BI Enterprise Deployment whitepaper. Deploying Power BI in a large enterprise is a complex task that requires a lot of thought and planning.

The paper includes these areas for consideration.

  • Section 1: Introduction
  • Section 2: Power BI Usage Scenarios
  • Section 3: Power BI Architectural Choices
  • Section 4: Power BI Licensing and User Management
  • Section 5: Power BI Source Data Considerations
  • Section 6: Power BI Dataset Storage Options
  • Section 7: Power BI Data Refresh and Data Gateway
  • Section 8: Power BI Dataset and Report Development Considerations
  • Section 9: Power BI Collaboration, Sharing and Distribution
  • Section 10: Power BI Administration
  • Section 11: Power BI Security and Data Protection
  • Section 12: Power BI Deprecated Items
  • Section 13: Support, Learning, and Third-Party Tools

There is a summary of some of the changes at coatesdatastrategies.com/blog/updated-w

Monday, 1 June 2020

Ethical Data Handling Strategy


As a part of the Ethical Data Handling Strategy in my Data Quality Framework there are various things to consider. A good place to start is with The Data Ethics Canvas which can help you identify and manage ethical data issues. 


The ODI has produced a clear definition of what data ethics is:

“Data ethics is a branch of ethics that evaluates data practices with the potential to adversely impact on people and society – in data collection, sharing and use” The Open Data Institute, 2018.


Data Ethics is important to consider and embed in your data system now. With the use of data in AI expanding, it is important to create an Ethical Data Handling Strategy.

Monday, 25 May 2020

The Future of Tech

I enjoyed watching the Future of Tech session, with Kevin Scott and guests, at Microsoft Build. The session discusses advances in large scale models for natural language generation and AI on the intelligent edge, among other things. Watch it here


Saturday, 23 May 2020

Build Book of News 2020

What an amazing inspiring conference, where it is possible to make dreams become reality. The world is changing, reimagine tomorrow.

The Book of News 2020 shares some amazing advances to help shape the world to come.

The Microsoft Build 2020 Book of News is the guide to the key news items that were announced at Build.


Thursday, 21 May 2020

Sketch the docs

Great to see an interesting technique shared about visual storytelling. A summary of Sketchnoting and Zines is in the last 15 minutes. There were other sessions at Build.

Video: https://aka.ms/msbuild2020-sketchnoting-video
Slides: https://aka.ms/msbuild-sketchnoting-slides
Site: http://sketchthedocs.dev
Blog: http://dev.to/nitya






















Wednesday, 20 May 2020

2020 Build Keynote

The Build keynote from Satya Nadella, on Tuesday 19 May, was entitled 'Empowering every developer'. You can watch it here.

In this time of uncertainty, developers will play a central role in reimagining the world we live in and accelerating our path to recovery. He touched on three phases: emergency, recovery and reimagining. Going forward, businesses will need to be able to remote everything at a moment's notice, automate everywhere to be agile and simulate anything. The Power Platform, Azure Arc as the first control plane and Teams are enabling the future. Satya left us with a thought-provoking statement.

"We are at an inflection point. As developers you have that opportunity, as well as a responsibility, to define what should be rebuilt, what should be reimagined, and what should be left behind." Satya Nadella

We are crossing into a new frontier, anywhere together.



A few of the many interesting announcements follow: 

Microsoft responsible machine learning capabilities build trust in AI systems, developers say
Build AI you can trust with responsible ML

Autoscale is now generally available on Azure Cosmos DB, and the public preview of its new serverless model will launch in just a couple of months!


Microsoft Build brings announcements for cloud data, analytics services, and intersection of the two
https://www.zdnet.com/article/microsoft-build-brings-announcements-for-cloud-data-analytics-services-and-intersection-of-the-two/
Azure SQL Edge now in preview
https://azure.microsoft.com/en-gb/updates/azure-sql-edge-now-in-preview/

Microsoft announces a new supercomputer and lays out vision for future AI work.
https://blogs.microsoft.com/ai/openai-azure-supercomputer/
It has built one of the top five publicly disclosed supercomputers in the world, making new infrastructure available in Azure to train extremely large artificial intelligence models.