Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Wednesday 22 May 2024

Microsoft Build Fabric: What's new and what's next

The Microsoft Fabric announcements were covered by Amir Netz, Arun Ulagaratchagan, Flavien Daussy and Adam Penhaul. The session was recorded and can be seen here: Microsoft Fabric: What's new and what's next.

I live blogged this great main data session at Microsoft Build.

AI is changing the world. The AI revolution is based on data. Data is the fuel that powers AI. It is hard because of the amount of innovation and the diversity and complexity involved.




Purpose-built workloads. AI is built into Fabric. Governance is particularly important; it is built in and driven through Microsoft Purview.

Aka.ms/try-fabric



There are weekly Fabric releases with 60-80 pages of blogs. The roadmap for these features can be found at

Aka.ms/FabricRoadmap


What is the point of having data in the lake if no one is using it? It is about immediate business access to the data.

A SaaS product that looks like Office.  No knobs to optimise Fabric. Results in hours.

  • Starts with built-in CI/CD
  • Creating deployment pipelines
  • And Task flows (Public Preview) to help create things like the medallion architecture.


In Fabric you can now bring in partner workloads such as MDM and ESRI. The Microsoft Fabric Workload Development Kit was announced as Public Preview.


All your data, all your teams in one place. You can publish to the workload hub for a native Fabric workload experience. Aka.ms/FabDevKit

There are multiple methods to get data into Fabric for multi-clouds. Shortcuts to On-Premises Sources for OneLake was announced as Public Preview.


Not everything is stored in open formats, databases for example, so Mirroring helps with this. There is free Mirroring storage for replicas.


Delta format is not the only open format. Iceberg is another major storage format. Transparent, simultaneous support of the Delta Lake and Iceberg formats was just announced. It is now also possible to connect to Salesforce without moving the data. There is also now an expanded partnership with Snowflake and Adobe.

A unified API arrives with the public preview of the developer-friendly API for GraphQL over all data in OneLake. (GraphQL uses JSON structures.)
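To make the JSON point concrete: a GraphQL request is commonly sent over HTTP as a JSON body holding a query string and optional variables, and the result comes back as JSON too. The sketch below only builds such a body and parses a made-up response. The customers field, its arguments and the sample response are hypothetical and are not taken from the Fabric API.

```python
import json

# Hypothetical query: the `customers` field and its arguments are
# illustrative only and do not describe a real Fabric schema.
QUERY = """
query TopCustomers($first: Int!) {
  customers(first: $first) {
    id
    name
  }
}
"""


def build_request_body(query, variables):
    """Serialise a GraphQL query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables})


def customer_names(response_text):
    """Pull the customer names out of a JSON response shaped like the query."""
    payload = json.loads(response_text)
    return [customer["name"] for customer in payload["data"]["customers"]]


body = build_request_body(QUERY, {"first": 5})

# Made-up response text, standing in for what a server might return.
sample_response = '{"data": {"customers": [{"id": 1, "name": "Contoso"}]}}'
print(customer_names(sample_response))
```

The response mirrors the shape of the query, which is why the parsing code can walk straight to data, then customers, then name.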

A unified data culture requires real-time data. Microsoft announced Real-Time Intelligence. It uses the Real-Time hub, powered by AI, for data in motion. (The OneLake data hub is for data at rest.)


So Real-Time Intelligence in the real world.

Copilot is integrated in every Microsoft Fabric experience. Copilot in Fabric is now Generally Available. This means AI-driven insights out of the box, along with custom generative AI for your data.


Announcing Public Preview of AI Skills in Fabric. It allows you to build your own generative AI in Fabric.

Simple to get started

  • Create AI Skill
  • Add data – ground in data
  • Select tables to ground the data

Query in natural language

In conclusion, come and join the Microsoft Fabric team in Stockholm, Sweden, 24-27 September 2024.

Aka.ms/FabCon-Europe



Tuesday 7 May 2024

Responsible AI Transparency Report

Microsoft have shared how they work with AI responsibly in the paper Responsible AI Transparency Report: How we build, support our customers, and grow. The report outlines Microsoft’s approach to building generative AI applications responsibly, adhering to six core values of transparency, accountability, fairness, inclusiveness, reliability and safety, and privacy and security. The framework is based around the govern, map, measure and manage cycle.

Govern 

Establishes the context for AI risk management, including adherence to policies and pre-deployment reviews.

  • Policies and principles
  • Procedures for pre-trained models
  • Stakeholder coordination
  • Documentation
  • Pre-deployment reviews

Map 

Involves identifying and prioritizing AI risks and conducting impact assessments to inform decisions.

  • Responsible AI Impact Assessments
  • Privacy and security review
  • Red teaming

Measure

Implements procedures to assess AI risks and the effectiveness of mitigations through established metrics.

  • Metrics for identified risks
  • Mitigations performance testing

Manage

Focuses on mitigating identified risks at both the platform and application levels, with ongoing monitoring and user feedback.

  • User agency
  • Transparency
  • Human review and oversight
  • Managing content risks
  • Ongoing monitoring
  • Defense in depth

These are all depicted in the diagram in the paper which is a very informative read.



References

Responsible AI Transparency Report How we build, support our customers, and grow

https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RW1l5BO

Thursday 2 May 2024

Responsible AI – A Data Governance Approach

I am speaking at the Bath Azure User Group meeting about Responsible AI - a Data Governance approach. I see Responsible AI as a subset of Data Governance. This session covers where we are with legislation and tools, why good data quality is a must for AI and how to get started.

Data Governance and Responsible AI, and the embellishment of AI within Microsoft Purview, aid and prepare businesses for using AI. Moving forward, I believe that combining Data Governance and Responsible AI into one actionable framework will bring immediate rewards to every business use case.

Hope you can join us on 22 May 2024, 18:00-20:00, in Bath.

https://lnkd.in/eRT8RijE 



Monday 29 April 2024

Open Lakes



This is an insightful article entitled Open Lakes, Not Walled Gardens by Raghu Ramakrishnan and Josh Caplan.  

The Fabric design principles cover the following:

Open Ecosystem

Ensuring there are no proprietary barriers to data in OneLake, allowing integration with other services.

Security and Governance 

Data in OneLake must be secure and governed, integrating with Microsoft Purview for global policies.

Creating accessible data with no Silos 

Making the entire data estate easily accessible in OneLake without unnecessary data duplication.

SaaS Simplicity

Providing a suite of analytic engines in a secure, governed environment with single sign-on.

The article discusses the concept of open lakes for analytics, emphasizing the need for a unified view of data across an enterprise’s data estate to draw true insights. It highlights the advancements in big data tools, cloud storage, machine learning, and AI models, which offer opportunities to analyze core assets and processes through data in the Golden Age of Analytics.

The Microsoft implementation of the open lake vision with OneLake and Fabric focuses on data storage, analytics, sharing, and governance integrated with Microsoft Purview for data estate-wide governance. It outlines the importance of securing and governing enterprise data, detailing how OneLake and Fabric address these needs with built-in features and integration with Microsoft Purview for global data estate governance.

Governance at the organization and estate level, together with policy enforcement and sharing of data, is a core tenet. Governance within Fabric and OneLake covers organizational governance and estate-level governance, where Microsoft Purview provides a global view of the entire data estate, offering a central catalog for all assets across all sources, global policies to secure sensitive data, and support for managing critical data risks and regulatory compliance. Policy enforcement and data sharing are also discussed.

Thursday 25 April 2024

Data Governance, Data Ethics and Responsible AI video series

I wanted to share some thoughts on three of my favourite topics: Data Governance, Data Ethics and Responsible AI. There are many tools that help frame the subject area from a data management perspective, and there are useful Microsoft tools to help you down the Responsible AI and governance route. There is a wealth of information available, and I wanted each video, in under 5 minutes, to give people useful tips to move forward quickly in this important space. So it is an easily digestible, time-efficient series with standalone content and an overall theme.
  • Data Governance to help govern and manage that data to improve trust and data quality 
  • Data Ethics to help mitigate issues with data integrity and provenance
  • Responsible AI to look at bias, fairness and efficacy in decisions

Episode 1 Introduction

Episode 2 What is Data Governance

Episode 3 What is Data Ethics

Episode 4 What is Responsible AI

Episode 5 Responsible AI Tools Microsoft Standard v2

Episode 6 Responsible AI Tools Impact Assessment and guide

Episode 7 Responsible AI Tools HAX Toolkit

Episode 8 Responsible AI Tools Maturity Model

Episode 9 The EU Act

Episode 10 UK Government Assurance

Episode 11 Content Safety

Episode 12 Responsible AI Dashboard

Watch this space as the next set of videos will cover how this fits in with data quality and how Microsoft Purview can help with data preparation.

The Age of Data Governance

Microsoft Purview is rapidly changing in the data governance space. It is offering Data value creation with essential defense & response offense. This new addition helps businesses address the issue that AI outputs are only as good as the quality of the data behind them.

Peter Aiken's new definition of data governance is 'Managing data decisions with guidance'.


Suma Manohar has written a great article about data quality in the era of AI. Microsoft Purview introduced domains and data products, adding clear business context and terminology mapping. An enhanced search capability that uses Copilot to provide more understanding is available. Copilot can also help by suggesting data quality rules. These autogenerated rules are context specific.

Data quality rules created manually in Purview should follow the standard data quality rule types listed below.

  • Freshness – confirms that all values are up to date.
  • Duplicate rows – checks rows to find repeated values across two or more columns.
  • Empty/blank fields – looks for blank and empty fields in a column where there should be values.
  • Unique values – confirms that values in a column are unique.
  • Data type match – confirms that values in a column match data type requirements.
  • String format match – confirms that text values in a column match a specific format or other requirements.
  • Table lookup – confirms that a value in one table can be found in a specific column of another table.
  • Custom – create a custom rule with the visual expression builder.
  • Regular expressions can be used for pattern matching in the above.
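As a rough illustration of what a few of these rule types check, here is a minimal sketch in plain Python. It is not how Purview implements or exposes its rules. The function names, the sample rows and the email pattern are all hypothetical and exist only to show the idea behind each check.

```python
import re
from datetime import datetime, timedelta

# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"id": 1, "email": "a@example.com", "updated": datetime(2024, 4, 24)},
    {"id": 2, "email": "", "updated": datetime(2024, 4, 25)},
    {"id": 2, "email": "not-an-email", "updated": datetime(2023, 1, 1)},
]


def empty_fields(rows, column):
    """Empty/blank fields: count rows where the column is missing or blank."""
    return sum(1 for r in rows if r.get(column) in (None, ""))


def unique_values(rows, column):
    """Unique values: True when no value in the column repeats."""
    values = [r[column] for r in rows]
    return len(values) == len(set(values))


def string_format_failures(rows, column, pattern):
    """String format match: count non-blank values that fail the pattern."""
    regex = re.compile(pattern)
    return sum(1 for r in rows if r[column] and not regex.fullmatch(r[column]))


def stale_rows(rows, column, as_of, max_age):
    """Freshness: count rows older than max_age relative to as_of."""
    return sum(1 for r in rows if as_of - r[column] > max_age)


print(empty_fields(rows, "email"))
print(unique_values(rows, "id"))
print(string_format_failures(rows, "email", r"[^@\s]+@[^@\s]+\.[^@\s]+"))
print(stale_rows(rows, "updated", datetime(2024, 4, 25), timedelta(days=30)))
```

Each function returns a count or a flag rather than raising an error, so the results could feed a score instead of stopping a pipeline.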

When working on data quality there are standard guidelines that can help. I start with the DAMA-DMBOK and then use the Data Management Capability Assessment Model (DCAM).

Scans run to show quality scores and trends in the data quality dashboard, and the scores are also shown on the data product page.

The rollout of the new solution across the regions is shared here.

Tuesday 9 April 2024

Fabric Mirroring Overview

A new feature called Mirroring in Fabric was announced last year and has been developing since. It reached Public Preview in March 2024. It enables bringing your databases into Fabric.


Fabric mirroring is a feature within Microsoft Fabric that allows for seamless and real-time data replication from various databases into a centralized analytics platform known as OneLake. This process is designed to be frictionless, eliminating the need for complex Extract, Transform, Load (ETL) pipelines, which are traditionally used to move and transform data from one system to another.

The primary advantage of Fabric mirroring is its ability to provide near real-time insights by continuously updating the data in OneLake as changes occur in the source databases. It uses Change Data Capture (CDC) technology to capture and replicate data changes to OneLake, so that the data stays current and synchronized.

By mirroring data into OneLake, organizations can break down data silos and unify their data estate, allowing for more efficient data governance and analysis. The data which has been mirrored can be used for analytics with ease to perform various analytical tasks.

Fabric mirroring simplifies the data access process by allowing databases to be securely accessed and managed within Fabric without the need to switch database clients or install additional software. It is possible for a mirrored database to be cross joined with other databases, warehouses or lakehouses whether that be data in Azure Cosmos DB, Azure SQL DB, Snowflake, etc.   

In summary, fabric mirroring is a transformative feature that streamlines data replication and analysis, providing businesses with a modern, fast, and safe way to access and ingest data, thereby accelerating the journey to valuable insights and informed decision-making.

Further Reading

https://blog.fabric.microsoft.com/en-US/blog/announcing-the-public-preview-of-database-mirroring-in-microsoft-fabric/

https://learn.microsoft.com/en-us/fabric/database/mirrored-database/overview

https://aka.ms/FabricRoadmap

https://aka.ms/MirrorSQLDBPublicPreviewBlog

https://devblogs.microsoft.com/cosmosdb/public-preview-mirroring-azure-cosmos-db-in-microsoft-fabric

Unify your data across domains, clouds, and engines in OneLake

Wednesday 3 April 2024

Microsoft Purview Fabric announcements

There were a number of announcements at the Microsoft Fabric Community Conference, including the new Microsoft Purview for modern data governance. With business moving towards federated governance models, managed by line of business to help with more local understanding and increasing volumes of data, Microsoft have launched in Purview the capability for organizations to create subdomains to refine the way the data estate is structured in Fabric. Security has also become easier with the ability to set security groups for default domains.

Microsoft Fabric is now natively integrated with the Microsoft Purview Data Governance solution. There is a reimagined data governance experience for the data estate governance practice. The new experience includes data curation, an important new feature that brings data quality with insights. The new experience is available in preview from 8 April 2024. It aims to help accelerate measurable business value with key results, to simplify, and to improve efficiency with natural language recommendations.

Purview enables business terminology to be linked to:

  • Data Products (a collection of data assets used for a business function) 
  • Business Domains (ownership of Data Products) 
  • Data Quality (assessment of quality) 
  • Data Access, Actions 
  • Data Estate Health (reports and insights)

A really exciting new feature we have all been waiting for is the data quality capabilities. There is now a Data Quality model to set rules top down across business domains, data products, and data assets. The model generates data quality scores at the asset, data product, or business domain level from the policies on terms or rules. The scores show on the dashboard as red/yellow/green indicators. The 2 capabilities in this data quality model are:

  • Profiling—quick sample set insights 
  • Data quality scans—in-depth scans of full data sets
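To picture how scores could roll up from assets to data products to business domains, here is a small sketch. Purview's actual scoring formula and colour thresholds are not described in this post, so the simple average and the 80/50 cut-offs below are assumptions for illustration only, as are the domain, product and asset names.

```python
from statistics import mean

# Hypothetical scores (0-100) per data asset, grouped by data product
# inside a business domain. None of these names come from Purview.
domains = {
    "Finance": {
        "Invoices": {"invoice_header": 92.0, "invoice_lines": 78.0},
        "Payments": {"payments": 45.0},
    },
}


def indicator(score, green=80.0, yellow=50.0):
    """Map a score to red/yellow/green using assumed thresholds."""
    if score >= green:
        return "green"
    if score >= yellow:
        return "yellow"
    return "red"


def product_score(assets):
    """Assumed roll-up: a data product scores the mean of its assets."""
    return mean(assets.values())


def domain_score(products):
    """Assumed roll-up: a domain scores the mean of its data products."""
    return mean(product_score(assets) for assets in products.values())


for domain, products in domains.items():
    for product, assets in products.items():
        score = product_score(assets)
        print(domain, product, score, indicator(score))
    print(domain, domain_score(products), indicator(domain_score(products)))
```

A weighted roll-up (for example by row count) would be an equally plausible assumption; the point is only that one set of asset scores can drive indicators at every level.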

It is great to see that Microsoft Purview continues to align to the EDM Council set of 14 rules.

There is now an actions centre showing the current health and summarising actions by role, data product or business domain for governance. This actions centre aims to help improve the governance posture of the business.

There is a partnership with Ernst & Young LLP, who will share playbooks and reports for US financial services customers on Azure Marketplace throughout the preview.


In summary there is a shift away from traditional IT-centric data architecture to federated architectures such as data mesh. The automated way to deal with Data Quality is a game changer for business. 

References

Announcements from the Microsoft Fabric Community Conference

Easily implement data mesh architecture with domains in Fabric

Introducing modern data governance for the era of AI 

The foundation for responsible analytics with Microsoft Purview

Watch: The Unified Data Platform for the Era Of AI | Microsoft Fabric Community Conference Day 1 Keynote

Crash Course in Microsoft Purview (azureedge.net)

Learning

Monday 1 April 2024

Responsible AI dashboard training

There is a new MS Learn course on how to debug an AI model using the Responsible AI dashboard in Azure Machine Learning studio, to ensure the model performs responsibly and is less harmful. It is important to understand and learn how to use the dashboard to set any project up for success.

Train a model and debug it with Responsible AI dashboard

The objectives are 

  • Create a responsible AI dashboard.
  • Identify where the model has errors.
  • Discover data over or under representation to mitigate biases.
  • Understand what drives a model outcome with explainability and interpretability.
  • Mitigate issues to meet compliance regulation requirements.

You do need the ability to understand beginner level Python.





Saturday 30 March 2024

The Fabric Conference 2024

The first Microsoft Fabric Community Conference took place from 26 to 28 March 2024 at the MGM Grand in Las Vegas, Nevada. It was an in-person only conference and no sessions were recorded or streamed. It is great to see so many people back at in-person conferences, although for those not able to attend it means limited learning.



The conference had more than 130 sessions covering various aspects of Microsoft Fabric from data warehousing to data movement, AI, real-time analytics, and business intelligence.

The Microsoft Intelligent Data Platform incorporates Microsoft Fabric, a suite of technologies that empowers organizations to harness the full power of their data. By natively integrating products across four critical workloads (AI, analytics, database, and security), organizations can innovate without limits. The great advantage of Fabric is that it brings together disconnected services from multiple vendors to focus on accelerating transformation.

The four core promises of Fabric:

  • Fabric is a complete platform
  • Fabric is lake-centric and open
  • Fabric can empower every business user
  • Fabric is AI powered

There was a huge number of announcements, and they represent just the start of the innovation in the Microsoft Fabric platform. The full set of announcements is here. I will share separately about all the Purview announcements as these will add the depth we need to drive forward with AI. I will also blog separately about a number of other features as they change how Fabric is growing.
Mirroring in Fabric is a great addition to help with the data warehouse journey to Fabric.  
Announcing the Public Preview of Mirroring in Microsoft Fabric

Creating folders and subfolders in workspaces, and being able to tag Fabric items in future, will be a huge plus for compliance.
Announcing Folder in Workspace in Public Preview

Microsoft Fabric has a release plan that is documented.

Next events to look out for

Microsoft Build from 21-23 May 2024 is either in person in Seattle, Washington, or online. 

PASS Summit Community Conference  4-8 November 2024

Excited that there will be a second Fabric Conference next year, 1-3 April 2025, at the MGM Grand, Las Vegas https://aka.ms/FabCon25

Wednesday 27 March 2024

Responsible AI Day at Microsoft

Today I took the opportunity to attend a Microsoft UK  Partner Responsible AI day at TVP in Reading. Thank you to Robin Lester and the RAI team for putting on an informative day of sessions.  I also got the opportunity to speak to Claire Dugan, a Responsible AI Advocate at Microsoft UK to discuss Governance and AI.

There are several places to get started with learning about the tools that are available to create impact assessments for projects.

FOUNDATION GUIDES 
It is necessary to adopt principles to create safe and explainable systems to ensure fair, transparent and safe systems are designed and deployed.  More details can be found




Monday 25 March 2024

The SQLBits 2024 Event

SQLBits 2024 was amazing as ever. The organisers created another well-choreographed event. It was held in Farnborough, the birthplace of British aviation.

There was a huge number of tracks including all types of sessions.  

The Sessions 

The agenda covered various types of sessions

  • Tuesday Training Day
  • Wednesday 100 minute sessions to gain more depth into a variety of areas
  • Thursday General Sessions Day One
  • Friday General Sessions Day Two
  • The Free Saturday

SQLBits Extra Events

There was a wide range of extracurricular things to get involved with to enable a different slant on networking:

  • Meet the Trainer Monday 18th March, 6.30pm, The Aviator Hotel
  • Welcome Drinks and Burgers & Board Games Night Wednesday 20th March, 6pm
  • Ask the Experts Wednesday 20th March - Saturday 23rd March
  • The SQLBits Run Wednesday 20th March, 6pm & Friday 22nd March, 6am
  • User Group Bonus Sessions Thursday 21st March, 6pm
  • The Pub Quiz Thursday 21st March, 7.30pm
  • The Friday Night Party Friday 22nd March, 7.30pm

The Keynote

This was delivered by a number of speakers.


SQLBits announcements


Public Preview:  Managed Instance General Purpose Next-Gen; Migration Assessment in Azure Arc; Database Watcher.

Private Preview : T-SQL Regex; Copilot in Azure SQL Database



Learn more: 

Introducing Azure SQL Managed Instance Next-gen GP

Introducing database watcher for Azure SQL

Azure SQL migration assessment enabled by Azure Arc

Introducing Copilot in Azure SQL Database


Sunday 24 March 2024

SQLBits Buddies 2024

I was part of a team of helpers at SQLBits who are Bits Buddies. We are all experienced helpers and have attended lots of SQLBits. We are dedicated to helping attendees who might want a bit of extra company and support, whether it’s a person’s first time at the event or they are a regular attendee.

We ran pre-event meet-up opportunities in the run-up to SQLBits. These were weekly drop-ins for delegates and those interested in attending, for an informal chat about the experience of attending SQLBits and to make connections before the event. It was nice to meet a few people before the event. This year the Bits Buddies wore orange hats so we could be seen easily around the venue. It was really nice to speak to people at the event and help with questions. Till next year.



Saturday 23 March 2024

Data Toboggan Slide Preparation at SQLBits

This year, on Thursday 21 March 2024, SQLBits added User Group Bonus Sessions. They were announced as follows:

After the main sessions, two UK user groups are running sessions that you’re welcome to join:

-  London Fabric User Group - SQL Bits Special - Ask Me Anything Panel (Gate 4)

-  Data Toboggan - Ask the Fabric Experts (Gate 1)

Running from 18:00 - 19:00, each with a panel of experts ready to answer questions or discuss hot topics.

Data Toboggan Slide Preparation ran its FIRST in person User Group hosted by Richard Munn at SQLBits.

The panelists were Richard Munn, Dr Victoria Holt, Cathrine Wilhelmsen, Mark Pryce-Maher, Emilie Rønning and Andy Cutler, taking questions from the audience.

James Reeves reported 

'Data Toboggan User Group Celebrates First In-Person Meetup

🎉 The Data Toboggan user community recently celebrated an exciting milestone: their first-ever in-person meetup. After connecting and collaborating online, members finally had the chance to gather face-to-face and connect with fellow data enthusiasts who share a passion for uncovering insights through data.

💡 A highlight of the gathering was a session focused on Microsoft Fabric, a comprehensive analytics and data platform. Attendees engaged in a lively discussion about how tools like Microsoft Fabric are revolutionizing the field of data analytics and shaping the future of the industry.

🙌 The organizers expressed their gratitude to everyone who made the meetup possible and to all who participated. The energy, enthusiasm, and sense of community at the event were truly remarkable. They look forward to more opportunities for the Data Toboggan user group to connect, both virtually and in person.' 



Saturday 16 March 2024

MVP Global Summit 2024

MVP Summit took place in person in Seattle or virtually this year from 12-14 March 2024. I attended virtually this year due to SQLBits being the following week. It was good to catch up with people, engage in learning and share thoughts on new technology. It is always an amazing privilege to be a part of this community, which continually shares knowledge to help everyone grow and learn. The image of me was created using the prompt below in Microsoft Designer.





Friday 8 March 2024

Data Toboggan Slide Preparation: Ask The Fabric Experts at SQLBits


We are really excited to announce that the Data Toboggan user group Slide Preparation will be holding its first ever 'in-person' event at #SQLBits thanks to their amazing community ethos.
Date: Thursday 21 March 2024

If you're going to be at SQL Bits for the conference anyway just come along, but if you're not, there's a 'User Group Only' option on the SQL Bits event registration page - just select 'User Group Attendee - Thursday Evening', fill in your details, and select the Data Toboggan event on the next page. Registration is at https://events.sqlbits.com/2024

We’re thrilled and thankful to SQL Bits for letting us put on our first ever in-person meeting ! Join us for an ‘Ask Us Anything’ panel session - we’ve got some great contributors lined up and ready to answer your questions on anything #MSFabric, from Data Engineering to Governance, we’ve got you covered. 



International Women's Day 2024

 

International Women’s Day 2024 celebrates women’s achievements, progress, and equality. The official campaign theme for International Women’s Day 2024 is 'Inspire Inclusion'. When we inspire others to understand and value women’s inclusion, we create a better world.

Historical Roots

The first International Women’s Day (IWD) was held in March 1911.

IWD transcends borders, organizations, and groups—it’s a day of collective global activism.

World-renowned feminist Gloria Steinem once emphasized that the struggle for equality belongs to all who care about human rights.



Investing in Women: Accelerating Progress

The overarching theme for 2024 is 'Invest in women: Accelerate progress'. It underscores the importance of creating an inclusive society and empowering women. IWD celebrates the social, economic, cultural, and political achievements of women.

The idea of IWD traces back to the 1908 labour movement in New York. Women garment workers marched, demanding better pay, shorter working hours, and voting rights. The movement was spearheaded by the Socialist Party of America.

Just a few of the remarkable women who have made significant contributions to science and technology:

Ada Lovelace

Born in 1815, she was the world’s first computer programmer. She collaborated with Charles Babbage on the Analytical Engine, creating the first algorithm intended for implementation on this early mechanical computer.

Grace Hopper

She was a trailblazing computer scientist who invented the compiler. Her work led to the development of the high-level programming language COBOL, which revolutionized software development and paved the way for modern programming languages.

Tiera Guinn

A 21-year-old aerospace major at MIT. Working on building a powerful rocket for NASA. Inspiring others with her determination and vision.

Marie Curie

Pioneered research in radioactivity. The first woman to win the Nobel Prize (jointly with her husband) in 1903.

Elizabeth Blackwell

First woman to graduate from medical school in the US. She founded a medical school for women in England.

Dr. Mae C. Jemison

First African American woman in space. She holds degrees in chemical engineering and medicine. She served as a Peace Corps medical officer.

Caroline Herschel

Caroline Lucretia Herschel, born in 1750, was a German-born British astronomer whose most significant contributions to astronomy were the discoveries of several comets, including the periodic comet 35P/Herschel–Rigollet, which bears her name.

Williamina Fleming

Cracked the secrets of the universe with computation. Worked at the Harvard College Observatory in the late 1800s.

These women have left an indelible mark on science and technology, inspiring generations to come.

Friday 1 March 2024

Mirroring in Microsoft Fabric

Mirroring in Fabric was announced in November as coming soon. When I first heard the term I immediately thought of the deprecated SQL Server database mirroring feature. However the summary from Ignite shared on the MSSQLTips site



There are a few capabilities announced so far

Real-Time Data Replication

No complex setup or ETL processes. Data is replicated reliably and in real-time.

An initial snapshot captures the data, followed by continuous synchronization with every transaction (inserts, updates, deletes).

Mirroring uses Change Data Capture (CDC) technology, transforming the captured changes into appropriate Delta tables and landing them in OneLake.

Intelligent logic ensures efficient replication without unnecessary compute usage.
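The snapshot-then-synchronise pattern described above can be sketched in a few lines of Python. This is a conceptual illustration only, not Fabric's implementation: the change record shape (op, key, row) and the apply_changes helper are hypothetical, and a plain dictionary stands in for the Delta table in OneLake.

```python
def apply_changes(replica, changes):
    """Replay captured changes, in order, against a keyed replica."""
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            # Inserts and updates both leave the latest row image in place.
            replica[key] = change["row"]
        elif op == "delete":
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return replica


# Step 1: the initial snapshot seeds the replica.
snapshot = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
replica = dict(snapshot)

# Step 2: changes captured from the source after the snapshot are replayed.
changes = [
    {"op": "insert", "key": 3, "row": {"name": "Mae"}},
    {"op": "update", "key": 1, "row": {"name": "Ada Lovelace"}},
    {"op": "delete", "key": 2},
]
apply_changes(replica, changes)
print(replica)
```

Order matters here: replaying the same changes in a different sequence could leave the replica in a different state, which is why change capture preserves the order of transactions.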

Access and Management

Any database can be accessed and managed centrally within Fabric.

By providing connection details, your database becomes instantly available as a Mirrored database.

Familiar database editors allow seamless management.

Data Warehousing Simplified

Each Mirrored database includes default data warehousing experiences via a SQL Analytics Endpoint.

Whether you are a SQL developer or a citizen developer, querying is easy using the T-SQL editor with full IntelliSense or the visual query editor.

What's next

Initially the article proposes Azure Cosmos DB, Azure SQL DB and Snowflake will be able to use mirroring. You can read more here.

Wednesday 21 February 2024

EU AI Act, the first extensive AI regulation globally, is approved

The European Union (EU) has been working on a new legal framework that aims to regulate the development and use of artificial intelligence (AI) in the EU. The proposed legislation, the Artificial Intelligence (AI) Act, focuses on ensuring that AI systems are trustworthy, respect human values and rights, and support the EU single market.

The AI Act introduces a risk-based approach to classify AI systems into four categories: unacceptable, high-risk, limited-risk, and minimal-risk. Unacceptable AI systems are those that pose a clear threat to the safety, livelihoods, or rights of people, such as social scoring or mass surveillance. High-risk AI systems are those that are used in critical sectors, such as healthcare, education, or law enforcement, and have a significant impact on people’s lives, such as medical devices, recruitment tools, or facial recognition. Limited-risk AI systems are those that pose some risks to people’s rights or expectations, such as chatbots, online advertising, or deepfakes. Minimal-risk AI systems are those that pose no or negligible risks to people, such as video games, spam filters, or smart appliances.

The AI Act imposes different obligations and requirements for each category of AI systems. Unacceptable AI systems are banned from being developed, sold, or used in the EU. High-risk AI systems must comply with strict rules on data quality, transparency, human oversight, accuracy, security, and accountability. They must also undergo a conformity assessment before being placed on the market or put into service. Limited-risk AI systems must provide clear and adequate information to users about their nature, purpose, and capabilities. Minimal-risk AI systems are subject to voluntary codes of conduct and best practices.
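For anyone keeping an inventory of AI systems, the four tiers and their obligations as summarised above can be held in a simple lookup. The sketch below is only a restatement of this post's summary in Python, not legal guidance. The RiskTier and OBLIGATIONS names and the may_be_deployed helper are hypothetical.

```python
from enum import Enum


class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"
    HIGH = "high-risk"
    LIMITED = "limited-risk"
    MINIMAL = "minimal-risk"


# Obligations as summarised in the paragraph above, not the legal text.
OBLIGATIONS = {
    RiskTier.UNACCEPTABLE: [
        "banned from being developed, sold, or used in the EU",
    ],
    RiskTier.HIGH: [
        "strict rules on data quality, transparency, human oversight, "
        "accuracy, security, and accountability",
        "conformity assessment before market placement or use",
    ],
    RiskTier.LIMITED: [
        "clear information to users about nature, purpose, and capabilities",
    ],
    RiskTier.MINIMAL: [
        "voluntary codes of conduct and best practices",
    ],
}


def may_be_deployed(tier):
    """Only the unacceptable tier is banned outright in this summary."""
    return tier is not RiskTier.UNACCEPTABLE


for tier in RiskTier:
    print(tier.value, may_be_deployed(tier), OBLIGATIONS[tier])
```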

The AI Act also establishes a governance structure and a cooperation mechanism for the implementation and enforcement of the rules. The European Commission will be responsible for monitoring and updating the list of high-risk AI systems and sectors, as well as adopting delegated and implementing acts. The European AI Board will be an independent advisory body that will provide guidance and recommendations to the Commission and the member states. The national competent authorities will be in charge of supervising and sanctioning the compliance of AI systems with the rules, as well as ensuring cross-border cooperation.

The AI Act is a landmark proposal that aims to make the EU a global leader in trustworthy and human-centric AI. However, it also faces some challenges and criticisms from various stakeholders, such as industry, civil society, and other countries. The AI Act will need to balance the interests and concerns of different actors, as well as adapt to the fast-changing nature of AI.

There is a pyramid of risk.



References

https://www.weforum.org/agenda/2023/06/european-union-ai-act-explained/

https://www.bbc.com/news/world-europe-67668469

https://cset.georgetown.edu/article/the-eu-ai-act-a-primer/

https://www.finextra.com/the-long-read/847/what-is-the-eu-ai-act-understanding-europes-first-regulation-on-artificial-intelligence

Thursday 15 February 2024

Fundamentals of Generative AI

 Generative AI is a branch of artificial intelligence that focuses on creating new content or data from scratch, such as images, text, music, or code. Generative AI models learn from existing data and use it to generate novel and realistic outputs that are not part of the original data. Some of the applications of generative AI include:

Image synthesis: Generative AI can create realistic images of faces, landscapes, animals, or objects that do not exist in the real world. 

Text generation: Generative AI can produce natural language texts on various topics, such as stories, poems, essays, or code. 

Music composition: Generative AI can compose original music in different genres, styles, and moods. 

Data augmentation: Generative AI can enhance or expand existing data sets by creating new samples that are similar but not identical to the original ones. This can help improve the performance and robustness of machine learning models. For example, generative AI can create new images of handwritten digits or new sentences of natural language.

The main challenge of generative AI is to ensure that the generated outputs are both diverse and realistic, meaning that they cover a wide range of possibilities and resemble the real data. To achieve this, generative AI models often use two types of techniques:

Probabilistic models: These are models that learn the probability distribution of the data and sample from it to generate new outputs. For example, variational autoencoders (VAEs) are probabilistic models that encode the data into a latent space and decode it back into the original space, adding some noise in the process to create variations.

Adversarial models: These are models that consist of two components: a generator and a discriminator. The generator tries to create outputs that fool the discriminator, while the discriminator tries to distinguish between real and fake outputs. The two components compete with each other and improve over time. For example, generative adversarial networks (GANs) are adversarial models that use neural networks as the generator and the discriminator.
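To make the probabilistic idea concrete, here is a minimal sketch in Python. It is not a VAE or a GAN; it swaps in the simplest possible probabilistic model, a single Gaussian, purely to illustrate "learn the distribution, then sample from it". The function names fit_gaussian and generate, and the training values, are made up for this illustration.

```python
import random
import statistics


def fit_gaussian(samples):
    """'Learn' the data distribution by estimating its mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def generate(mean, stdev, n, seed=None):
    """Generate n new values by sampling from the fitted Gaussian."""
    rng = random.Random(seed)  # a seed makes the sampling repeatable
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Toy training data: heights in centimetres (made-up values for illustration).
training_data = [158.0, 162.5, 165.0, 170.2, 171.8, 175.5, 180.1, 183.4]

mean, stdev = fit_gaussian(training_data)
new_samples = generate(mean, stdev, n=5, seed=42)

# The new values resemble the training data but are drawn from the model.
print(new_samples)
```

Real generative models follow the same two steps, but replace the single Gaussian with a neural network that can capture far richer distributions, such as images or text.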

Generative AI is a fascinating and rapidly evolving field of artificial intelligence that has many potential benefits and applications for society. However, it also poses some ethical and social risks, such as misuse, deception, or bias. Therefore, it is important to develop and use generative AI models responsibly and transparently, with respect for human values and rights. 

To learn more about the Fundamentals of Generative AI, Microsoft Learn has a great course.

It also covers the Azure OpenAI Service, Microsoft's cloud solution for deploying, customizing, and hosting large language models, and gives a brief overview of Copilot.



Thursday 1 February 2024

Data Toboggan - Purview in Microsoft Fabric

Excited to be speaking at Data Toboggan 

Event Date: 3rd February 2024 

Register now for free: https://bit.ly/DT24-Register

Agenda: https://bit.ly/DT24-Agenda

Abstract

Microsoft Fabric comes with Purview for data governance. What does that mean, and how can it help with managing your data estate? This session connects the dots between the old and the new, and explains which of the apps exist in Fabric.



Data Toboggan Winter Edition 2024

Please join us on Saturday for the #DataToboggan winter edition. We have 32 speakers and 3 tracks, including an AI track. Lots of fun and learning. We also have the amazing Knee-deep in Tech, not to be missed, and a keynote from Kim Manis.

Event Date: Saturday 3rd February 2024 

Register now for free: https://bit.ly/DT24-Register

Agenda: https://bit.ly/DT24-Agenda





#azuresynapse #microsoftfabric #synapseanalytics #AI #artificialintelligence #copilot #openai 

Wednesday 31 January 2024

Generative AI framework for Government

The UK government has released version 1.0 of its Generative AI Framework, created by the Central Digital and Data Office. This is public sector guidance with a focus on Large Language Models (LLMs).

The framework outlines ten principles:

Principle 1: You know what generative AI is and what its limitations are

Principle 2: You use generative AI lawfully, ethically and responsibly

Principle 3: You know how to keep generative AI tools secure

Principle 4: You have meaningful human control at the right stage

Principle 5: You understand how to manage the full generative AI lifecycle

Principle 6: You use the right tool for the job

Principle 7: You are open and collaborative

Principle 8: You work with commercial colleagues from the start

Principle 9: You have the skills and expertise that you need to build and use generative AI

Principle 10: You use these principles alongside your organisation’s policies and have the right assurance in place

It defines Generative AI as a form of AI '– a broad field which aims to use computers to emulate the products of human intelligence or to build capabilities which go beyond human intelligence'

It then shows how public LLMs fit within the field of generative AI.

The framework has lots of information to draw on, from advocating for lawful, ethical, and responsible usage to addressing the challenges of accuracy, bias and environmental impact. Transparency and human control are paramount going forward. You can read more here.