Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing." Einstein

Thursday, 16 May 2019

West Women Awards Ceremony

Proud to be listed in the top 100 most inspiring women in the region. The awards ceremony tonight in Bristol helps promote diversity and inclusion in the workplace, which is fundamental to providing an environment for innovation and to enabling companies to lead. It also creates a culture that can enable everyone to aspire to follow their dreams.

Thursday, 9 May 2019

Monday, 6 May 2019

Google AI training data set

Google has released an AI training data set with 5 million images and 200,000 landmarks. The open-sourced Google-Landmarks-v2 is a larger landmark recognition corpus than its predecessor. Google has also launched two new challenges, Landmark Recognition 2019 and Landmark Retrieval 2019, on Kaggle.

Tuesday, 30 April 2019

Azure Open Datasets

Azure Open Datasets are curated public datasets that can be used to add scenario-specific features to machine learning solutions for more accurate models. Open Datasets are on Microsoft Azure and are available to Azure Databricks, Machine Learning service, and Machine Learning Studio. Access to the datasets is through the APIs and other products, such as Power BI and Azure Data Factory.

Sunday, 28 April 2019

Microsoft Build is coming

It is that time of year again and I am looking forward to seeing what announcements are going to be made at Microsoft Build 2019. Microsoft Build explores the latest developer tools and technologies.

Thursday, 25 April 2019

Spark+AI Summit 2019

The Spark+AI Summit shared a lot of announcements. The open source announcements were:

Koalas - the pandas DataFrame API on Apache Spark

The open sourcing of Databricks Delta as Delta Lake. Delta dramatically simplifies building reliable data lakes on HDFS and cloud storage with ACID transactions, indexes and scalable metadata handling.

Microsoft is joining the MLflow project and adding MLflow APIs in Azure ML.

Rohan Kumar of Microsoft announced .NET for Apache Spark, making Apache Spark accessible to .NET developers. It is available on GitHub.

Spark 3.0 is expected later in the year.

The keynote videos are all online now and other session videos will be there in about 2 weeks.

Saturday, 6 April 2019

Data in Devon

Data in Devon was previously SQL Saturday Exeter. It is a great community conference in the South West. It is at Jurys Inn Exeter, Western Way, Exeter, EX1 2DB and it is free to attend. There is also a day of in-depth technical training sessions. Register for a Data in Devon Training Day session on Friday 26th April. The options are:
  • BI in Azure - Alex Whittles MVP
  • Infrastructure as Code with Terraform - John Martin MVP
  • Machine Learning: From model to production using the cloud, containers and DevOps - Terry McCann MVP
  • Getting up to speed with PowerShell - Rob Sewell MVP
The Saturday schedule on 27 April also includes a track for the Global Azure Bootcamp. The Global Azure Bootcamps are held all around the world on 27 April for communities that want to learn about Azure and the Cloud. This is the sixth Global Azure Bootcamp event.

Monday, 1 April 2019

My First MVP Summit

I had an amazing time at my first MVP Global Summit.  The MVP Global Summit was hosted in Bellevue and at the Microsoft headquarters in Redmond, Washington. It featured a large catalog of in-depth technical discussions and feedback sessions combined with networking opportunities among fellow MVPs and the Microsoft product groups.
It was held the week of 17 March 2019. There were community pre-day sessions on Sunday 17 March and the product group technical sessions ran from Monday 18 March until Wednesday 20 March. On Thursday 21 March and Friday 22 March the Power BI and Azure product teams hosted additional sessions and workshops on campus.

As a first-time attendee I didn’t know anything about the conference. I had to find out about transfers between the airport and the hotel, get a map of the Microsoft campus and download a few apps, such as Uber and the event mobile app. I used the event app to select the sessions I wanted to attend for my award category and to receive conference updates while I was there.

The conference hotels were in Bellevue and I stayed in the main hotel, the Hyatt Regency. The sessions on the Sunday and a number of evening events were all held in the Hyatt.  

Every day there were buses that took MVPs from the hotels to the main conference centre at Redmond. These ran regularly throughout the five days. From the conference centre there were transfer buses that could take you to any of the other buildings. Some of the buildings were within a 5 minute walk. With the weather like summer it was a pleasant stroll through the tree-lined, undulating campus roads and paths between the buildings. By one of the buildings were the three tree houses, which can be used for meetings.

The Microsoft Store was in another part of the campus. During the week I also had to travel to another building, Advanta, which was 15 minutes away from the main campus.

The events on Sunday covered important topics such as diversity and inclusion and how to improve your presentation skills. It was a very helpful day to focus on soft skills. I was also very grateful to have a new professional headshot taken for LinkedIn. Then came five days of sessions with the different product groups. There was so much content to learn and absorb. In the evenings there was plenty of time to network with other MVPs and the product groups. One evening meetup was with the MVP leads, and a couple were with the Data Platform product group on Tuesday and Thursday on campus. Then there was the main attendee celebration on Wednesday to celebrate one global community. Throughout the whole event there were opportunities to network, meet new people, catch up with people I knew and discuss data platform things. It was also a great opportunity to have a Data Relay team meeting.

It is such an amazing privilege to be a part of this community with so many amazing people.

Saturday, 16 March 2019

Data Relay Session Submission is Open

It is that time of year again already. Data Relay session submission is open. There is a great blog post talking about why to speak at Data Relay. Submit your sessions and start your journey.

Thursday, 14 March 2019

Inspirational West Women of the Year 2019

I have the privilege to have been chosen to be in the 'Top 100 Inspirational Women in the West' for 2019.

West Women of the Year is to recognize the women in our region who continue to champion gender equality a hundred years on, in celebration of a century of women's suffrage in Britain. The award shares the stories of inspiring, dedicated and high achieving women from all walks of life who are making a difference in their workplaces and communities. The event webpage gives more information.

The articles about the top 100 Inspirational Women in the West for 2019 were posted in the Bristol Post, Somerset Live and Gloucestershire Live.

There are various categories and the winners of each category are selected by the judges apart from the people’s choice.

For one of the awards there is an online poll. That is for 'The People's Choice' category for the 2019 West Women Awards. The voting form is available here and it would be amazing to get some people to vote for me. I'm listed under Dr Victoria Holt. Please note that the poll will be open until Thursday 25th April 2019.

Sunday, 10 March 2019

Women in Data Science (WiDS) Scotland

Stanford University's Women in Data Science (WiDS) initiative, The Data Lab and Turing's Testers, along with their primary sponsor Mudano, have created an event that celebrates women, tech, innovation and codebreaking! This event brings together women data scientists and school girls to showcase what a data career looks like, and inspire the female data leaders of the future. The aim is to inspire school girls to consider STEM and data related careers by bringing the girls together with inspiring women working in the field of data science and to expose them to some fun activities that are powered by data. I have the privilege of attending the event, on behalf of my employer CGI, to participate as a mentor and to speak to the girls to share the wonders of working with data.

Women in Data Science is on 11 March 2019 at the National Museum of Scotland in Edinburgh. It is one of the fringe events of DataFest, the UK’s first two week festival of Data Innovation, which runs in Scotland from the 11th to 22nd March 2019 and is in its third year. DataFest will showcase Scotland's leading role in data science and artificial intelligence, with networking across industry, academia and data enthusiasts.

The Event details:

The Cyber Treasure Hunt has been created by Turing's Testers, a group of motivated pupils and STEM ambassadors; inspiring, engaging and supporting girls into the technology sector. This has been running over a number of months with codes being released every few weeks. For a school to gain invitation to this event they must crack the codes. This event will be the final code cracking session with the winner being announced at the end of the day.

The event will be broken into a number of sessions and vary between workshops and talks. Talks will be led by various female thought leaders from the world of tech, including none other than Hannah Fry.

Attendee spaces are limited at this event. We expect tickets to sell out quickly, however we will have a waitlist and will inform you if you have secured a place. Attendees will have a hands-on role as part of this event and will be asked to help out with mentoring and guidance to each of the groups throughout the day.

This event is also part of DataFest, kicking off proceedings on the first day of the two week festival of data science.


10.00 - Registration
10.30 - First Rotation of Workshops and Talks
12.00 - Lunch
12.40 - Second Rotation of Workshops and Talks
14.10 - Prize Giving
15.00 - Closing Comments

Tuesday, 5 March 2019

International Women's Day

International Women's Day is fast approaching. It is celebrated on 8 March every year. It is a celebration of women globally. It is a chance to network, be inspired and share your stories to empower each other.
The Official UN theme for 2019 is Think Equal, Build Smart, Innovate for Change. The theme will focus on innovative ways to advance gender equality and the empowerment of women, particularly in the areas of social protection systems, access to public services and sustainable infrastructure.

I am excited to be contributing to a webinar for International Women's Day 2019. The webinar discusses women's roles in the field of technology.

Sunday, 3 March 2019

Tree of Learning sculpture

The Tree of Learning sculpture is a stunning celebration of the last 50 years of The Open University and an inspiration for the future. As part of the 50th Anniversary celebrations the “Tree of Learning” sculpture is being created and will be installed on campus later in the Anniversary year. It will consist of hundreds of individually personalised gold-coloured OU logo-shaped shields hung as leaves on the tree.
If you’ve studied with, worked with or been involved in some way with The Open University, it is a great way to still be a part of the amazing story to come. I am proud to have been a part of The Open University for over 13 years.

Saturday, 2 March 2019

SQLBits 2019 Keynote

What an amazing SQLBits in Manchester. Four days packed full of leading edge data technology covering:

  • SQL Server 2019 Big Data
  • Azure SQL Database Managed Instance
  • Power BI
  • Kubernetes
  • Machine Learning
  • Python
  • Spark

This year SQLBits 2019 had a keynote. It was nice for the event to have a keynote again. The theme was Data Never Rests. The speakers from the Microsoft Data Platform product group were Buck Woody, Bob Ward, Anna Thomas, Alain Dormehl, Adam Saxton and Patrick LeBlanc. It was an amazing set of speakers and a fun keynote. They shared details of the evolution of the data platform to enable people to keep their skills up to date. The keynote is available to watch. There were several major announcements.

SQL Server 2019 will RTM in the second half of the year. SQL Server 2019 CTP2.3 is available now with:
  • Big data cluster enhancements
  • Accelerated database recovery
  • Performance enhancements
  • Graph data enhancements
  • SSAS enhancements

SQL Server 2019 is a modern innovation and the SQL engine is available in various forms:
  • On Premises
  • SQL Server Azure VM (IaaS)
  • Azure SQL DB Managed Instance (PaaS)
  • Azure SQL Data Warehouse

Azure SQL Database Hyperscale can autoscale up to 100TB and scale compute and storage independently.
During the keynote they showed Azure SQL Database Hyperscale, where a 50TB database was restored in just under 8 minutes. That is an impressively fast restore.

Data virtualization and big data clusters are game changing in SQL Server 2019, bringing together data lake scale, machine learning and AI. Multiple data sources can be connected using external tables, through the compute pool, using PolyBase connectors at the source. Data from multiple sources can be persisted in shards of the data pool to form a data mart in SQL Server 2019 big data clusters.

SQL Server 2019 will push query predicates down to other data platforms via PolyBase, so that SQL data can be joined with Oracle, MongoDB and Cosmos DB data in one place efficiently.
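To show why predicate pushdown matters, here is a minimal Python sketch of the general idea. It is not how PolyBase is implemented and it does not use any SQL Server API. The `RemoteSource` class, the `join_orders` function and the sample data are all hypothetical, made up purely for illustration. The sketch counts how many rows leave the remote platform with and without the filter being applied at the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]


@dataclass
class RemoteSource:
    """Hypothetical stand-in for an external platform (not a real connector)."""
    rows: List[Row]
    rows_sent: int = 0  # number of rows that crossed the "network"

    def scan(self, predicate: Optional[Callable[[Row], bool]] = None) -> List[Row]:
        # With a predicate, filtering happens at the source,
        # so only the matching rows are transferred.
        result = [r for r in self.rows if predicate is None or predicate(r)]
        self.rows_sent += len(result)
        return result


def join_orders(customers: List[Row], remote: RemoteSource,
                pushdown: bool) -> List[Tuple[str, int]]:
    """Join local customers with remote orders where amount > 100."""
    def wanted(r: Row) -> bool:
        return r["amount"] > 100

    if pushdown:
        orders = remote.scan(wanted)  # filter evaluated remotely
    else:
        orders = [r for r in remote.scan() if wanted(r)]  # filter evaluated locally
    names = {c["id"]: c["name"] for c in customers}
    return [(names[o["customer_id"]], o["amount"])
            for o in orders if o["customer_id"] in names]


# Assumed sample data for illustration only.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
orders = [
    {"customer_id": 1, "amount": 50},
    {"customer_id": 1, "amount": 150},
    {"customer_id": 2, "amount": 500},
    {"customer_id": 3, "amount": 900},
]

naive = RemoteSource(list(orders))
pushed = RemoteSource(list(orders))
naive_result = join_orders(customers, naive, pushdown=False)
pushed_result = join_orders(customers, pushed, pushdown=True)

print(naive.rows_sent, pushed.rows_sent)  # prints: 4 3
```

Both calls return the same joined rows, but the pushed-down version moves fewer rows out of the remote source. On real data volumes that difference is where the efficiency comes from.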

SQL notebooks in Azure Data Studio is an awesome new feature. 

There is documentation to read, new courses for learning and a Summary of All Exams and Certifications Launched in January 2019.


Friday, 1 March 2019

Data Relay 2019

Data Relay (formerly SQL Relay) announced their 2019 training conferences covering Microsoft Azure, Data, AI and Analytics. They are visiting the cities of Newcastle, Leeds, Nottingham, Birmingham and Bristol. There is a new website https://datarelay.co.uk/ to share all the latest news. The Twitter handle is @DataRelay_uk, and there is also a LinkedIn group and a Facebook page.

Thursday, 28 February 2019

Azure Data Architecture Guide

There is a useful guide to read which discusses a structured approach for designing data-centric solutions on Microsoft Azure. The two different approaches are:

Traditional RDBMS workloads.
These designs are for online transaction processing (OLTP) and online analytical processing (OLAP).

Big data solutions. This design looks at big data architecture to handle the ingestion, processing, and analysis of data that is too large or complex for traditional database systems. 

There are useful pages to read on machine learning at scale and non-relational data.

Wednesday, 27 February 2019

SQLBits 2019 The event

The first day of SQLBits 2019 at Manchester Central Convention Centre. What a great venue.

Monday, 25 February 2019

Monday, 11 February 2019

Data Trends for 2019

I created a survey question on Twitter to look at data trends. I was interested to see whether people felt that improving the quality of their data was more important than AI data ethics. Data quality is heavily influenced by data ingest so I added this as an option, as I felt it is often overlooked, but is a foundation stone of good data quality.

A few definitions:

"Data Ethics describe a code of behaviour, specifically what is right and wrong, encompassing the following: Data Handling: generation, recording, curation, processing, dissemination, sharing, and use."

Data Quality (DQ), as stated in the DAMA International Data Management Body of Knowledge, "refers to both the characteristics associated with and to the processes used to measure or improve the quality of data." "Data is considered high quality to the degree it is fit for the purposes data consumers want to apply it."

"Data ingestion is the process of obtaining and importing data for immediate use or storage in a database. To ingest something is to 'take something in or absorb something.' Data can be streamed in real time or ingested in batches."

"Data ingestion tools provide a framework that allows companies to collect, import, load, transfer, integrate, and process data from a wide range of data sources."
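To make the link between ingest and quality concrete, here is a minimal Python sketch of the two ingestion styles mentioned in the definitions, with a simple quality gate applied at the point of ingest. The field names, the `REQUIRED_FIELDS` schema, the function names and the sample records are all assumptions for illustration. Real ingestion tools offer far richer validation.

```python
from typing import Dict, Iterable, Iterator, List, Optional

Record = Dict[str, Optional[str]]
REQUIRED_FIELDS = ("id", "reading")  # hypothetical schema for illustration


def is_valid(record: Record) -> bool:
    """A record passes when every required field is present and non-empty."""
    return all(record.get(f) not in (None, "") for f in REQUIRED_FIELDS)


def ingest_stream(source: Iterable[Record]) -> Iterator[Record]:
    """Streaming ingest: pass on each valid record as soon as it arrives."""
    for record in source:
        if is_valid(record):
            yield record


def ingest_batches(source: Iterable[Record], batch_size: int) -> Iterator[List[Record]]:
    """Batch ingest: collect valid records and hand them over in groups."""
    batch: List[Record] = []
    for record in source:
        if is_valid(record):
            batch.append(record)
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:  # hand over the final, partly filled batch
        yield batch


def completeness(records: List[Record]) -> float:
    """Share of records that pass the gate, a simple data quality measure."""
    if not records:
        return 1.0
    return sum(is_valid(r) for r in records) / len(records)


# Assumed sample data for illustration only.
incoming: List[Record] = [
    {"id": "1", "reading": "20.5"},
    {"id": "2", "reading": ""},
    {"id": None, "reading": "19.0"},
    {"id": "4", "reading": "21.1"},
    {"id": "5", "reading": "22.3"},
]

print(completeness(incoming))                             # prints: 0.6
print([r["id"] for r in ingest_stream(incoming)])         # prints: ['1', '4', '5']
print([len(b) for b in ingest_batches(incoming, 2)])      # prints: [2, 1]
```

Checking records as they are ingested means poor quality data is caught before it reaches the database, which is why I see ingest as a foundation stone of data quality.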

The survey question had 267 votes.

What do you think will be the most important #Data trend for 2019 out of the following options?

In addition to the poll results I received a few additional comments.
  • Neither
  • The biggest thing in my opinion is just ethics. How is the data collected?
  • Also, what is it being used for. What are the impacts of high or low accuracy models.
  • Improving quality and ethics seem to me, to be related tasks
  • All of the above?

The results are quite interesting with AI Data Ethics and Improving Data Quality being the trends that the respondents thought were the most important.

Wednesday, 6 February 2019

Improved Microsoft Docs

A cool image from http://www.thinksinc.org/ about Microsoft Docs.

I was looking at the Microsoft Docs pages and their new design. I have found it is much easier to navigate, which speeds up searching.

At the top of the page there are 3 helpful options:

  • Download SQL Server
  • Get an Azure VM with SQL Server
  • Download SQL Server Management Studio

Then the Microsoft SQL Documentation has 3 categories covering on premises and cloud.
  • SQL Server on Windows
  • SQL as an Azure Service
  • SQL Server on Linux
There are technology areas to drill down further.

Then a further collection of links to enable a deeper dive into the technology.

  • Design
  • Tools
  • Reference
  • Reporting
  • Data Analytics
  • AI and Machine Learning
I was looking for design documentation and the link takes you to a page with easy-to-select images and text.

Thursday, 24 January 2019

SQLBits 2019 is fast approaching

SQLBits 2019 is fast approaching. This year it is in Manchester from 27 February to 2 March 2019. There is an informative article about The Great Data Heist. My insights on what to expect of the conference are here.

There are some interesting training sessions on the Wednesday and Thursday to attend. These are

Wednesday 27 February 2019
with Alexander Klein and Gabi Münster
with Itzik Ben-Gan
with Kalen Delaney
with Jason Horner
with Alberto Ferrari
with Mark Whitehorn and Kate Kilgour
with Kevin Kline, Richard Douglas, Andy Yun and Andy Mallon
Thursday 28 February 2019
with David Klee and Bob Pusateri
with Erik Darling
with Marco Russo
with Terry McCann and Simon Whiteley
with Theo van Kraay

I hope to see you there.