Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP.

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Thursday 19 October 2017

Machines that learn to see and move: The future of artificial intelligence

I attended the Institute for Mathematical Innovation (IMI) public lecture by Professor Andrew Blake, Research Director at The Alan Turing Institute, on 18 October. Professor Blake is a pioneer in the development of algorithms that make it possible for computers to behave as seeing machines. Before joining the Institute in 2015, Professor Blake held the position of Microsoft Distinguished Scientist and Laboratory Director at the Microsoft Research Lab in Cambridge, and he has been on the faculty at Oxford University. He is also part of a new startup, FiveAI.

The session abstract:

Neural networks have taken the world of computing in general and artificial intelligence (AI) in particular by storm.

But in the future, AI will need to revisit these generative models which are used to make predictions. There are several reasons for this – system robustness, precision issues, transparency, and the high cost of labelling data.

This is particularly true for perceptual AI, needed for autonomous vehicles, where the need for simulators and the need to confront novel situations, will demand further development of generative, probabilistic models. 

He contrasted the empirical detector with the generative model. At the moment we are in the era of deep learning and neural networks, which sit within the empirical detector area: a black box area of big data and optimal predictive power. The generative model is analysis by synthesis and comes with an ‘explanation’, like a model. It starts with a hypothesis, typically probabilistic. Professor Blake believes the generative model will come back, because perceptual models need it:

  • to simulate labelled data
  • for data fusion - to increase reliability
  • to make detailed interpretations 
  • for online simulation - to explain hard to read situations

This was a very insightful lecture, and it was very interesting to see analysis by synthesis mentioned.
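
As a rough illustration of the generative idea (my own sketch, not something shown in the lecture), the Python below defines a tiny probabilistic model, uses it to simulate labelled data, and then interprets an observation by asking how well each hypothesis would have generated it. The class names, the single "height" feature and all the numbers are made-up assumptions.

```python
import math
import random

# Hypothetical generative model: each class is a 1-D Gaussian over one
# measured feature (say, object height in metres). All values are invented.
MODEL = {
    "pedestrian": {"prior": 0.3, "mean": 1.7, "std": 0.2},
    "vehicle": {"prior": 0.7, "mean": 4.5, "std": 0.8},
}


def simulate_labelled_data(n, rng):
    """Sample (feature, label) pairs from the model, so labels come for free."""
    labels = list(MODEL)
    weights = [MODEL[c]["prior"] for c in labels]
    data = []
    for _ in range(n):
        label = rng.choices(labels, weights=weights)[0]
        params = MODEL[label]
        data.append((rng.gauss(params["mean"], params["std"]), label))
    return data


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x):
    """Bayes' rule: weigh each hypothesis by how well it would generate x."""
    joint = {
        c: p["prior"] * gaussian_pdf(x, p["mean"], p["std"])
        for c, p in MODEL.items()
    }
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    for x, label in simulate_labelled_data(3, rng):
        print(f"{x:.2f} m -> {label}")
    print(posterior(1.8))
```

The posterior is the 'explanation': it shows how strongly each hypothesis accounts for the observation, rather than returning a single opaque label.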

Monday 16 October 2017

Agilience Authority Index

I came across the Agilience Authority Index, which places me in the top 250 for SQL Server.

The Agilience Authority Index shows how influential you are, based on your Twitter profile. Agilience state that your influence is more than your audience: your influence is your recognized expertise on a topic. Your profile on agilience.com shows your main topics of influence based on the Agilience Authority Index.

PhD Thesis

My PhD thesis is now available online.

Holt, Victoria (2017). A Study into Best Practices and Procedures used in the Management of Database Systems. PhD thesis, The Open University.

Saturday 14 October 2017

SQL Relay 2017

SQL Relay took place from 9 to 13 October 2017. It was the end of my first year on the organising committee, which has been great fun.

The relay began in Reading, moving on to Nottingham, Leeds and Birmingham, and ending in Bristol. This year I helped out on site at 2 events, Reading and Bristol. The event brought 4 tracks, 3 general tracks and a workshop track, to each venue. With only 1 hour in the morning to set up before the event starts, it is all hands on deck to get all the attendees registered and the event started on time. We were lucky to have so many amazing volunteers who helped during the days, and without sponsors and speakers we wouldn’t have been able to run the events. I became the event lead for Bristol and it was nice to be able to bring the event back to the city this year. We enabled around 1000 people to be trained, helped the SQL community to grow and gave people the chance to learn something new. It is a privilege to be a part of this unique event.

Friday 6 October 2017

Machina Summit.AI

I attended IPExpo Europe on 4-5 October at ExCeL in London, specifically for the Machina Summit.AI.

The opening keynote was by Professor Brian Cox OBE on ‘Where IT & Physics Collide’. The talk interlinked big data, quantum mechanics and quantum computing. The whistle-stop tour covered the Sloan Digital Sky Survey, which has produced the most detailed three-dimensional maps of the universe; general relativity; the history of space and time; the theory of cosmology; and quantum mechanics, ending with quantum theory and predicting the distribution of galaxies. This was an amazing talk and gave a glimpse of the interconnected future.

This was followed by Brad Anderson, Corporate Vice President at Microsoft, on ‘Business as usual in a digital war zone’. We live in turbulent times: there has been a 300% increase in user account attacks this year, and 96% of malware is automated and polymorphic, which costs business $15 million. Attacks happen in increasing waves and old defences never stand up against these attacks. In this intelligent war you need an intelligent graph. He introduced the Microsoft Azure Active Directory service as the new control plane. There is a need to eliminate false positives, classify email, guarantee data never leaves the browser and be able to use a real-time evaluation engine.

A few other talks covered the practice of monitoring with machine data. There are 2 types of monitoring: traditional IT and the new data-driven IT. For the latter there is a need to rethink and improve how IT operates, using machine learning to be proactive. Organizational silos need to be broken down, and quality increased, to be able to address the velocity of data in a more agile way and produce actionable insights.

Conrad Wolfram, Strategic Director, Wolfram Research, talked about ‘Enterprise computation: the next frontier in AI and data science’. Today's data challenge is about accessibility of data, personalisation of data and providing insightful answers. Data science is multi-paradigm and machine learning does not have all the answers. Computation with smart automation is required for everyone, and so is computational thinking. Data science needs to be personalised and multifaceted, but unified.

The day 2 keynote was given by Stuart Russell, Professor of Electrical Engineering and Computer Science at the University of California, Berkeley, on ‘Human-Compatible AI’. He discussed what is coming soon: basic language understanding with web-scale question answering; intelligent assistants for health, education, finances and life (not chatbots!); robots for unstructured tasks (home, construction, agriculture); and new tools for economics, management and scientific research. He discussed the premise that eventually AI systems will make better decisions than humans, where ‘better’ means taking into account more information and looking further into the future. He argued that, in the case of super intelligent AI, you can’t switch off the machine and AI will never succeed.

Other sessions discussed the journey of chaos and how everything fails all the time. To address this, there is a need to remember that every journey begins with a single step. There is also the inevitable question of skills versus knowledge, and the answer is practice.

Microsoft talked about ‘AI and Analytics in the Enterprise’. There is now a need to look at more than the rear view mirror of what has already happened. There is a convergence of cloud, data and AI. With that, Microsoft have created an AI platform that is fast and agile, with AI built in, and enterprise proven from on-premises to edge to create insights. The evolution of the data estate takes into account increasing data volumes, new data sources and types, and open source languages. There are 3 stages between the heterogeneous sources and providing apps and insights.
  • Ingest – data orchestration and monitoring
  • Store – Data Lake and storage
  • Machine learning – prepare and train (Hadoop / Spark / SQL and ML), then model and serve (on-prem, cloud, IoT)

In summary, the two-day conference provided great insight into many new technical areas and raised thought-provoking questions about the future of data and AI.

Sunday 1 October 2017

Microsoft for the Modern Data Estate

The Microsoft Ignite session on the modern data estate was full of announcements. All around us, data is driving digital transformation. Modernize with SQL Server 2017 on Linux and Windows and Azure Data Services; deliver modern intelligent applications using technologies like Azure Cosmos DB and Azure Database for PostgreSQL.

The world is changing. We need to help invest in the future without being tied to the past. AI is a fundamental pillar to leverage that. Businesses that invest in data outperform other companies.

Data doesn’t need to leave the database for data science to take place.

The cloud-first approach breeds faster innovation and SQL Server 2017 is proof of that.

SQL Server 2017 on Linux, Docker and Windows Server
  • Support for graph data and queries
  • Advanced Machine Learning with R & Python
  • Native T-SQL scoring
  • Adaptive Query Processing and Automatic Plan Correction
  • Vulnerability assessment for GDPR – preview
  • Intelligent insights into performance – preview
  • Support for graph data and queries – GA
  • Adaptive query processing – GA
  • Native scoring and support for Azure Machine Learning – GA

SQL Database and Database Migration Service
Migration to the cloud is easy with this new service. 
Azure SQL Database is the intelligent cloud database for app developers. It learns and adapts, scales on the fly, enables multi-tenant SaaS apps, works in your environment, and secures and protects. The systems of intelligence on SQL Database are shared.

Globally Distributed Applications

Microsoft announced Azure Functions for Azure Cosmos DB, to build apps faster with a serverless infrastructure.

Uncovering insights with big data and advanced analytics

The new Azure Data Factory allows easy modelling of diverse data integration scenarios. With the preview service, you can now easily move your SQL Server Integration Services (SSIS) workloads to the cloud. There is also data movement as a service with 30+ connectors. Azure SQL Data Warehouse has a compute-optimized tier and unlimited columnar storage in preview. The last announcement was the Power BI Report Server.