Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous; the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing." Einstein



Monday, 11 February 2019

Data Trends for 2019


I created a survey question on Twitter to look at data trends. I was interested to see whether people felt that improving the quality of their data was more important than AI data ethics. Data quality is heavily influenced by data ingest, so I added this as an option, as I felt it is often overlooked but is a foundation stone of good data quality.

A few definitions:

Data Ethics describes a code of behaviour, specifically what is right and wrong, encompassing the following: data handling (generation, recording, curation, processing, dissemination, sharing, and use).

Data Quality (DQ), as stated in the DAMA International Data Management Body of Knowledge: "Refers to both the characteristics associated with and to the processes used to measure or improve the quality of data." Data is considered high quality to the degree it is fit for the purposes to which data consumers want to apply it.

Data ingestion is the process of obtaining and importing data for immediate use or storage in a database. To ingest something is to "take something in or absorb something." Data can be streamed in real time or ingested in batches.

Data ingestion tools provide a framework that allows companies to collect, import, load, transfer, integrate, and process data from a wide range of data sources.
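To make the link between ingest and data quality concrete, here is a minimal sketch of batch ingestion with a quality check applied as each record arrives. It is only an illustration: the record fields (`id`, `amount`), the validation rule and the function names are all hypothetical and not taken from any particular ingestion tool.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def batches(records: Iterable, size: int) -> Iterator[List]:
    """Group an incoming sequence of records into fixed-size batches."""
    batch: List = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def is_valid(record: Dict) -> bool:
    """Hypothetical quality rule: id is present and amount is a non-negative number."""
    amount = record.get("amount")
    return (
        record.get("id") is not None
        and isinstance(amount, (int, float))
        and not isinstance(amount, bool)
        and amount >= 0
    )


def ingest(records: Iterable[Dict], size: int = 2) -> Tuple[List[Dict], List[Dict]]:
    """Ingest records batch by batch, separating accepted rows from rejected ones."""
    accepted: List[Dict] = []
    rejected: List[Dict] = []
    for batch in batches(records, size):
        for record in batch:
            (accepted if is_valid(record) else rejected).append(record)
    return accepted, rejected


# Made-up sample data for illustration only.
SAMPLE = [
    {"id": 1, "amount": 10.5},
    {"id": None, "amount": 3},   # missing id
    {"id": 3, "amount": -1},     # negative amount
    {"id": 4, "amount": 0},
    {"id": 5, "amount": "7"},    # amount arrived as text
]

accepted, rejected = ingest(SAMPLE, size=2)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the rejected rows rather than silently dropping them means poor-quality data can be reviewed and fixed at the source, which is where the quality problem usually starts.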

The survey question had 267 votes.

What do you think will be the most important #Data trend for 2019 out of the following options?













In addition to the results above, I received a few additional comments.
  • Neither
  • The biggest thing in my opinion is just ethics. How is the data collected?
  • Also, what is it being used for. What are the impacts of high or low accuracy models.
  • Improving quality and ethics seem to me, to be related tasks
  • All of the above?


The results are quite interesting, with AI Data Ethics and Improving Data Quality being the trends that respondents thought were the most important.

Wednesday, 6 February 2019

Improved Microsoft Docs


A cool image from http://www.thinksinc.org/ about Microsoft Docs.

I was looking at the Microsoft Docs pages and their new design. I have found them much easier to navigate, which speeds up searching.

At the top of the page there are 3 helpful options 

  • Download SQL Server
  • Get an Azure VM with SQL Server
  • Download SQL Server Management Studio


Then the Microsoft SQL Documentation has 3 categories covering on-premises and cloud.
  • SQL Server on Windows
  • SQL as an Azure Service
  • SQL Server on Linux
There are technology areas to drill down further.

Then a further collection of links to enable a deeper dive into the technology.

  • Design
  • Tools
  • Reference
  • Reporting
  • Data Analytics
  • AI and Machine Learning
I was looking for design documentation, and the link takes you to a page with easy-to-select images and text.























Thursday, 24 January 2019

SQLBits 2019 is fast approaching

SQLBits 2019 is fast approaching. This year it is in Manchester from 27 February to 2 March 2019. There is an informative article about The Great Data Heist. My insights on what to expect of the conference are here.

There are some interesting training sessions to attend on the Wednesday and Thursday. These are:

Wednesday 27 February 2019
with Alexander Klein and Gabi Münster
with Itzik Ben-Gan
with Kalen Delaney
with Jason Horner
with Alberto Ferrari
with Mark Whitehorn and Kate Kilgour
with Kevin Kline, Richard Douglas, Andy Yun and Andy Mallon
Thursday 28 February 2019
with David Klee and Bob Pusateri
with Erik Darling
with Marco Russo
with Terry McCann and Simon Whiteley
with Theo van Kraay

I hope to see you there.

Friday, 18 January 2019

Data Science Activities linked to Business

Managing Big Data, AI and Data Science all need new processes and methods to be efficient. I am always on the lookout for new tools that help refine my thinking and usage. William Schmarzo shared the Hypothesis Development Canvas as a tool to connect data science to the organization. It is to be used to develop business hypotheses.


He also shared the Thinking Like a Data Scientist process.























Monday, 14 January 2019

The AI Journey

The AI Journey is an interesting blog post that discusses a pragmatic approach to AI and its use, the patterns for AI and the journey.

The patterns seen are for virtual agents, ambient intelligence, AI-assisted professionals, knowledge mining and autonomous systems. More details are discussed here.

The question of where to start is being asked in many circles, and BI is still the foundation. Without good quality data there is no AI. I think the largest hurdle to overcome is data ingest quality.


Sunday, 13 January 2019