Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Thursday 28 February 2019

Azure Data Architecture Guide

There is a useful guide to read which discusses a structured approach for designing data-centric solutions on Microsoft Azure. The two approaches are:

Traditional RDBMS workloads.
These designs are for online transaction processing (OLTP) and online analytical processing (OLAP).

Big data solutions. This design looks at big data architecture to handle the ingestion, processing, and analysis of data that is too large or complex for traditional database systems. 
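To make the OLTP and OLAP distinction in the first approach a little more concrete, here is a minimal sketch. It is not taken from the guide: it uses Python's built-in sqlite3 module as a stand-in for any relational engine, and the orders table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3


def run_demo():
    # In-memory SQLite database used purely as a stand-in for an RDBMS.
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, not from the guide.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )

    # OLTP-style work: small writes grouped into a short transaction.
    # The "with conn" block commits on success and rolls back on error.
    with conn:
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 20.0)
        )
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)", ("bob", 35.5)
        )
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 10.0)
        )

    # OLAP-style work: a read-only query that aggregates over many rows.
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

In practice the two workloads usually run on separate systems tuned for each pattern; the single database here just keeps the sketch short.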

There are also useful pages to read on machine learning at scale and non-relational data.

Wednesday 27 February 2019

SQLBits 2019 The event

The first day of SQLBits 2019 at Manchester Central Convention Centre. What a great venue.

Monday 25 February 2019

Monday 11 February 2019

Data Trends for 2019

I created a survey question on Twitter to look at data trends. I was interested to see whether people felt that improving the quality of their data was more important than AI data ethics. Data quality is heavily influenced by data ingest, so I added this as an option, as I felt it is often overlooked but is a foundation stone of good data quality.

A few definitions:

Data Ethics "describe a code of behaviour, specifically what is right and wrong, encompassing the following: Data Handling: generation, recording, curation, processing, dissemination, sharing, and use."

Data Quality (DQ), as stated in the DAMA International Data Management Body of Knowledge, "refers to both the characteristics associated with high quality data and to the processes used to measure or improve the quality of data." Data is considered high quality to the degree it is fit for the purposes data consumers want to apply it to.

"Data ingestion is the process of obtaining and importing data for immediate use or storage in a database. To ingest something is to 'take something in or absorb something.' Data can be streamed in real time or ingested in batches."

"Data ingestion tools provide a framework that allows companies to collect, import, load, transfer, integrate, and process data from a wide range of data sources."
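As a rough illustration of the batch and streaming styles described in the definitions above, and of why ingest is a natural place to protect data quality, here is a minimal sketch. The function names, the in-memory list used as a store, and the rule that a record needs an "id" are all assumptions made for the example, not part of any particular ingestion tool.

```python
from typing import Iterable, Iterator, List


def is_valid(record: dict) -> bool:
    # Hypothetical quality rule: a record must carry a non-empty "id".
    return record.get("id") is not None


def ingest_batch(records: Iterable[dict], store: List[dict]) -> int:
    # Batch ingestion: collect the valid records, then load them in one step.
    batch = [r for r in records if is_valid(r)]
    store.extend(batch)
    return len(batch)


def ingest_stream(records: Iterable[dict], store: List[dict]) -> Iterator[dict]:
    # Streaming ingestion: handle each record as it arrives, so it is
    # available for immediate use as soon as it has been stored.
    for record in records:
        if is_valid(record):
            store.append(record)
            yield record


if __name__ == "__main__":
    store: List[dict] = []
    loaded = ingest_batch([{"id": 1}, {"name": "no id"}, {"id": 2}], store)
    print("batch loaded:", loaded)
    for rec in ingest_stream(iter([{"id": 3}, {"id": None}]), store):
        print("streamed:", rec)
    print("store size:", len(store))
```

Rejecting bad records at the point of ingest keeps them out of everything downstream, which is the sense in which ingest underpins data quality.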

The survey question had 267 votes.

What do you think will be the most important #Data trend for 2019 out of the following options?

In addition to the results above, I also received a few comments:
  • Neither
  • The biggest thing in my opinion is just ethics. How is the data collected?
  • Also, what is it being used for. What are the impacts of high or low accuracy models.
  • Improving quality and ethics seem to me, to be related tasks
  • All of the above?

The results are quite interesting, with AI Data Ethics and Improving Data Quality being the trends that respondents thought were the most important.

Wednesday 6 February 2019

Improved Microsoft Docs

A cool image from http://www.thinksinc.org/ about Microsoft Docs.

I was looking at the Microsoft Docs pages and their new design. I have found them much easier to navigate, which speeds up searching.

At the top of the page there are 3 helpful options:

  • Download SQL Server
  • Get an Azure VM with SQL Server
  • Download SQL Server Management Studio

Then the Microsoft SQL Documentation has 3 categories covering on-premises and cloud:
  • SQL Server on Windows
  • SQL as an Azure Service
  • SQL Server on Linux
There are technology areas to drill down further.

Then a further collection of links to enable a deeper dive into the technology.

  • Design
  • Tools
  • Reference
  • Reporting
  • Data Analytics
  • AI and Machine Learning
I was looking for design documentation, and the link takes you to a page with easy-to-select images and text.