Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Friday, 14 September 2018

Azure Cosmos DB multi-model database

Azure Cosmos DB has to be one of my favorite databases due to the breadth of available database types, its choice of consistency models and elastic scale out.

An introduction can be read here.

A definition for each of these types of databases is given.

A key-value pair (KVP) is a set of two linked data items: a key, which is a unique identifier for some item of data, and the value, which is either the data that is identified or a pointer to the location of that data. Key-value pairs are frequently used in lookup tables, hash tables and configuration files.

A column-oriented DBMS (or columnar database management system) is a database management system (DBMS) that stores data tables by column rather than by row.

Document stores, also called document-oriented database systems, are characterized by their schema-free organization of data.That means records do not need to have a uniform structure, i.e. different records may have different columns. The types of the values ​​of individual columns can be different for each record. Columns can have more than one value (arrays). Records can have a nested structure. E.g. MongoDB

A graph database, also called a graph-oriented database, is a type of NoSQL database that uses graph theory to store, map and query relationships. Every node in a graph database is defined by a unique identifier, a set of outgoing edges and/or incoming edges and a set of properties expressed as key/value pairs.

The five consistency levels offer predictable low latency guarantees and multiple well-defined relaxed consistency models.

Consistency Levels and guarantees

Consistency Level
Linearizability. Reads are guaranteed to return the most recent version of an item.
Bounded Staleness
Consistent Prefix. Reads lag behind writes by at most k prefixes or t interval
Consistent Prefix. Monotonic reads, monotonic writes, read-your-writes, write-follows-reads
Consistent Prefix
Updates returned are some prefix of all the updates, with no gaps
Out of order reads

There is a useful capacity planer that looks at request units throughput per second, request unit consumption and the amount of data storage needed by your application.

Thursday, 13 September 2018

Hortonworks Data Analytics Studio and Open Hybrid Architecture

Hortonworks has announced the general availability of Hortonworks Data Analytics Studio (DAS). A new service to enable enhanced productivity of business analysts by delivering faster insights from data at scale. DAS is part of the Hortonworks DataPlane Service (DPS). DPS enables businesses to discover, manage, govern and now optimize their data spread across hybrid environments. DAS leverages open-source technologies such as Apache Hive to share and extend the value of a modern data architecture in heterogeneous environments. It includes a useful database heat map.

Hortonworks have also shared the Open Hybrid Architecture Initiative, designed to enable big data workloads to run in a hybrid manner across on-premises, multi-cloud and edge architectures.

The Open Hybrid Architecture initiative will

  • De-coupling storage, with both file system interfaces and an object-store interface to data.
  • Containerizing compute resources for elasticity and software isolation.
  • Sharing services for metadata, governance and security across all tiers.
  • Providing DevOps/orchestration tools for managing services/workloads via the “infrastructure is code” paradigm to allow spin-up/down in a programmatic manner.
  • Designating workloads specific to use cases such as EDW, data science, rather than sharing everything in a multi-tenant Hadoop cluster.

AI the art of the possible

During SQL Saturday Cambridge I attended a session by Terry McCann on using AI to write a session submission to SQL Saturday. It was a great session and I would recommend you attending it if you get a chance.

In the new data world it is important to understand the difference between AI, Machine Learning and Deep Learning.

Then breaking this down further the differences between how machine learning works and deep learning is shown here. Deep learning is really a black box.

Image : https://www.upwork.com/hiring/for-clients/log-analytics-deep-learning-machine-learning/

Terry mentioned a free book to read to learn more Neural Networks and Deep Learning. 

He mentioned also the book Harry Potter and the Portrait of what Looked Like a Large Pile of Ash which was written by an AI bot.  I hadn't come across this before but it lets you see the art of the possible in the future.

Image: https://imgur.com/gallery/gkLFz

Tuesday, 11 September 2018

Azure DevOps Services

Microsoft have announced new Azure DevOps services than span the breadth of the development lifecycle to help developers ship software faster and with higher quality. Azure DevOps includes:

CI/CD that works with any language, platform, and cloud. Connect to GitHub or any Git repository and deploy continuously. 
Powerful work tracking with Kanban boards, backlogs, team dashboards, and custom reporting. 
Maven, npm, and NuGet package feeds from public and private sources. 
Unlimited cloud-hosted private Git repos for your project. Collaborative pull requests, advanced file management, and more. 
All in one planned and exploratory testing solution. 

PowerBI Report Server

Power BI Report Server is similar to SQL Server Reporting Services (SSRS). Power BI Report Server is on premises, and hosts paginated reports in addition to the Power BI features. There are several types of reports
  • Paginated (standard SSRS type reports)
  • Interactive (PowerBI Desktop)
  • Mobile
  • Analytical (Excel)

PowerBI Pro enables sharing and collaboration with the following features 
  • Build dashboards that deliver a 360-degree, real-time view of the business
  • Keep data up-to-date automatically, including on-premises sources
  • Collaborate on shared data
  • Audit and govern how data is accessed and used
  • Package content and distribute to users with app

There is increased functionality within PowerBI Pro. A comparison between the free PowerBI desktop edition and Pro version is in the table below. Only Pro users can publish content to app workspaces, consume apps without Premium capacity, share dashboards and subscribe to dashboards and reports. Free users can connect to all data sources using connectivity options such as DirectQuery, live connection and use the data gateway. 

Licensing Power BI Report Server

Power BI Report Server is available in two different license modes: 
A Power BI Premium license enables creation of a hybrid deployment (cloud and on-premises).


The session delivered by Isabelle Van Campenhoudt at SQL Saturday Cambridge shared this useful capacity planning  paper. It is based on the sample Power BI Report Server topology using virtual machines. 

Monday, 10 September 2018

Changing our concept of the world

In Ackoff’s Best His Classic Writings on Management (1999), he shares a way of thinking that as fascinated me for years. It can be used for problem solving and has helped me manage database systems for years. It is about looking at things in a holistic way.  This way of thinking, synthesis, often yields contradictory or conflicting results to traditional analytical thinking. Systems thinking, as it is known, helps expand your viewpoint considering the holistic nature of the environment. Ackoff shared the difference between the two modes of thinking: 

Machine Age Thinking (Analysis)
Synthesis (Systems Thinking)
Analysis precedes synthesis
Synthesis precedes analysis
Focuses on structure
Focuses on function
Reveals how things work
Reveals why things operate as they do
Yields knowledge
Yields understanding
Enables us to describe
Enables us to explain
Looks into things
Looks out of things
Iteration of the parts of the thing to be explained
Iteration of the parts of the thing to be explained, interactions between things in its environment and with its environment itself concerned with functional interaction of the parts of system
Break down large complex problems into solvable or manageable parts outputs assembled into a solution of the whole
Some of the best solution obtained from the parts taken separately is not the best solution of the whole.
The best performance of the whole can be reduced to the sum of the best performance of its parts taken separately
Asserts this is not possible
Reduces the focus of the investigator
Expands it focus of the investigator

As a result, I never look only at the immediate problem but consider all of the interconnecting parts.

Sunday, 9 September 2018

SQLSaturday Cambridge 2018 - Britannia Edition!

I attended PASS SQL Saturday Cambridge 8 September 2018. This was held at The Møller Centre, Churchill College, Storey's Way, Cambridge. Another great training day packed full of lots of new and exciting data technology.

I attended some amazing sessions. These were

  • Data Classification in SQL Server and Azure SQL Database - Mark Pryce-Maher
  • Azure Cosmos DB - What you need to know to build globally distrib - Satya SK Jayanty
  • Using AI to write session submission to SQLSaturday - Terry McCann
  • Power BI server and Office Online server, modernize your on-premises BI approach. - Isabelle Van Campenhoudt
  • Bricking it: An Introduction to Azure Databricks - Simon Whiteley 
  • SQLOpsStudio Vs SSMS - There can be only one - Warwick Rudd
  • Branding Yourself for a Dream Job - Steve Jones

There was some very interesting learning, that I took away, which I will write about in a follow up blog.

Saturday, 8 September 2018

SQL Relay: the travelling conference

Registration is open for SQL Relay. This is a conference with a difference. It tours round the UK for 5 consecutive days bringing international and MVP speakers to your local area. It is an amazing conference to attend with lots of sessions to choose from. This years events are at

Newcastle - Monday 8 October

Leeds - Tuesday 9 October

Birmingham - Wednesday 10 October

Reading - Thursday 11 October

Bristol - Friday 12 October