Chaos, complexity, curiosity and database systems. A place where research meets industry
Welcome
Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous; the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP
"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein
Sunday, 30 September 2018
MVP Wall
At Microsoft Ignite 2018 Microsoft devoted an entire wall to list all the names of the MVPs. I felt very humbled to have my name on the MVP wall with so many amazing people. It is such a privilege to be a part of the Microsoft Data Community. #datafamily #MVPbuzz
And there is my name.
Tuesday, 25 September 2018
SQLBits 2019
SQLBits 2019 has been announced. It is in the heart of Manchester. The last time it was in Manchester was in 2009. I am already excited about this next event.
Monday, 24 September 2018
Azure SQL Database Managed Instance GA
At Microsoft Ignite it was announced that Azure SQL Database Managed Instance will be generally available from October 1, 2018.


Azure SQL Database Managed Instance is a deployment model of Azure SQL Database. This service enables customers to migrate existing databases to a fully managed PaaS cloud environment. It is possible to use the Azure Database Migration Service (DMS) to lift and shift an on-premises SQL Server. This can be a useful option for secure databases, and it reduces the management overhead through automatic patching and version updates, automated backups and high availability.
Reading
Azure SQL Database Managed Instance, General Purpose tier general availability
Azure Database Migration Service and tool updates – Ignite 2018
SQL Server 2019, Big Data and AI
At Microsoft Ignite SQL Server 2019 was launched. It is an amazing product for the future, combining SQL Server 2019 with big data and analytics. It is great to see multiple tools combined in one place: a one-stop shop for large and small data, structured and unstructured, from multiple sources.
There are 3 major components to SQL Server 2019:
- The creation of a data virtualization layer that handles the complexity of all data sources and formats, enabling the integration of structured and unstructured data without moving the data.
- The streamlining of data management with SQL Server 2019 big data clusters deployed in Kubernetes, integrating HDFS and Spark. The architecture is explained in more depth here.
- The creation of a complete AI platform that can use Spark to analyse both structured and unstructured data anywhere, and use SQL Server Machine Learning Services and SparkML.
In summary SQL Server big data clusters allow you to deploy scalable clusters of SQL Server, Spark, and HDFS Docker containers running on Kubernetes.
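The idea behind a data virtualization layer, querying data where it lives rather than copying it first, can be sketched in a few lines of Python. This is a conceptual toy and not the mechanism SQL Server 2019 uses; the class name `VirtualTable` and both example sources are hypothetical.

```python
import csv
import io
import json


class VirtualTable:
    """Presents several sources as one table, reading each lazily at query time."""

    def __init__(self, *sources):
        # Each source is a zero-argument callable that yields dict rows.
        self.sources = sources

    def select(self, predicate=lambda row: True):
        # Rows are pulled from each source only when the query runs.
        for source in self.sources:
            for row in source():
                if predicate(row):
                    yield row


def csv_source():
    # Structured data: stands in for a relational table.
    text = "id,name\n1,Ada\n2,Grace\n"
    for row in csv.DictReader(io.StringIO(text)):
        yield {"id": int(row["id"]), "name": row["name"]}


def json_source():
    # Semi-structured data: stands in for files held elsewhere.
    for line in ['{"id": 3, "name": "Edsger", "tags": ["graphs"]}']:
        yield json.loads(line)


people = VirtualTable(csv_source, json_source)
names = [row["name"] for row in people.select(lambda row: row["id"] >= 2)]
print(names)
```

The query sees one logical table, while each source keeps its own format and location.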
Read More
Sunday, 23 September 2018
Microsoft Ignite - watch live
Microsoft Ignite is happening this week. Unfortunately I won't be there, but the keynote and some deep dive sessions will be streamed live. I am looking forward to seeing what Microsoft CEO Satya Nadella shares as his vision for the future of tech. I will be interested to see what other tools and technologies will play an important part in the next year, and I am excited to see how data fits into this forthcoming vision.
The live stream details are:
Date: Monday 24 September 2018
Time: 09:00 - 17:15 Eastern Time (US and Canada), starting at 14:00 UK time
Friday, 14 September 2018
Azure Cosmos DB multi-model database
Azure Cosmos DB has to be one of my favorite databases due to the breadth of available database types, its choice of consistency models and elastic scale out.
An introduction can be read here.
A definition for each of these types of databases is given.
Key-value
A key-value pair (KVP) is a set of two linked data items: a key, which is a unique identifier for some item of data, and the value, which is either the data that is identified or a pointer to the location of that data. Key-value pairs are frequently used in lookup tables, hash tables and configuration files.
https://searchenterprisedesktop.techtarget.com/definition/key-value-pair
Column
A column-oriented DBMS (or columnar database management system) is a database management system (DBMS) that stores data tables by column rather than by row.
https://en.wikipedia.org/wiki/Column-oriented_DBMS
Document
Document stores, also called document-oriented database systems, are characterized by their schema-free organization of data. That means records do not need to have a uniform structure, i.e. different records may have different columns. The types of the values of individual columns can be different for each record. Columns can have more than one value (arrays). Records can have a nested structure. E.g. MongoDB
https://db-engines.com/en/article/Document+Stores
Graph
A graph database, also called a graph-oriented database, is a type of NoSQL database that uses graph theory to store, map and query relationships. Every node in a graph database is defined by a unique identifier, a set of outgoing edges and/or incoming edges and a set of properties expressed as key/value pairs.
https://whatis.techtarget.com/definition/graph-database
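To make the four definitions above concrete, here is a minimal Python sketch that holds similar data in each shape. It uses plain Python data structures rather than the Azure Cosmos DB SDK, and every name and value in it is a hypothetical example.

```python
# Illustrative only: plain Python structures, not the Azure Cosmos DB SDK.

# Key-value: a value looked up by a unique key.
key_value = {"person:1": "Ada", "person:2": "Grace"}

# Column-oriented: each column is stored together, rather than each row.
columns = {
    "id": [1, 2],
    "name": ["Ada", "Grace"],
    "city": ["London", "New York"],
}

# Document: schema-free records; fields can differ, nest and hold arrays.
documents = [
    {"id": 1, "name": "Ada", "skills": ["maths", "engines"]},
    {"id": 2, "name": "Grace", "address": {"city": "New York"}},
]

# Graph: nodes with properties, plus edges describing relationships.
nodes = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
edges = [(1, "knows", 2)]


def column_row(table, index):
    """Rebuild one row from column-oriented storage."""
    return {column: values[index] for column, values in table.items()}


def neighbours(node_id, edge_list):
    """Return the ids reachable from node_id over outgoing edges."""
    return [target for source, _label, target in edge_list if source == node_id]
```

For example, `column_row(columns, 1)` reassembles the second row from the column store, and `neighbours(1, edges)` follows the outgoing edges of node 1.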
There is a useful capacity planner that looks at request unit throughput per second, request unit consumption and the amount of data storage needed by your application.
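As a rough illustration of the arithmetic behind such a plan, the sketch below adds up request units per second from per-operation costs. The function name and the example costs are assumptions made for illustration; real request unit charges depend on factors such as item size and indexing, so they should be measured against your own workload.

```python
def required_throughput(reads_per_sec, writes_per_sec, ru_per_read, ru_per_write):
    """Estimate request units per second for a simple read/write workload."""
    return reads_per_sec * ru_per_read + writes_per_sec * ru_per_write


# Hypothetical workload: 500 reads/s at 1 RU each and 100 writes/s at 5 RU each.
estimate = required_throughput(500, 100, ru_per_read=1, ru_per_write=5)
print(estimate)
```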
The five consistency levels offer predictable low latency guarantees and multiple well-defined relaxed consistency models.
Consistency Levels and guarantees

| Consistency Level | Guarantees |
|---|---|
| Strong | Linearizability. Reads are guaranteed to return the most recent version of an item. |
| Bounded Staleness | Consistent Prefix. Reads lag behind writes by at most k prefixes or t interval |
| Session | Consistent Prefix. Monotonic reads, monotonic writes, read-your-writes, write-follows-reads |
| Consistent Prefix | Updates returned are some prefix of all the updates, with no gaps |
| Eventual | Out of order reads |
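The guarantees in the table can be illustrated with a toy model. The sketch below treats the write history as an ordered list and checks what a reader observed against it. It is a simplified illustration of the definitions, not how Azure Cosmos DB implements them; the function names are hypothetical, and staleness is modelled only as a count of writes (k), not as a time interval (t).

```python
def is_consistent_prefix(writes, observed):
    """True if the reader saw the first n writes, in order, with no gaps."""
    return observed == writes[:len(observed)]


def within_bounded_staleness(writes, observed, k):
    """True if the reader saw a consistent prefix lagging by at most k writes."""
    lag = len(writes) - len(observed)
    return is_consistent_prefix(writes, observed) and lag <= k


def is_strong(writes, observed):
    """True if the reader saw every write, so reads return the latest version."""
    return observed == writes


writes = ["v1", "v2", "v3", "v4"]

print(is_consistent_prefix(writes, ["v1", "v2"]))         # a prefix with no gaps
print(is_consistent_prefix(writes, ["v1", "v3"]))         # a gap: v2 is missing
print(within_bounded_staleness(writes, ["v1", "v2"], 1))  # lags by 2 writes
print(is_consistent_prefix(writes, ["v2", "v1"]))         # out of order reads
```

An eventually consistent reader may observe something like the last case, where reads arrive out of order; every stronger level in the table rules that out.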
Thursday, 13 September 2018
Hortonworks Data Analytics Studio and Open Hybrid Architecture
Hortonworks has announced the general availability of Hortonworks Data Analytics Studio (DAS), a new service intended to improve the productivity of business analysts by delivering faster insights from data at scale. DAS is part of the Hortonworks DataPlane Service (DPS). DPS enables businesses to discover, manage, govern and now optimize their data spread across hybrid environments. DAS leverages open-source technologies such as Apache Hive to share and extend the value of a modern data architecture in heterogeneous environments. It includes a useful database heat map.
Hortonworks have also shared the Open Hybrid Architecture Initiative, designed to enable big data workloads to run in a hybrid manner across on-premises, multi-cloud and edge architectures.
The Open Hybrid Architecture initiative will
- Decouple storage, with both file system interfaces and an object-store interface to data.
- Containerize compute resources for elasticity and software isolation.
- Share services for metadata, governance and security across all tiers.
- Provide DevOps/orchestration tools for managing services/workloads via the “infrastructure is code” paradigm to allow spin-up/down in a programmatic manner.
- Designate workloads specific to use cases such as EDW and data science, rather than sharing everything in a multi-tenant Hadoop cluster.
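The “infrastructure is code” idea in the list above, describing the desired services as data and letting a program work out what to spin up or down, can be sketched as follows. This is a minimal illustration with hypothetical service names, not a Hortonworks tool or API.

```python
def plan(desired, running):
    """Compare desired and running instance counts and return the actions needed."""
    actions = []
    for service in sorted(set(desired) | set(running)):
        delta = desired.get(service, 0) - running.get(service, 0)
        if delta > 0:
            actions.append(("spin_up", service, delta))
        elif delta < 0:
            actions.append(("spin_down", service, -delta))
    return actions


# Hypothetical desired state (kept in version control) and current state.
desired = {"hive": 3, "spark": 2}
running = {"hive": 1, "spark": 2, "legacy-etl": 1}

actions = plan(desired, running)
print(actions)
```

Because the desired state is plain data, it can be reviewed and versioned like any other code, and the same plan can be recomputed at any time.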