Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Monday 25 June 2018

Research Method and Robustness

During my research I had to review and assess many types of research methods, designs and analyses. The choices made were aligned to the type of research questions under investigation. A key consideration was ensuring that the research was robust. The areas I reviewed are below.

My research followed a mixed method approach with a sequential explanatory design. The quantitative research used a survey as the data collection method, and the qualitative research used Focus Groups. As part of the qualitative analysis I used Thematic Analysis. I found the two methods complementary. To read more about my method see my thesis, A Study into Best Practices and Procedures used in the Management of Database Systems.

Wednesday 20 June 2018

Microsoft Inspire in 2018

Microsoft Inspire, the Microsoft partner conference, takes place next month. Microsoft personnel and industry experts from around the globe will come together for a week of networking and learning. It is a good conference to hear partners share their success stories and to see real-life use cases for technology. Last year the keynotes were broadcast live, so I am hoping this will happen again this year.

Microsoft Ready and Microsoft Inspire will be co-located for the first time.

Tuesday 12 June 2018

Apache Calcite: A Foundational Framework

At ACM SIGMOD/PODS 2018 this week, Hortonworks are talking about Calcite, a foundational framework for optimized query processing over heterogeneous data sources. It seems an interesting dynamic data management framework that omits some key functions: storage of data, algorithms to process data, and a repository for storing metadata.

The original main goal was to improve Apache Hive along three axes: latency, scalability, and SQL support. Hive and Calcite are more integrated now, and the new optimizer features aim to generate better plans for query execution.

It can be used for data virtualization/federation. It supports heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is an attractive choice for adoption in big-data frameworks.
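
To make the federation idea concrete, here is a minimal sketch in plain Python. It does not use Calcite or its API; the class names ListTable and CsvTable, the join helper and the sample data are all hypothetical, chosen only to show two differently stored sources exposed through one common scan() interface and combined by a single query step.

```python
import csv
import io


class ListTable:
    """Hypothetical adapter over rows already held in memory."""

    def __init__(self, rows):
        self.rows = rows

    def scan(self):
        return iter(self.rows)


class CsvTable:
    """Hypothetical adapter over CSV text; every value is read as a string."""

    def __init__(self, text):
        self.text = text

    def scan(self):
        return csv.DictReader(io.StringIO(self.text))


def join(left, right, key):
    """Hash join of two adapters on a shared column name."""
    index = {}
    for row in right.scan():
        index.setdefault(row[key], []).append(row)
    for row in left.scan():
        for match in index.get(row[key], []):
            yield {**row, **match}


# Illustrative data only: one source is a Python list, the other is CSV text.
customers = ListTable([
    {"customer_id": "1", "name": "Ada"},
    {"customer_id": "2", "name": "Grace"},
])
orders = CsvTable("customer_id,item\n1,disk\n2,tape\n1,cable\n")

result = list(join(customers, orders, "customer_id"))
for row in result:
    print(row)
```

The join never needs to know how each source stores its data, which is the property that makes an adapter-based design attractive for federation. A real framework would also plan and optimize the query, which this sketch leaves out.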

Sunday 10 June 2018

24 Hours of PASS Summit Preview 2018

Free Microsoft Data Platform training is to be held on 12 June 2018, starting at 12:00 UTC. It is a great opportunity to hear some amazing sessions: 24 hours of back-to-back, one-hour sessions.

Topics covered in this edition include Performance Tuning, Azure Data Lake, Digital Storytelling, Advanced R, Power BI and more! Join in on sessions from Kendra Little, Melissa Coates, Rob Sewell, Brent Ozar, Dejan Sarka, Devin Knight, Mico Yuk and many more experts in their fields.

Saturday 9 June 2018

ACM SIGMOD/PODS Conference 2018

It is that time of year again for the annual ACM SIGMOD/PODS Conference, to be held 10-15 June 2018 in Houston. The ACM SIGMOD/PODS Conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences.

Microsoft has contributed to the programming of SIGMOD with several researchers serving on committees and inclusion in workshops, research sessions, industry sessions, demo sessions, and poster sessions.

Thursday 7 June 2018

Microsoft Graph - a useful tool

Microsoft Graph contains digital artifacts on life and work, and also ensures privacy and transparency. An overview of Microsoft Graph explains how it is a gateway to data and intelligence in Microsoft 365. 
The Microsoft Graph exposes APIs for:
  • Azure Active Directory
  • Office 365 services: SharePoint, OneDrive, Outlook/Exchange, Microsoft Teams, OneNote, Planner, and Excel
  • Enterprise Mobility and Security services: Identity Manager, Intune, Advanced Threat Analytics, and Advanced Threat Protection.
  • Windows 10 services: activities and devices
  • Education
The quick start page explains usage and provides examples.
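
As a small illustration of what a call to Microsoft Graph looks like, the sketch below builds (but does not send) a GET request for the signed-in user's profile at the v1.0 /me endpoint. The helper name build_graph_request and the "<access-token>" placeholder are assumptions for this example; a real call needs an OAuth 2.0 access token issued by Azure Active Directory.

```python
import urllib.request

GRAPH_BASE_URL = "https://graph.microsoft.com/v1.0"


def build_graph_request(resource_path, access_token):
    """Build (but do not send) a GET request for a Microsoft Graph resource."""
    url = f"{GRAPH_BASE_URL}/{resource_path.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={
            # Graph expects the access token as a bearer token.
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# "<access-token>" is a placeholder, not a working credential.
request = build_graph_request("/me", "<access-token>")
print(request.get_method(), request.full_url)
```

With a valid token, passing the request to urllib.request.urlopen would return the profile as JSON.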

Tuesday 5 June 2018

Splunk Structure

The Splunk infrastructure is made up of various components:

Indexer – processes incoming machine data and stores the results in indexes for searching. Raw data is compressed and indexes point to the data
Search Head – takes the search request and distributes it to the indexers, which search the data, then consolidates the results and displays them. Knowledge objects on the search head can be used to create additional fields and transform the data
Forwarder – consumes data and forwards it to the indexers for processing
Deployment Server – distributes content and configurations
Cluster Master – coordinates the replicating activities of the peer nodes and tells the search head where to find data
License Master – shares licenses with other nodes
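
The way forwarders, indexers and the search head fit together can be sketched as a toy model in Python. This is not Splunk code and does not use any Splunk API; the classes, the round-robin forwarding and the substring matching are simplifying assumptions made purely to illustrate the data flow.

```python
from dataclasses import dataclass, field


@dataclass
class Indexer:
    """Toy indexer: stores events and answers simple substring searches."""
    name: str
    events: list = field(default_factory=list)

    def index(self, event):
        self.events.append(event)

    def search(self, term):
        return [event for event in self.events if term in event]


@dataclass
class Forwarder:
    """Toy forwarder: hands each event to the next indexer in turn (assumed)."""
    indexers: list
    _next: int = 0

    def forward(self, event):
        target = self.indexers[self._next % len(self.indexers)]
        target.index(event)
        self._next += 1


@dataclass
class SearchHead:
    """Toy search head: fans a search out to all indexers and merges results."""
    indexers: list

    def search(self, term):
        results = []
        for indexer in self.indexers:
            results.extend(indexer.search(term))
        return sorted(results)


indexers = [Indexer("idx1"), Indexer("idx2")]
forwarder = Forwarder(indexers)
for event in ["error disk full", "info login ok", "error timeout"]:
    forwarder.forward(event)

matches = SearchHead(indexers).search("error")
print(matches)
```

The search head holds no data of its own here; it only knows which indexers to ask and how to combine their answers.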

The Folder Structure

The structure of the folders within Splunk is as follows:

Friday 1 June 2018

Azure Cloud Collaboration Center

In May Microsoft in Redmond shared the news of their Azure Cloud Collaboration Center. This facility is a first of its kind. The Cloud Collaboration Center space shows customers a snapshot of what is happening with their data 24/7 and enables real-time troubleshooting of issues by multiple teams simultaneously from across the organization.  It combines innovation and scale to address operational issues and unexpected events in order to drive new levels of customer responsiveness, security and efficiency. 

It is great to see a space for correlation of information with the possibility to pull the data up on individual workstations. I think this kind of collaboration is great news for customers.