Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Thursday 17 October 2013

PASS Summit 2013 Keynote: Hekaton Why, What, and How

[I attended this presentation. The original links to material have been removed as these no longer exist]

The day 2 keynote was excellent from David DeWitt.  The main differences of Hekaton, In-Memory Databases is that it is memory optimized but durable, a very high performance OLTP engine, fully integrated into SQL Server 2014 and architected for modern CPUs. 

Other products on the market with in memory technology are Oracle TimesTen, IBM SolidDB and Volt VoltDB, but  SAP Hana is not a direct competitor.

This new technology is not the same as the old functionality of pinning a table in memory. In the current core engine there is the need for concurrency control, locks and latches and interpreted query plans.

In the 1980’s the database field explorer had the use of many different concurrency control (CC) mechanisms. The technique (due to Jim Gray) known as 2 phase locking emerged as the standard. There are a variety of lock types including X(eXclusive) and S(shared). There are two simple rules to follow. Before access, the query must acquire an appropriate lock types from the lock manager and once a query releases a lock, no further locks can be acquired.  Jim Gray won the ACM Turing Award for his work on transaction processing. 

A comparison between the two engines and the required 3 components.


   













In SQL Server 2014 it uses lock free data structures invented by Maurice Herlihy at Brown University, who was elected to National Academy of Engineering. The first phase is to create a memory optimised table. Then the concurrency control components are:

Optimistic
Conflicts are assumed to be rare
Transactions run to completion without blocking or setting locks
Conflicts are detected during a validation phase
Multiversion
Updating a row creates a new version of the row
Timestamps
Each row version has an associated time range
Transactions use their begin timestamps to select correct version
Timestamps  are also used to create a total order for transactions to obtain equivalent of a serial order. 

Query execution in Hekaton has 2 alternatives modes:

Regular T-SQL access ‘Interop’
Queries can access and update both Hekaton and disk resident tables
Interpreted execution limits performance
Complied Stored Procedures ‘Native
T-SQL => C code => machine code (dll)
MUCH faster, but some language surface restrictions with V1



 













The slide deck is no longer available from its original source.  

More details on the new SQL Server feature was in these articles listed below (links are now archived and not available so they have been removed):
Getting Started with SQL Server 2014 In-Memory OLTP
Hardware Considerations for In-Memory OLTP in SQL Server 2014
Architectural Overview of SQL Server 2014’s In-Memory OLTP Technology

Wednesday 16 October 2013

PhD Research Asynchronous Focus Group

I am working on the second stage of my data collection for my PhD and this requires the collection of qualitative data on database best practices and procedures and the complexities surrounding this. The specific question I am looking to answer is:

Is the adoption of best practices and procedures affected by the complex interactions that are an integral part of the management of database systems?

I have run some face to face focus groups but still need some further data and would like to obtain various views.

The 10 questions I have which I would like input on are :

1.Do you think some best practices and procedures are more important than others for managing database systems? If so, what are the most important ones?

2.What best practices and procedures do you think should be considered when selecting different database engines?

3.What kind of requirements gathering and architectural design processes for the hardware, data and databases do you think are important? Why are these important?

4.In what ways do you think that best practices and procedures could assist management of the database lifecycle?

5.What complexities between technology layers, do you think, affect the operation of databases?

6.Describe any complexities that exist with the adoption of best practices and procedures when managing cloud databases?

7.Was there ever a time when you felt the complexity of database systems compromised your ability to implement best practices and procedures?

8.Who you think should create and control database best practices and procedures?

9.How, if at all, do cross boundary communications among stakeholders affect best practices and procedures?

10.What effect can a database management strategic plan have on best practices and procedures for the management of database systems? 

I have created an online forum where discussions on these questions can task place. The forum is  http://sqltoolkit.freeforums.net/.  I have created a discussion thread for each question.

Please can you help and share your views.

More details on the research can be found at http://sqltoolkit.co.uk/focusgroup.htm

PASS Summit 2013 Keynote: Microsoft's Data Platform - The Road Ahead

The keynote entitled Microsoft's Data Platform - The Road Ahead  was delivered by Quentin Clark. The way forward is about delivering a complete data platform which is a continuous data solution. It is about a story of transformation from independent components being merged into one integrated whole solution. The analogy given was that of a department store.

The new in-memory architecture built into the platform runs on existing hardware and delivers breakthrough performance. The architectural changes on the platform increase performance for OLTP, Analytics and Excel.


Breakthroughs in availability and recovery mentioned the AlwaysOn secondaries in Windows Azure and backup to Windows Azure feature allowing for integrated hybrid products. The freely downloadable tool will support backups of all versions of SQL Server into Azure.

Encryption support for backups is built into SQL Server 2014 with the certificates stored on premises.

Data Files can be stored in Windows Azure. This addition means you can have a mix and match set of data files attached to a single database.

Querying all data components include the new feature Polybase in PDW 2.0, that combines structured and unstructured data. Simplicity is the aim with the Hadoop projects,  HDinsight and accelerated data warehousing provisioning.

Real time insights for everyone can be drawn from the tools Power Query, Power Pivot, Power View, Power Map, Power BI Q & A and Power BI sites.  The sharing of globally available information sets increases the understanding of data. The simplicity of data in Power BI  Q & A allows you to ask a question in natural language.

Concluding Quentin stated it is the excellence of all data that affects line of business and this will change how organizations compete. Big changes are coming in architecture design to build continuous platforms.

He announced the availability of SQL Server 2014 Community Technology Preview 2 (CTP 2)
Download
http://technet.microsoft.com/en-gb/evalcenter/dn205290.aspx
Release Notes
http://msdn.microsoft.com/en-us/library/dn169381%28v=sql.15%29.aspx
 

Sunday 13 October 2013

SQL PASS 2013 Keynotes

The keynotes along with some other sessions will be broadcast live from the PASS Summit 2013 on PASStv. These technical sessions will be streamed between 16-18 October.

The 2 keynotes are

Microsoft’s Data Platform – The Road Ahead
Quentin Clark (Corporate Vice President, Data Platform Group, Microsoft)
Wednesday, October 16, 8:15am - 10:00am (12:15pm - 2:00pm GMT)

Hekaton: Why, What, and How
David DeWitt (Technical Fellow, Microsoft Jim Gray Systems Lab, Madison)
Thursday, October 17, 8:15am - 10:00am (12:15pm - 2:00pm GMT)

PASStv
http://www.sqlpass.org/summit/2013/Sessions/MoreLearning/passtv.aspx
Keynote details
http://www.sqlpass.org/summit/2013/Sessions/Keynotes.aspx

Tuesday 1 October 2013

PASS SQLSaturday #228 Cambridge

I attended and helped at SQLSaturday Cambridge between 27-28 September 2013. The event was held at Queens College, Cambridge. This was a great location in the heart of Cambridge with the mathematical bridge for inspiration.


The pre-conference Training Day was excellent with Buck Woody. His session was all about learning to learn and providing the tools to do that.  A few key steps were suggested in the session to aid in learning a new topic. These are:-

•    Create a learning resource map, a self-syllabus
•    Create a mind map – a map of what to learn (nouns and verbs, references )
•    Create an ontology – an outline of a table of contents
•    Create a plan with objectives, with small achievable steps as maintaining learning for the duration of the study is difficult
•    Use a note taking  tool such as evernote / onenote
•    Create a lab and scenario
•    Develop a test

This quote sums learning up  "What you get by achieving your goals is not as important as what you become by achieving your goals." - Henry David Thoreau

We all have various different learning styles and understanding which type of learner you are between auditory, visual and kinaesthetic can help.  There are a couple of sites that have questionnaires
http://sunburst.usd.edu/~bwjames/tut/learning-style/ and  http://www.engr.ncsu.edu/learningstyles/ilsweb.html

There are many other tools and techniques that were mentioned in the session to assist with learning new technologies. The current Microsoft Certification levels are shared at  www.tinyurl.com/mgwuq8z

The Microsoft Virtual Labs are useful for immersion in a virtual hands-on lab and allow experience the product first hand.

The SQLSaturday Cambridge Community Day saw for the first time a shared day between SQL Server and SharePoint. An excellent combination as the use of SharePoint is still increasing and integrated heavily with SQL Server. 

The keynote was delivered by Buck Woody on what is big data. An interesting article mentioned during this was, What's Your Big Data Score?

There were many interesting sessions on Sharepoint, Data Explorer, Azure, PDW and Hadoop.  One very enjoyable and informative session was delivered by Carmel Gunn which showed the power of shared data sets using PowerPivot, Power Query, Power View and Power Map to gain insight on the Irish Economic Crisis.

I also had the opportunity to run several PhD Research Study Groups. Anyone interesting in participating further in my research can take part in the online focus group discussions on this forum http://sqltoolkit.freeforums.net/ . There are 10 questions on best practices and procedures.
More details on the research can be found at http://sqltoolkit.co.uk/focusgroup.htm