Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous; the database universe is dichotomous (structured and unstructured), expanding and complex. Find my database research at SQLToolkit.co.uk. Microsoft Data Platform MVP.

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Wednesday 7 November 2018

PASS Summit 2018 Keynote Day 1












The first keynote of PASS Summit was delivered by Rohan Kumar and was entitled "SQL Server and Azure Data Services: Harness the ultimate hybrid platform for data and AI".





Customer priorities for a modernized data estate are: modernizing on-premises, modernizing to the cloud, building cloud-native apps and unlocking insights.






The announcements follow:

SQL Server 2019
The SQL Server 2019 public preview is a great way to celebrate the 25th anniversary of SQL Server.

SQL Server 2019 introduces big data clusters, which combine Apache Spark and Hadoop with the relational engine in a single data platform, SQL Server. This brings the power of Spark to SQL Server across the relational and non-relational data sitting in SQL Server, HDFS and other systems such as Oracle, Teradata and Cosmos DB.

There are new capabilities around performance, availability and security for mission-critical environments, along with the ability to take advantage of hardware innovations such as persistent memory and enclaves.

Hadoop, Apache Spark, Kubernetes and Java are native capabilities in the database engine.

Accelerated data recovery (ADR) was demonstrated and is incredible. It is in public preview. The benefits of ADR are:
  • Fast and consistent Database Recovery
  • Instantaneous Transaction rollback
  • Aggressive Log Truncation
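
As a rough illustration of how ADR is switched on: in SQL Server 2019 it is a per-database option set with ALTER DATABASE. The keynote did not show this statement, so treat the sketch below as an assumption based on the documented syntax. The helper only builds the T-SQL text, and both the function name and the database name SalesDb are hypothetical.

```python
def adr_statement(database: str, enabled: bool = True) -> str:
    """Build the T-SQL that switches accelerated database recovery on or off."""
    # Bracket-quote the database name; a literal ']' is escaped by doubling it.
    quoted = "[" + database.replace("]", "]]") + "]"
    state = "ON" if enabled else "OFF"
    return f"ALTER DATABASE {quoted} SET ACCELERATED_DATABASE_RECOVERY = {state};"


if __name__ == "__main__":
    # 'SalesDb' is a hypothetical database name used only for illustration.
    print(adr_statement("SalesDb"))
```

The generated statement would then be run against the instance with whichever client you normally use.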

Azure HDInsight 4.0

HDInsight 4.0 is now available in public preview.

There are several Apache Hadoop 3.0 innovations. Hive LLAP (Low Latency Analytical Processing, known as Interactive Query in HDInsight) delivers ultra-fast SQL queries. The performance metrics provide useful insight.

It integrates with Power BI DirectQuery, Apache Zeppelin and other tools. To learn more, watch HDInsight Interactive Query with Power BI.

Data quality and GDPR compliance enabled by Apache Hive transactions
Improved ACID capabilities handle data quality (update/delete) issues at row level. This means that GDPR compliance requirements can now be met, because data can be erased at row level. Spark can read and write to Hive ACID tables via the Hive Warehouse Connector.
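
To make the row-level erasure concrete, here is a minimal sketch of the HiveQL involved, assuming a transactional ORC table. The table name, the column names and the two helper functions are hypothetical; the helpers only build the statement text and do not connect to a cluster.

```python
import re

# Conservative pattern for table names, so arbitrary text cannot be spliced in.
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def create_acid_table(table: str) -> str:
    """Return HiveQL that creates a transactional (ACID) ORC table."""
    if not _NAME.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    # ORC storage plus the 'transactional' property enables row-level
    # UPDATE and DELETE on the table.
    return (
        f"CREATE TABLE {table} (customer_id BIGINT, email STRING) "
        "STORED AS ORC TBLPROPERTIES ('transactional'='true')"
    )


def erase_customer(table: str, customer_id: int) -> str:
    """Return HiveQL that deletes every row belonging to one customer."""
    if not _NAME.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not isinstance(customer_id, int):
        raise TypeError("customer_id must be an integer")
    return f"DELETE FROM {table} WHERE customer_id = {customer_id}"


if __name__ == "__main__":
    print(create_acid_table("customers"))
    print(erase_customer("customers", 42))
```

On a non-transactional Hive table the DELETE would be rejected, which is why the ACID improvements matter for erasure requests.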

Apache Hive LLAP + Druid = single tool for multiple SQL use cases

Druid is a high-performance, column-oriented, distributed data store, which is well suited for user-facing analytic applications and real-time architectures. Druid is optimized for sub-second queries to slice-and-dice, drill down, search, filter, and aggregate event streams. Druid is commonly used to power interactive applications where sub-second performance with thousands of concurrent users is expected.
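
The kind of slice-and-dice aggregation described here can be sketched in a few lines of plain Python. This is only a toy model of grouping an event stream by time bucket and dimension; it is not Druid's API, and the event fields (ts, country, page, count) are made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def rollup(events, dimension):
    """Sum the 'count' metric per (minute, dimension value) pair."""
    totals = defaultdict(int)
    for event in events:
        # Truncate the timestamp to the minute to form the time bucket.
        minute = event["ts"].replace(second=0, microsecond=0)
        totals[(minute, event[dimension])] += event["count"]
    return dict(totals)


# Hypothetical page-view events.
EVENTS = [
    {"ts": datetime(2018, 11, 7, 9, 0, 5), "country": "UK", "page": "/home", "count": 1},
    {"ts": datetime(2018, 11, 7, 9, 0, 40), "country": "UK", "page": "/blog", "count": 2},
    {"ts": datetime(2018, 11, 7, 9, 1, 10), "country": "US", "page": "/home", "count": 4},
]

if __name__ == "__main__":
    for key, total in sorted(rollup(EVENTS, "country").items()):
        print(key, total)
```

Calling rollup(EVENTS, "page") instead slices the same events by a different dimension, which is the drill-down pattern the paragraph above refers to.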

Hive Spark Integration
Apache Spark gets updatable tables and ACID transactions with the Hive Warehouse Connector.

Apache HBase and Apache Phoenix
Apache HBase 2.0 and Apache Phoenix 5.0 get new performance and stability features, and all of the above have enterprise-grade security.

Azure
Azure Event Hubs for Kafka is generally available.
Azure Data Explorer is in public preview.

Azure Databricks Delta is in public preview
  • Connect data scientists and engineers
  • Prepare and clean data at massive scales
  • Build/train models with pre-configured ML

Azure Cosmos DB multi-master replication was demoed with a drawing app, Azure Cosmos DB PxDraw.
Azure SQL DB Managed Instance will reach General Availability (GA) on 1 December. This provides Availability Groups managed by Microsoft.

Power BI
















The new dataflows feature enables self-service data prep in Power BI.

Power BI Desktop November Update
  • Follow-up questions for Q&A explorer: it is possible to ask follow-up questions inside the Q&A explorer pop-up, which take into account the previous questions you asked.
  • Copy and paste between PBIX files
  • New modelling view makes it easier to work with large models.
  • Expand and collapse matrix row headers

