Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Tuesday 12 June 2018

Apache Calcite: A Foundational Framework

At ACM SIGMOD/PODS 2018 this week, Hortonworks are talking about Apache Calcite, a foundational framework for optimized query processing over heterogeneous data sources. It seems an interesting dynamic data management framework that deliberately omits some key functions: storage of data, algorithms to process data, and a repository for storing metadata.

The main goal was originally to improve Apache Hive along three axes: latency, scalability, and SQL support. Hive and Calcite are now more tightly integrated, and the new optimizer features aim to generate better query execution plans.

Calcite can be used for data virtualization/federation. It supports heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). Its flexible, embeddable, and extensible architecture makes it an attractive choice for adoption in big-data frameworks.
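To make the federation idea concrete, here is a minimal toy sketch in Python. It is not Calcite's API (Calcite is a Java library with its own adapter and planner interfaces); the class names `ListAdapter` and `CsvTextAdapter` and the `hash_join` function are hypothetical, chosen only for illustration. The point is that each store sits behind a small adapter exposing the same `scan()` method, so one query layer can join rows without owning the storage.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List, Optional

Row = Dict[str, object]


class ListAdapter:
    """Hypothetical adapter over an in-memory, relational-style table."""

    def __init__(self, rows: List[Row]):
        self._rows = rows

    def scan(self) -> Iterator[Row]:
        return iter(self._rows)


class CsvTextAdapter:
    """Hypothetical adapter over CSV text, standing in for a file-based store."""

    def __init__(self, text: str, types: Optional[Dict[str, Callable]] = None):
        self._text = text
        self._types = types or {}  # optional per-column converters

    def scan(self) -> Iterator[Row]:
        for raw in csv.DictReader(io.StringIO(self._text)):
            # CSV values arrive as strings; convert the columns we were told about.
            yield {k: self._types.get(k, str)(v) for k, v in raw.items()}


def hash_join(left, right, left_key: str, right_key: str) -> Iterator[Row]:
    """Join two adapters on equal keys; it only relies on their scan() method."""
    index: Dict[object, List[Row]] = {}
    for r in right.scan():
        index.setdefault(r[right_key], []).append(r)
    for l in left.scan():
        for r in index.get(l[left_key], []):
            yield {**l, **r}


# Made-up sample data living in two different "stores".
employees = ListAdapter([
    {"name": "Ada", "dept_id": 1},
    {"name": "Linus", "dept_id": 2},
    {"name": "Grace", "dept_id": 1},
])
departments = CsvTextAdapter(
    "dept_id,dept_name\n1,Research\n2,Kernel\n",
    types={"dept_id": int},
)

result = [
    (row["name"], row["dept_name"])
    for row in hash_join(employees, departments, "dept_id", "dept_id")
]
print(result)
```

The join logic never needs to know where the rows come from, which is the essence of federation: the query layer plans and combines, and the adapters own access to the data.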
