Tuesday, 12 June 2018

Apache Calcite: A Foundational Framework


This week at ACM SIGMOD/PODS 2018, Hortonworks is presenting Apache Calcite, a foundational framework for optimized query processing over heterogeneous data sources. It looks like an interesting dynamic data management framework that deliberately omits some key functions: storage of data, algorithms to process data, and a repository for storing metadata.


The original goal was to improve Apache Hive along three axes: latency, scalability, and SQL support. Hive and Calcite are now more tightly integrated, and the new optimizer features aim to generate better plans for query execution.

Calcite can also be used for data virtualization and federation. It supports heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture makes it an attractive choice for adoption in big-data frameworks.
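
To make the federation idea a bit more concrete, here is a minimal Python sketch of the general pattern: each data source sits behind a small adapter that exposes rows, and a join runs across two different stores. This is not Calcite's API. The names `CsvAdapter`, `DictAdapter`, and `federated_join`, as well as the sample data, are hypothetical and exist only for illustration.

```python
import csv
import io
from typing import Dict, Iterator, List

Row = Dict[str, object]


class CsvAdapter:
    """Hypothetical adapter over CSV text held in memory (a 'relational' source)."""

    def __init__(self, tables: Dict[str, str]) -> None:
        self._tables = tables

    def scan(self, table: str) -> Iterator[Row]:
        # DictReader yields one dict per line, with every value as a string.
        yield from csv.DictReader(io.StringIO(self._tables[table]))


class DictAdapter:
    """Hypothetical adapter over semi-structured records (lists of dicts)."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def scan(self, table: str) -> Iterator[Row]:
        yield from self._tables[table]


def federated_join(left, left_table: str, right, right_table: str, key: str) -> List[Row]:
    """Hash join across two adapters; keys are compared as strings."""
    # Build a hash index over the right-hand source.
    index: Dict[str, List[Row]] = {}
    for row in right.scan(right_table):
        index.setdefault(str(row[key]), []).append(row)

    # Probe the index with each row from the left-hand source.
    joined: List[Row] = []
    for row in left.scan(left_table):
        for match in index.get(str(row[key]), []):
            # On a column name clash, the left-hand value wins.
            joined.append({**match, **row})
    return joined


# Illustrative data only: orders live in CSV, customers in dict records.
orders = CsvAdapter({"orders": "order_id,customer_id,total\n1,10,99.5\n2,11,15.0\n"})
customers = DictAdapter(
    {"customers": [{"customer_id": 10, "name": "Ada"}, {"customer_id": 11, "name": "Lin"}]}
)

result = federated_join(orders, "orders", customers, "customers", "customer_id")
for row in result:
    print(row["order_id"], row["name"], row["total"])
```

The sketch hard-codes the join strategy. A real framework would add a query parser and a cost-based optimizer on top of such adapters to choose the plan.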
