At ACM SIGMOD/PODS 2018 this week, Hortonworks is talking about Calcite, a foundational framework for optimized query processing over heterogeneous data sources. It looks like an interesting dynamic data management framework, one that omits some key functions of a typical database system: storage of data, algorithms to process data, and a repository for storing metadata.
The original goal was to improve Apache Hive along three axes: latency, scalability, and SQL support. Hive and Calcite are now more tightly integrated, and the new features for its optimizer aim to generate better query execution plans.
Calcite can be used for data virtualization/federation. It supports heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). Its flexible, embeddable, and extensible architecture makes it an attractive choice for adoption in big-data frameworks.