Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein

Monday 4 November 2019

HDFS tiering in SQL Server Big Data Clusters

SQL Server Big Data Clusters has its own built-in local HDFS data lake for storing unstructured and high-volume data. This data virtualization capability includes a feature called HDFS tiering, which is also a major new contribution to the Apache Hadoop project.

With HDFS tiering you can access other data lakes by mounting a remote HDFS- or S3-compatible data source into your local HDFS data lake. The mounted data is then seamlessly available from SQL Server or Apache Spark. Currently you can mount the following storage: Azure Data Lake Storage Gen2, AWS S3, Isilon, StorageGRID and FlashBlade.
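To make the mounting step a little more concrete, here is a minimal Python sketch that assembles the `azdata bdc hdfs mount create` command for a remote store. The helper name `build_mount_command`, the example bucket and mount path, and the list of accepted URI schemes are assumptions made for illustration. The sketch only builds the argument list; it does not set up credentials or talk to a cluster.

```python
import shlex
from typing import List

# Assumed set of URI schemes for this sketch: abfs/abfss for Azure Data Lake
# Storage Gen2, s3a for S3-compatible stores, hdfs for a remote HDFS.
ALLOWED_SCHEMES = {"abfs", "abfss", "s3a", "hdfs"}


def build_mount_command(remote_uri: str, mount_path: str) -> List[str]:
    """Return the azdata argument list that mounts remote_uri at mount_path.

    Hypothetical helper: it validates the inputs and builds the command,
    but never executes it.
    """
    if "://" not in remote_uri:
        raise ValueError("remote_uri must look like scheme://location")
    scheme = remote_uri.split("://", 1)[0]
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError("unsupported scheme in this sketch: " + scheme)
    if not mount_path.startswith("/"):
        raise ValueError("mount_path must be an absolute HDFS path")
    return [
        "azdata", "bdc", "hdfs", "mount", "create",
        "--remote-uri", remote_uri,
        "--mount-path", mount_path,
    ]


if __name__ == "__main__":
    # Example values only; replace with your own bucket and mount point.
    cmd = build_mount_command("s3a://example-bucket", "/mounts/example")
    # Print the command so it can be reviewed before it is run by hand.
    print(shlex.join(cmd))
```

Once a mount like this exists, the remote files appear under the mount path in the local HDFS namespace, which is what lets SQL Server and Spark read them like local data.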
