What is a data catalog? Gartner defines it as:
“A data catalog maintains an inventory of data assets through the discovery, description, and organization of datasets. The catalog provides context to enable data analysts, data scientists, data stewards, and other data consumers to find and understand a relevant dataset for the purpose of extracting business value.”
Data Catalogs are the New Black in Data Management and Analytics (Gartner, 2018)
A data catalog matters because it records the critical assets that give your data its value. It becomes a library of core information about your data sources. It can contain a data dictionary and provide basic statistics about the data, which makes it a practical way to explore what you have.
- Users can discover the data sources they need and understand the data sources they find. At the same time, data catalogs help organizations get more value from their existing investments.
- They are inventories of data in the organization
- Data catalogs are a standard for metadata management in the age of big data and advanced analytics
- Adding tags to data sets enables a business glossary of terms to be applied to the data
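
To make these ideas concrete, here is a minimal sketch of a catalog entry in Python. It is an illustration only, not how any particular catalog product works: the names `CatalogEntry`, `DataCatalog`, `profile`, and the `sales_orders` sample data are all hypothetical. It shows the three pieces mentioned above: a data dictionary, basic statistics, and tags drawn from a business glossary.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import mean


@dataclass
class CatalogEntry:
    """One data asset in a hypothetical in-memory catalog."""
    name: str
    description: str
    data_dictionary: dict[str, str]  # column name -> business meaning
    stats: dict[str, dict[str, float]] = field(default_factory=dict)
    tags: set[str] = field(default_factory=set)  # business glossary terms


def profile(rows: list[dict], columns: list[str]) -> dict[str, dict[str, float]]:
    """Compute basic statistics for numeric columns, skipping missing values."""
    result = {}
    for col in columns:
        values = [row[col] for row in rows if row.get(col) is not None]
        if values:
            result[col] = {
                "count": len(values),
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
            }
    return result


class DataCatalog:
    """An inventory of data assets, searchable by glossary tag."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def get(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def tag(self, name: str, term: str) -> None:
        self._entries[name].tags.add(term)

    def find_by_tag(self, term: str) -> list[str]:
        return sorted(e.name for e in self._entries.values() if term in e.tags)


# Sample data (made up for illustration).
sales_rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 30.0},
    {"order_id": 3, "amount": None},  # missing value is ignored by profile()
]

catalog = DataCatalog()
catalog.register(
    CatalogEntry(
        name="sales_orders",
        description="One row per customer order",
        data_dictionary={
            "order_id": "Unique identifier of the order",
            "amount": "Order total in the local currency",
        },
        stats=profile(sales_rows, ["amount"]),
    )
)
catalog.tag("sales_orders", "Revenue")

print(catalog.find_by_tag("Revenue"))
print(catalog.get("sales_orders").stats["amount"])
```

A real catalog would usually discover data sources automatically and persist this metadata, but the shape is similar: each entry describes a data set, and tags let users search by business terms rather than by table names.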