funsung: HCatalog, tables and metadata for Hadoop

Sunday, May 18, 2014


http://developer.yahoo.com/blogs/hadoop/hcatalog-tables-metadata-hadoop-5451.html
http://hive.apache.org/
https://cwiki.apache.org/confluence/display/Hive/HCatalog

Hadoop needs a better abstraction for data storage, and it needs a metadata service. HCatalog addresses both. It presents users with a table abstraction, which frees them from knowing where or how their data is stored. Data producers can change how they write data while existing data in the old format stays readable, so data consumers do not have to change their processes. HCatalog provides a shared schema and data model for Pig, Hive, and MapReduce. It will also enable notifications of data availability, and it will provide a place to store state information about the data so that data cleaning and archiving tools can tell which data sets are eligible for their services.
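To make the table abstraction concrete, here is a minimal sketch in plain Python. It is not HCatalog's API: the `ToyCatalog` class, its methods, and the `clicks` table are all hypothetical names invented for illustration, and strings stand in for files on HDFS. It only shows the idea that the catalog records the storage format per partition, so a producer can switch formats while a consumer keeps reading by table name.

```python
import csv
import io
import json


class ToyCatalog:
    """Hypothetical in-memory stand-in for a metadata service (not HCatalog)."""

    def __init__(self):
        # table name -> {"columns": [...], "partitions": [...]}
        self._tables = {}

    def create_table(self, name, columns):
        # The schema is stored once and shared by every reader.
        self._tables[name] = {"columns": list(columns), "partitions": []}

    def add_partition(self, name, fmt, payload):
        # Each partition remembers its own format, so old and new data coexist.
        # payload is a string standing in for a file on HDFS.
        self._tables[name]["partitions"].append({"format": fmt, "payload": payload})

    def read(self, name):
        # Consumers ask for a table by name; format handling stays in the catalog.
        table = self._tables[name]
        columns = table["columns"]
        for part in table["partitions"]:
            if part["format"] == "csv":
                for row in csv.reader(io.StringIO(part["payload"])):
                    yield dict(zip(columns, row))
            elif part["format"] == "json":
                for line in part["payload"].splitlines():
                    record = json.loads(line)
                    yield {col: str(record[col]) for col in columns}
            else:
                raise ValueError("unknown format: " + part["format"])


catalog = ToyCatalog()
catalog.create_table("clicks", ["user", "url"])
# Older data was written as CSV.
catalog.add_partition("clicks", "csv", "alice,/home\nbob,/about\n")
# The producer later switches to JSON lines; the consumer code is unchanged.
catalog.add_partition("clicks", "json", '{"user": "carol", "url": "/home"}\n')

rows = list(catalog.read("clicks"))
```

The consumer only ever calls `read("clicks")`. Which partitions are CSV and which are JSON is metadata the catalog holds, which is the role the post describes for HCatalog.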
