UNIT-3
Topic: Using Hive to Query Hadoop Files
Hive uses the following classes to read and write HDFS files:
• TextInputFormat / HiveIgnoreKeyTextOutputFormat: These two classes read and write data in
plain text file format.
• SequenceFileInputFormat / SequenceFileOutputFormat: These two classes read and write data
in Hadoop SequenceFile format.
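As a minimal sketch of how these formats are usually selected, the HiveQL below uses the STORED AS clause when creating a table. The table and column names (logs_text, logs_seq, id, msg) are hypothetical and chosen only for illustration; the comma delimiter is likewise an assumption.

```sql
-- Hypothetical table stored as plain text, one comma-delimited row per line.
CREATE TABLE logs_text (id INT, msg STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- The same hypothetical schema stored in Hadoop SequenceFile format.
CREATE TABLE logs_seq (id INT, msg STRING)
STORED AS SEQUENCEFILE;

-- Lists the table's storage details, including the input and output
-- format classes Hive recorded for it.
DESCRIBE FORMATTED logs_text;
```

Running DESCRIBE FORMATTED on each table is a simple way to check which input and output format classes a given Hive version actually assigned.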
Where Does Hive Store Data Files in HDFS?
Hive is a data warehouse system for Hadoop. By default, all database and table
data files are stored under the HDFS directory /user/hive/warehouse. To store a
table's data elsewhere, specify a directory with the LOCATION clause when
creating the table. The custom location can be on HDFS, S3, or any other
Hadoop-compatible file system.
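The difference between the default and a custom location can be sketched as follows. The table names and the path /data/custom/employees are hypothetical; hive.metastore.warehouse.dir is the Hive property that holds the default warehouse directory.

```sql
-- Print the currently configured default warehouse directory.
SET hive.metastore.warehouse.dir;

-- No LOCATION clause: the data files go under the default warehouse
-- directory, typically /user/hive/warehouse/employees for a table in
-- the default database.
CREATE TABLE employees (id INT, name STRING);

-- LOCATION clause: the data files are read from and written to the
-- given directory instead (hypothetical path).
CREATE EXTERNAL TABLE employees_ext (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/custom/employees';
```

The second table is declared EXTERNAL here because that is the common pairing with a custom location: dropping an external table removes only its metadata and leaves the files in place.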
When working with Hive, you need to know about two different data stores:
• Hive Metastore (where table metadata such as schemas and locations is stored)
• Hive data warehouse location (where the actual table data is stored)
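One way to see the two stores side by side is sketched below, assuming a hypothetical table named sales in the default database and the default warehouse path. The first statements are answered from the metastore; the dfs command, which is available from the Hive command line, lists the actual data files on HDFS.

```sql
-- Hypothetical table, created only so the commands below have a target.
CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE);

-- Metadata (columns, owner, location, formats) comes from the metastore.
SHOW TABLES;
DESCRIBE FORMATTED sales;

-- The data files themselves live in the warehouse directory on HDFS.
-- The path assumes the default warehouse location.
dfs -ls /user/hive/warehouse/sales;
```

The Location field reported by DESCRIBE FORMATTED is the link between the two: it is metadata kept in the metastore that points at the directory holding the data.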